If you are an application developer, you should find a new job if you can't do multu-threaded programming. What if you are a DBA? Probably not, not to the same extent anyway.
Some in our community clearly realize the importance of being able to do things in parallel. Adam Machanic, for instance, has put in a lot of efforts in this area and is trying hard to spread the message on parallelism.
Unfortunately, the community in general does not seem to be as convinced. Perhaps when you've become accustomed to finding workarounds to drive nails with a wrench, you may not realize that hammers are a much better tool for that task.
Although doing parallelism does not necessarily mean that every DBA should become conversant in multi-threaded programming, I’d argue that it’s a good skill to have, and once you are comfortable with it, you’ll find plenty of opportunity to fruitfully apply it.
Here is a little anecdotal evidence from my recent experience.
So I needed to automate the removal and creation of a lot of replication setups, and for that I fully automated the generation of the replication delete and create scripts. I also automated the execution of these scripts. However, one potential issue with executing these scripts is that since these are kind of DDL scripts, and as you may know, any DDL change can easily get blocked. Thus, I had to automate the monitoring of blocking. Furthermore, I needed to automatically remove any blocking, if the execution ofthe a replication script got blocked.
Now, this handling of blocking needed to be granular and precise in that I needed to clear a spid only if it’s blocking the execution of a replication delete or create script. The solution I ended up with is to have the code that controlled the replication script execution spawn a monitoring thread for each of the servers (i.e. the publishers and subscribers), and as the main control code cycling through these servers to apply the replication scripts, it goes through the following logic:
- The main control code connects to the server on which a replication script is to be executed, and retrieves the spid of its own connection. It passes this spid to the monitoring thread so that the monitoring thread can focus on that connection only, and will not be bothered by any other activities that may be going on the same server.
- The monitoring thread connects to the same server and waits for the go-ahead.
- The main control code signals the monitoring thread to start monitoring.
- The main control code executes the replication script.
- The main control code signals the monitoring thread to stop monitoring and wait for a new replication script to be executed.
The actual logic is slightly more complex because when we execute the replication create/delete script on a publisher, the publisher will make a connection to the subscriber and/or the distributor, we don't want those connections be blocked either. The monitoring thread must take care of that.
Although most or many of the DBA work appear to be conducive to single-threaded automation, tasks like this are not. If you think using multi-threading for this task is an overkill or there is a better way, l’d appreciate it if you eave a comment.