This month’s TSQL Tuesday event is being hosted by Pat Wright (Blog | Twitter) and the topic this month is Automation!
“I figured that since many of you out there set a goal this year to blog more and to learn Powershell then this Topic should help in both of those goals. So the topic I have chosen for this month is Automation! It can be Automation with T-SQL or with Powershell or a mix of both. Give us your best tips/tricks and ideas for making our lives easier through Automation.”
Automation is a topic that is very near and dear to my heart. The only reason I survive day to day in my environment is automation. Recently on the forums someone asked how you manage an environment with 100+ SQL Servers and thousands of user databases. A lot of the responses said to purchase a third-party monitoring tool, but as someone who manages an environment of this size, I know firsthand the significant cost of licensing tools for an environment this large, and getting approval for that kind of expense doesn't happen overnight. For that reason, I didn't consider those responses an answer to the specific problem being asked about. Below is the response I posted to the question, and what I would suggest to anyone taking over a large environment.
Hopefully you at least have a list of all the SQL Server instances in your environment. If you don't have a complete list, that is the first thing you are going to have to build. There are a couple of ways to go about it. The Microsoft Assessment and Planning Toolkit is a free tool from Microsoft that a Domain Administrator can run to scan all of the systems in the environment for SQL Server. You can also use the Quest® Discovery Wizard for SQL Server, which is free and does a good job as well. I found servers with the MAP Toolkit that weren't discovered by the Discovery Wizard from Quest and vice versa, so you will probably want to run them both and match the results for a full picture.
Once you have a list of servers and instances, go to any SQL Server 2008 instance in your environment, set up a Central Management Server, and register all of the SQL Servers in it. From there, you can use the CMS to accomplish a lot of tasks. I'd recommend that you download the Enterprise Policy Management Framework from Codeplex and set it up on the CMS. It can be used to automate a lot of things with Policy Based Management, even against SQL Server 2000 and 2005 instances.
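To get your master list into the CMS without registering each server by hand, you can script the registration. A minimal sketch, assuming a file named servers.txt with one instance per line, a CMS named CMSSERVER, and the msdb shared registered server procedures (check msdb on your CMS for the exact parameter names; the root server group ID of 1 is also an assumption you should verify against msdb.dbo.sysmanagement_shared_server_groups):

```powershell
# Sketch: bulk-register instances from a text file into a CMS.
# servers.txt, CMSSERVER, and the group ID of 1 are placeholders
# for your environment -- verify the procedure signature in msdb.
$conn = New-Object System.Data.SqlClient.SqlConnection(
    "Server=CMSSERVER;Database=msdb;Integrated Security=SSPI")
$conn.Open()
foreach ($instance in Get-Content .\servers.txt) {
    $cmd = $conn.CreateCommand()
    $cmd.CommandText = @"
DECLARE @id int;
EXEC dbo.sp_sysmanagement_add_shared_registered_server
    @name = @n, @server_group_id = 1, @server_name = @n,
    @description = N'', @server_type = 0,
    @registered_server_id = @id OUTPUT;
"@
    [void]$cmd.Parameters.AddWithValue("@n", $instance)
    [void]$cmd.ExecuteNonQuery()
}
$conn.Close()
```

Once the servers are registered, everything in SSMS that works against a CMS group, including multi-server queries and the EPM Framework, works against your whole list.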
As someone who manages a very large environment exactly like the one you describe, I can tell you that the best thing I did when I took over was learn to use PowerShell. With your master list of instances in a text file, you can find and fix a lot of problems by leveraging SMO with PowerShell. I'd recommend that you download PowerGUI, a free script editor for PowerShell, also from Quest Software. PowerGUI gives you autocomplete and the ability to create complex scripts with debugging. You can find some examples of how to leverage PowerShell to push changes to your entire environment in my blog posts:
Configuring SQL Server Audit Logging with Powershell
Setting SQL Server ErrorLog Retention and Rollover with Powershell
Working with the SQL Server Powershell Provider and Central Management Servers from Powershell
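The pattern behind all of those posts is the same: walk the master list, connect with SMO, inspect a setting, and fix it where it is wrong. A minimal sketch, assuming servers.txt holds one instance name per line and your account has sysadmin on each instance (the target of 30 error logs is just an example value):

```powershell
# Sketch: standardize SQL Server error log retention across the
# environment with SMO. servers.txt and the value 30 are assumptions.
[void][Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo")
foreach ($instance in Get-Content .\servers.txt) {
    $server = New-Object Microsoft.SqlServer.Management.Smo.Server $instance
    # Keep 30 error logs before rollover instead of the default 6
    if ($server.NumberOfLogFiles -ne 30) {
        $server.NumberOfLogFiles = 30
        $server.Alter()
        Write-Host "$instance updated"
    }
}
```

Swap the property and value for whatever setting you need to standardize; the loop and the Alter() call stay the same.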
Others have recommended third-party tools, and I am not saying you shouldn't go that route, but I would be very careful about what tool I allowed into my environment. Some of the monitoring tools recommended on that post have a significant impact on the environment being monitored, so don't just jump out and buy a tool without a lot of research and due diligence. Install any tool you are considering in a test environment, and then monitor its impact there. Fire up a SQL Trace and watch the activity the tool generates against SQL Server. Look at the tool's impact on the information captured in the Default Trace, and most importantly, check whether the tool hides its own activity from its reports, because some vendors do exactly that. Make sure you check out all the vendors and not just the big names.
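Reading the Default Trace to see what a candidate tool is actually doing is quick to script. A minimal sketch, assuming the Default Trace is enabled (it is by default) and a test instance named TESTINSTANCE, which is a placeholder for your environment:

```powershell
# Sketch: pull the default trace from a test instance and group the
# events by ApplicationName to see which tool is the noisiest.
# "TESTINSTANCE" is a placeholder for your test server.
$query = @"
DECLARE @path nvarchar(260);
SELECT @path = path FROM sys.traces WHERE is_default = 1;
SELECT StartTime, LoginName, ApplicationName, ObjectName
FROM fn_trace_gettable(@path, DEFAULT);
"@
$da = New-Object System.Data.SqlClient.SqlDataAdapter($query,
    "Server=TESTINSTANCE;Database=master;Integrated Security=SSPI")
$dt = New-Object System.Data.DataTable
[void]$da.Fill($dt)
# Event counts per application, busiest first
$dt | Group-Object ApplicationName | Sort-Object Count -Descending |
    Format-Table Count, Name -AutoSize
```

If the tool's ApplicationName barely shows up here but you can see its sessions in sys.dm_exec_sessions, that is a red flag that it is filtering its own activity.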
Even if you could get immediate approval for the cost of licenses for 100+ servers, it takes time to get any monitoring software installed and configured so it doesn't drive you completely nuts. I switched from a third-party SQL-specific tool to Operations Manager because it was the lowest-cost solution for managing an environment that large, and I had past experience working with it. That said, it is also the most annoying monitoring system out there unless you have the time to configure it and set the necessary overrides and thresholds so that the monitors fit your environment. All of the tools I've ever tested tend to be noisy out of the gate, and you have to tune the thresholds until the tool tells you only what you need to know. Any of the vendors, Microsoft included, can provide a resource during an implementation to customize the monitoring thresholds and teach you enough to configure the tool long term.
You can also roll your own monitoring using things freely available online. Even with Operations Manager installed in my shop, I have a sqladmin database on every server that holds administrative objects. Those objects do all kinds of tasks and write data to tables in the sqladmin database that have reports built on top of them. I have a single SQL script that deploys this database, sets up the Agent jobs that do the data collection for the monitoring, and configures Database Mail so the jobs can send me notifications when things aren't right. You can find a lot of this in the Automated DBA series of articles I wrote a few years back.
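Pushing a single deployment script to every instance is itself a one-loop job. A minimal sketch, assuming the sqlps snap-in (which provides Invoke-Sqlcmd) is loaded, servers.txt holds the instance list, and deploy_sqladmin.sql is a hypothetical all-in-one script that creates the database, Agent jobs, and Database Mail configuration:

```powershell
# Sketch: deploy the admin database script to every instance.
# servers.txt and deploy_sqladmin.sql are placeholders; Invoke-Sqlcmd
# handles the GO batch separators that raw SqlClient cannot.
foreach ($instance in Get-Content .\servers.txt) {
    try {
        Invoke-Sqlcmd -ServerInstance $instance -Database master `
            -InputFile .\deploy_sqladmin.sql
        Write-Host "$instance deployed"
    }
    catch {
        Write-Warning "$instance failed: $_"
    }
}
```

The try/catch keeps one unreachable server from stopping the rollout, and the warnings give you a punch list of instances to revisit.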
Hope that helps. It's a lot, I know, but when I took over my current position there were 200+ SQL Servers in the environment, none of them configured to best practices, all of them full of problems, so I can feel your pain. A year later I have been able to consolidate down to 132 SQL Servers and fix a lot of problems. PowerShell and automation are your friends here. You don't have time to look at every server once a month; you need the servers to tell you when something's not right. The other thing I'd tell you is to be willing to tell your employer you need help. I was hired into a single-DBA position and over a year built a team of three, with a fourth position open. Look at places where you could save money for your employer by having additional resources; time is money. Feel free to contact me if you have any questions.
While I use Operations Manager for a lot of basic monitoring tasks, I rely more heavily on my own automation scripts for the day-to-day management of my environment. Managing a very large SQL Server environment is nearly impossible if you don't take steps to automate your processes and operations. For many of the tasks that can be automated, a lot of good code already exists online, minimizing the work you would have to do to cover the most common ones.