Migrating an Instance of SQL Server to New Disks
I get to do something pretty entertaining this week – migrate SQL instances on a 2008 cluster from one disk array to another! Zut alors! I am so excited I can hardly contain myself, so let’s get started. (Only a DBA could love this stuff, am I right? I know.) Anyway, here’s one method of many to migrate your data.
Assumption: this is a host-based migration, which just means I’m using the Windows file system to push the data from one set of SAN disks to another. Many SAN vendors have technology that will allow a SAN-based migration, where the SAN software transparently relocates all the data for you. For large sets of data, that’s often better. If you have that option, you should check it out. If not, read on, mon ami/e.
We have two disk arrays, let’s call them “old” and “new” (though, for reasons beyond the scope of this post, that isn’t technically accurate, it will make this much simpler to read). I have a two-node Windows Server 2008 Failover Cluster, with a SQL instance installed that uses five LUNs:
Root (a small 1GB disk that just provides a drive letter to host mount points for the four following disks)
In addition, the cluster has a Quorum disk (two node cluster, even number of machines, yadda yadda yadda) and there’s a Distributed Transaction Coordinator, which also has a disk for its log files.
Task: take all seven disks from the disk array “ye Olde Arraye” to the disk array “ye Shiney New Arraye.” We are allowed enough down time to copy the data and log files while the instance is offline.
Disclaimer: This is just an outline of a process, and there are almost certainly details of your environment that I cannot know. Please test carefully and take time to understand what you are doing before attempting this procedure. Rehearse and refine this for your environment before trying it in production.
Here we go:
- Have the SAN administrator present all the required LUNs from the new array to the servers that compose the cluster
- Connect to one of the cluster nodes
- In Disk Manager, locate the new disks and correlate them to the SAN LUNs (say, Disk 12 = root, Disk 14 = TempDB, whatever.) This depends somewhat on the vendor for your storage. In my case it’s EMC, so PowerPath has the LUN labels and disk numbers. I generally make notes of this relationship to refer back to in later steps.
- Identify which drive letters are unused/available on the cluster (Carefully!)
- For each new disk, use Disk Management to bring the disk online, then initialize it, and format it. Give every disk a drive letter from the supply of unused letters, even if the disk will eventually be connected via a mount point. Also label each volume clearly with the information from #3 so you can see what’s what later.
- Beware this cluster “gotcha.” Mount points are a beautiful thing on a SQL cluster, but it seems they have yet to be blessed with all the testing and kink-working-out that traditional drive letters have. Because of this, the clustering tools can bite you in the following way (I hope this is just a bug and will be fixed): You can bring a disk online, initialize and then format it with NO drive letter, which seems perfectly reasonable if the disk will be connected with a mount point later. However, when you use the cluster tools to create a disk resource for the disk, it will helpfully assign a drive letter. That wouldn’t be so bad, but it often uses a letter that’s already in use for one of your existing disks. That is not good, not at all. By temporarily assigning some arbitrary, available letter, we can prevent the cluster tools from creating a conflict with an existing disk.
We now have a collection of formatted disks, mounted by temporary drive letters. Next it’s time to hand them over to the cluster.
Give control of the disks to the cluster
- Open the Failover Cluster Manager
- Use the Add Disks task to add the new disks. This should create disk resources in the “available storage” group, with auto-generated names like Cluster Disk 1, Cluster Disk 2, etc. This is where the temporary drive letters are important, because this tool will use the drive letters you define, if they are present, but it will assign letters automatically if not.
- Use Right-Click > Properties on each new disk resource to change the name of the resource to something clear, again from step 3 above. I am obsessive about clear naming. Seriously – it’s worth it.
- If your system uses mount points, as mine does, then at this point arrange the required directories and mount points:
- Use Windows Explorer to make the empty directories required on your new SQL root disk
- Use Disk Management > Change Drive Letters or Paths to remove the temporary drive letters from the other, non-root, new SQL Server disks (data, log, backup) and replace them with mount points that use the empty directories. You should end up with a folder structure and mount points that match your existing SQL disk exactly, but where the root has a different letter (say, N: instead of E:)
- Why does this work? I’m glad you asked! This works because the drive letter/mount point mapping for a windows disk isn’t what the cluster uses to identify the disk. The cluster uses a unique identifier that is physically written onto each disk volume to identify and manage it (like a GUID, but not in the usual format of a GUID). The drive letter or mount point is just an OS construction overlaid after the disk is attached to the host. So when the disk is presented to the host and the cluster service is identifying it, and running through the rules about whether it should be mounted or not, the drive letter or mount point for access to the disk is completely irrelevant. That means we can change it. Cool, no?
Migrate the cluster quorum disk (if you have one)
- This is super simple: in the Failover Cluster Manager, right click the cluster itself in the tree and use the tool to reconfigure the cluster quorum, choosing a new disk: http://technet.microsoft.com/en-us/library/cc770620(WS.10).aspx > Steps for changing the quorum configuration in a failover cluster.
- We generally have a convention to use a drive “Q” for the quorum disk. To preserve the drive letter, at this point you can optionally restore the old drive letter to use for the new disk:
- In Failover Cluster Manager, locate the two disk resources for the quorum, the old one and the new one
- Right-click the old one and choose Change Drive Letter. Set it to <none>.
- Right-click the new disk and change the drive letter to the letter from the old disk.
Migrate the Distributed Transaction Coordinator
Note: with Server 2008, the configuration of Distributed Transaction Coordinator can be complicated, and can follow a variety of designs. These instructions are for the simplest case, where you have just one DTC resource in a group with a virtual name, IP address and disk resource. For more complex scenarios STOP HERE and look at MS documentation about all the possible designs.
Note: this will take the DTC offline for a short time, so if it’s used in your environment this might require a maintenance window and down time.
- Locate the existing MSDTC group. In its resources, open the properties for the name and the IP address of the DTC. Make a careful note of these values.
- Delete the DTC group
- Use the Configure a Service or Application wizard to recreate the DTC resources. Give the same name and IP address, but use the new disk. Optionally “fix” the drive letter assignment, as above, to use the same letter as the old DTC disk
Migrate all the SQL Server data directories
Last but not least, we pull the ol’ switcheroo for the data itself. This does require down time, just long enough to copy the data and log files from the old disks to the new. (If downtime is a problem, consider revising this process to use log shipping to “prime” the user databases on the new disks, so that cutover time is minimal.)
- In Failover Cluster Administrator, locate the new SQL disks in the Available Storage group
- Move each of the new SQL disk resources into your SQL Server group alongside the existing disks
- Take the SQL Server Agent service offline, and the Full Text Index service if you have one
- Take the SQL Server service offline. This will unlock all the data and log files so that they can be copied.
- Exchange the drive letters of the new root SQL disk and the old root SQL disk. For example, if the old disk was E:\, set it to something like O:\ and set the new disk to E:\. Unlike the cluster disk management function, the SQL Server will look for its files by drive letter. That means that later, when we fire SQL Server back up, it will attach data and log files using the same drive letter and paths as before – but that path will now reside on the new disks.
- Run a command prompt as Administrator
- Important: Use a tool like XCOPY that can retain permissions and ownership of files to copy the files. See XCOPY /? options, ACLS, ownership, file attributes. For example
At this point all the required MDF, NDF, LDF, etc. files for all your databases should be available on the new disk, but identified by the same drive letter as before. The paths should be identical. Back in Failover Cluster Manager, start the SQL Server service back up Start the SQL Agent and any other downstream services.
- XCOPY <OldDisk>:\SQLData\*.* <NewDisk>:\SQLData /O /K /E
- XCOPY <OldDisk>:\SQLLog\*.* <NewDisk>:\SQLLog /O /K /E
That basically gets the data across and back online. Don’t forget this last step, though, our you might be in for a surprise later: go through all the disks and services in the group and fix the dependencies, such that the SQL Service depends on the new disks instead of the old ones, and for mount points, the “children” depend on the root disk. Once those are resolved, the old disks can be retired.
I hope this is useful for anyone out there who is new to clusters and SAN storage. Takeaways: You don’t have to change the drive letters when performing a disk migration (that is, add a new disk with a new drive letter and try to reconfigure SQL Server to change drive letters). It’s actually quite painful and problematic to do that. It’s much simpler to just exchange the disk “out from under” SQL Server and keep the same path(s).
There are, I am sure, even more elegant ways to do this, so please share if you have more experience than I do in this area.