THE SQL Server Blog Spot on the Web

Linchi Shea

Checking out SQL Server via empirical data points

SAN vs. Disk Arrays: It goes a long way to be slightly more specific!

In the SQL Server communities, it's common to hear people talking about HP SAN, EMC SAN, 3Par SAN, and so on as if there were such things as HP SAN, EMC SAN, etc.

Technically, SAN stands for Storage Area Network, but can be, and has been, used in two different ways. First, outside the storage communities, people often view everything beyond the drive at the OS level as the SAN with no regard to how that SAN is architected or configured as long as that drive is presented from some kind of SAN infrastructure. Typically, this is the way SQL Server professionals talk about SAN.

Within the storage communities, the interpretation is often different. To a storage engineer, there is a distinction between SAN and disk arrays: SAN is the network fabric that is made up of switches that provide your host a point-to-point path/link to the disks on some disk arrays. Disk arrays are the storage devices where physical disks are pooled together and managed by sophisticated software and additional controller hardware.

I have consciously tried to stick to the second interpretation and have refrained from using terms such as HP SAN or EMC SAN, primarily because this way of speaking adds no value, introduces inaccuracies, and can be misleading.

First of all, even if you interpret SAN broadly, it’s still probably wrong, or inaccurate, to speak of EMC SAN or IBM SAN, for instance, because it’s highly unlikely that the SAN environment is made up entirely of EMC or IBM devices (I have not seen one anyway). The disk arrays may be from EMC or IBM, but the switches are often from Brocade or Cisco, or a mix of switches from different vendors. SAN is not a monolithic piece.

Although SAN in terms of the switch fabric can be a limiting factor, especially when it comes to throughput, disk arrays are often where the disk I/O performance is determined, especially when it comes to disk I/O latency. Identifying the disk array from which a LUN is carved puts us in a better position in our performance analysis.

Even if we stop being ‘picky’ and interpret EMC SAN to mean an EMC disk array behind a SAN fabric, the phrase is not specific enough to add as much value as identifying it as an EMC DMX-3 disk array, for instance. Note that there can be many different models, makes, and versions of disk arrays behind a SAN, and their performance characteristics can be vastly different. It does not help to say that your drive is presented from an EMC SAN because two drives presented from the same SAN can perform very differently even when they are backed by the same number of physical disks, the same type of disks, and the same RAID configuration. Many factors of the disk array heavily influence the performance of LUNs carved from the disk array.

Although different LUNs can be carved from the same disk array and these LUNs can be configured to have different performance characteristics, identifying the disk array allows one to understand why two LUNs of the same configuration have different performance characteristics, or at least gives one an opportunity to look for more information, and make the conversations easier with the storage folks.
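As a rough illustration of the point, here is a minimal Python sketch of the kind of detail worth recording about a LUN before discussing its performance. Everything in it is hypothetical: the LunDescription record, its field names, and the sample values (including the placeholder model name "another-model") are assumptions made up for illustration, not output from any storage tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class LunDescription:
    """Hypothetical record of what to note about a LUN (illustrative fields only)."""
    drive_letter: str
    array_vendor: str
    array_model: str
    raid_level: str
    disk_count: int
    fabric_switch_vendors: Tuple[str, ...] = ()

    def vague_label(self) -> str:
        # The phrasing argued against above: vendor name plus "SAN".
        return f"{self.array_vendor} SAN"

    def array_label(self) -> str:
        # Names the disk array itself, which is what the storage folks need.
        return f"{self.array_vendor} {self.array_model}"

    def specific_label(self) -> str:
        # Spells out the array, the RAID layout, and the fabric separately.
        switches = "/".join(self.fabric_switch_vendors) or "unknown"
        return (
            f"{self.drive_letter}: LUN on {self.array_label()} array, "
            f"{self.raid_level}, {self.disk_count} disks, via {switches} fabric"
        )


def group_by_label(
    luns: List[LunDescription],
    label: Callable[[LunDescription], str],
) -> Dict[str, List[str]]:
    """Group drive letters by whatever label function is supplied."""
    groups: Dict[str, List[str]] = {}
    for lun in luns:
        groups.setdefault(label(lun), []).append(lun.drive_letter)
    return groups


# Made-up sample data: same vendor, same RAID level, same disk count,
# but two different disk arrays behind the fabric.
luns = [
    LunDescription("E", "EMC", "DMX-3", "RAID 10", 8, ("Brocade",)),
    LunDescription("F", "EMC", "another-model", "RAID 10", 8, ("Brocade", "Cisco")),
]

print(group_by_label(luns, LunDescription.vague_label))
print(group_by_label(luns, LunDescription.array_label))
for lun in luns:
    print(lun.specific_label())
```

Grouped by the vague label, both drives collapse into a single "EMC SAN" bucket and look interchangeable. Grouped by the array label, they fall into separate buckets, which is the first thing to check when two LUNs of the same configuration perform differently.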

Saying that a drive is from an EMC SAN is simply too generic to be useful, even if we are willing to ignore the fact that there is no such thing as an EMC SAN.

Published Friday, June 26, 2009 6:45 PM by Linchi Shea

Grumpy Old DBA said:

This is in part due to the fact that most storage vendors sell monolithic storage solutions which are really networked storage rather than a storage area network. I suppose we are more inclined to consider the storage vendor for the name, so to speak. It may also be that a vendor's management software is often proprietary, thus not allowing a mixed environment. I find many people don't really understand the base concept of a storage area network - but in the end does it truly matter for the majority?

July 2, 2009 11:45 AM

Linchi Shea said:

In the end, it matters greatly in terms of whether one is actually conveying any useful information. Even if the disk arrays are from the same manufacturer, they vary greatly in their performance characteristics. The phrase "EMC SAN" is similar to the phrase "HP server". There are limited settings in which it's okay to just mention that your SQL instance runs on an HP server. But for most discussions (especially discussions on performance), such a phrase is completely devoid of meaning, and adds absolutely no value to the discussions. You might just as well state that the SQL instance runs on a server.

But then, when a storage person starts to mention databases, he probably doesn't care about whether it's SQL Server, Oracle, or DB2, or whether it's SQL2008 Enterprise Edition or SQL2000 MSDE. To him, it's just a database, or just a SQL Server database. And when he starts to make statements like 'SQL Server sucks because it only supports 2GB of RAM', those folks who are content to use phrases like 'EMC SAN' or 'HP SAN' and pass judgements or make loaded remarks on these "SANs" probably should not feel too offended.

July 2, 2009 10:46 PM

Grumpy Old DBA said:

Agreed. However, in the case where I was (and still am) "testing", the way in which the storage is presented to the SQL Server differs greatly from vendor to vendor (I am unable to mention names). It is my view that the way a particular vendor presents its storage has a direct impact on SQL Server performance, so in this case the vendor is relevant. But yes, you're totally correct in what you're saying, although I'd hope the basics of the switches and HBAs would not have too much impact. In my case the data centre is externally managed, so information is not always forthcoming, and despite my using a number of your blog posts as a reference point on such matters as HBA queue depth, there was still what I can only describe as disbelief that I might question any part of the storage setup. Thank you for your continued posts.

July 7, 2009 4:39 AM
