It never fails to surprise me how rarely filegroups are leveraged in client installations. I'm also surprised at how many SQL Server DBAs are new to the concept of filegroups whenever I teach or give a presentation that includes them. For many of you, this may seem simplistic, but I've decided there's enough need to spend some blog space explaining the usage and extolling the benefits of filegroup usage. I find that filegroups provide many benefits, both in performance and backup/recovery, or simply just organization. For this post I'm going to focus just on performance.
Just to cover the basics and make sure we're all on the same page (hey, I might not be - happens all the time):
1) Filegroups are essentially logical database constructs that organize physical data files together
2) Objects are created and stored on filegroups (but not necessarily one particular file within that filegroup)
3) SQL Server (all versions thusfar) have an initial filegroup called PRIMARY. That name can not be changed. PRIMARY also must hold the .MDF file, which contains metadata about the database. It can not be removed either. Since the .MDF file holds critical information about the database itself, mixing that data with production data can be dangerous. The larger a file gets (in this case the .MDF), the more likely it is to have some sort of corruption (simply by virtue of additional read/writes to a single large file). Personally I like to isolate the .MDF and the PRIMARY filegroup altogether.
4) One filegroup can have the "Default" property, which means if you don't specify where you're creating your table, index, etc. (with the clause ON [Filegroup_name] at the end of the CREATE statement), the object will be created on that "Default" filegroup.
I often find that while there may be multiple files in a SQL database, there is usually only one filegroup (PRIMARY). For smaller databases (less than 50GB) that could be alright, but for larger databases simply using PRIMARY as the only filegroup, the potential for improved database performance is significant. While this does of course depend on how the I/O subsystem is configured (splitting up objects amongst filegroups may not improve things if everything runs on a single RAID5 array).
My personal preference for configuring initial filegroups for any system is:
PRIMARY (This can't be changed) on minimal I/O path (not much I/O needed)
DATA on a separate I/O path; DATA gets the DEFAULT filegroup property
INDEX on a separate I/O path
On PRIMARY I place no additional files other than the .MDF file (which has to be in the PRIMARY filegroup), and no additional data goes into the .MDF file. It stays very small and isn't used for user data.
On DATA I place all of my clustered indexes or (if I have any) heaps. Since this is also the Default filegroup, if I accidentally forget the "ON [DATA]" clause at the end of my object creation statements, they'll go here instead of the PRIMARY filegroup (the reason for that is for a later post).
On the INDEX filegroup I place all of my non-clustered indexes.
This of course assumes that DATA and INDEX are both stored on separate I/O paths. As long as they are, I/O resources spent retrieving data from tables won't interfere with I/O searching through indexes. I/O resources spent performing updates, inserts and deletes will be split between the tables and the indexes. Will you see double the performance? Of course not. But you should see a measurable and visible increase in performance vs storing everything on the PRIMARY filegroup.
Yes, you could have just the PRIMARY filegroup and place data files on different I/O paths, but there's no way to guarantee that one object will be in one file. My approach here is a simplistic use of filegroups, but I have yet to see it slower than having a single PRIMARY filegroup. If you wanted to get even fancier, you could have a separate filegroup for seldom accessed data and put it on a separate, slower I/O path (RAID1). You could also place commonly join tables on different filegroups, each on separate I/O paths.
The bottom line is - filegroups - use them!!!!