It’s interesting to me that old and inaccurate performance tuning recommendations seem to have a life of their own. In some ways, old performance tuning recommendations are like the Undead from some kind of cheesy, 1970’s zombie movie – no matter how many times you shoot them, they just keep coming back.
Here’s a good case in point, http://msdn2.microsoft.com/en-us/library/ms345118.aspx, a white paper discussing Performance Optimizations for the XML data type in SQL Server 2005. The document states:
Multiple tempDB Files for Better Scalability of XML Variables and Parameters
XML variables and parameters use main memory as storage as long as their values are small. Large values, however, are backed by tempdb storage. In a multi-user scenario, if many large XML blobs occur, tempdb contention may become a bottleneck for good scalability. Creating multiple tempdb files reduces the storage contention and yields significantly better scalability. The next example illustrates how multiple tempdb files can be created.
Example: Creating Multiple tempdb Files
This example creates two additional data files for tempdb, each with an initial size of 8 MB, and two log files with an initial size of 1 MB.
Copy Code
USE TEMPDB
GO
ALTER DATABASE tempdb ADD FILE
(NAME = 'Tempdb_Data1',
FILENAME = 'C:\temp\Tempdb_Data1.MDF', SIZE = 8 MB),
(NAME = 'Tempdb_Data2',
FILENAME = 'C:\temp\Tempdb_Data2.MDF', SIZE = 8 MB)
GO
ALTER DATABASE tempdb ADD log FILE
(NAME = 'Tempdb_Log1',
FILENAME = 'C:\temp\Tempdb_Log1.LDF', SIZE = 1 MB),
(NAME = 'Tempdb_Log2',
FILENAME = 'C:\temp\Tempdb_Log2.LDF', SIZE = 1 MB)
GO
These files can be removed by using the ALTER DATABASE tempdb REMOVE FILE command. For more information, see SQL Server Books Online.
There was a time when this recommendation made sense. The intent behind multiple tempdb files prior to SQL Server 2005 was to avoid GAM contention on very high throughput scenarios. Later, SQL Server 2005 introduced segmented GAMs even within a single tempdb file.
This recommendation still holds for data files, especially on systems using NUMA processors, where IO should be aligned with one data file per NUMA socket. (It also does no harm on an SMP system to organize your data files this way, hence the standardized recommendation.)
However, the recommendation fails when you get to the log portion of the equation. Why? It’s because data file IO is written using the proportional file algorithm where each data file has data written to it in round-robin style. On the other hand, log files are written using the active file algorithm where LogFile1 is written first until full, then LogFile2 is written next until full, and so on… Long story made short (too late, I know) – you get no performance gain from having multiple files in the log because all writes occur on only one file. You can only get a performance gain from multiple files on the data portion of a database.
As an aside, Simon Sabin (http://sqlblogcasts.com/blogs/simons/) has a wonderful collection of addition myths, misunderstandings, and miscommunications. I’m not sure if they’re put together in a single blog post, so put his blog on your watch list.
Cheers,
-Kev
Twitter @ kekline
P.S. Thanks to Tony Rogerson (http://sqlblogcasts.com/blogs/tonyrogerson) and Geoff Hiten for the inspiration on this blog post, btw.
P.P.S. I like how the MSDN article refers to SQL Server 2005 as SQL Server 9.0. ;^)