<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www2.sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tag 'Internals'</title><link>http://www2.sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=Internals&amp;orTags=0</link><description>Search results matching tag 'Internals'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>Squishy Limits in SQL Server Express Edition</title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2013/03/28/squishy-limits-in-sql-server-express-edition.aspx</link><pubDate>Thu, 28 Mar 2013 12:19:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48447</guid><dc:creator>KKline</dc:creator><description>&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;It's an old story you've probably heard before. &amp;nbsp;Provide a free version of your software product with strict limitations on performance or other specific capabilities so that folks can give it a try without risk, while you minimize the chance of&amp;nbsp;cannibalizing&amp;nbsp;sales of your commercial products. &amp;nbsp;Microsoft has taken this strategy with&amp;nbsp;&lt;a href="http://www.microsoft.com/en-us/sqlserver/editions/2012-editions/express.aspx"&gt;SQL Server Express Edition&lt;/a&gt;, not only to increase adoption in the student market but also to counter the threat of open-source (i.e. 
free) relational databases like MySQL for entry-level applications.&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;One such limitation of SQL Server Express Edition is that it supports no more than 1GB of RAM for the instance. &amp;nbsp;Of course, you could have many Express Edition instances on a single Windows server, each with its own 1GB of RAM.&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;But what does that metric of 1GB of RAM actually mean? &amp;nbsp;The key thing to remember is that the restriction is for&amp;nbsp;&lt;em&gt;&lt;strong&gt;buffer&lt;/strong&gt;&lt;strong&gt;&amp;nbsp;cache.&amp;nbsp;&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt;&amp;nbsp;&lt;/strong&gt;SQL Server has plenty of other caches besides the buffer cache, even before you count the plan cache. &amp;nbsp;(Run a query against&amp;nbsp;&lt;em&gt;sys.dm_os_memory_clerks&lt;/em&gt;&amp;nbsp;if you'd like to see some of the others). 
&amp;nbsp;Because only the buffer cache has the strict 1GB limitation, you can actually watch SQL Server Express Edition's memory working set size grow to around 1.4-1.5GB due to the other memory caches at play.&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;Pawel Potasinski, a SQL Server MVP from Poland (&lt;a href="http://twitter.com/pawelpotasinski"&gt;Twitter&lt;/a&gt;&amp;nbsp;|&amp;nbsp;&lt;a href="http://sqlgeek.pl/"&gt;Blog&lt;/a&gt;), once&amp;nbsp;&lt;a href="http://sqlgeek.pl/2010/08/23/pl-sql-server-limity-w-sql-server-2008-r2-express-edition/"&gt;posted an interesting repro&lt;/a&gt;&amp;nbsp;for this behavior:&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;padding-left:30px;"&gt;&lt;span style="font-family:Consolas, Monaco, monospace;font-size:12px;line-height:18px;"&gt;-- Assess amount of databases resident in buffer cache&lt;/span&gt;&lt;/p&gt;&lt;pre style="font-size:12px;line-height:18px;font-family:Consolas, Monaco, monospace;padding-left:30px;"&gt;SELECT
 CASE
 WHEN database_id = 32767 THEN 'mssqlsystemresource'
 ELSE DB_NAME(database_id)
 END AS [Database],
 CONVERT(numeric(38,2),(8.0 / 1024) * COUNT(*)) AS [MB in buffer cache] 
FROM sys.dm_os_buffer_descriptors 
GROUP BY database_id 
ORDER BY 2 DESC; 
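GO

-- Added sketch, not part of the original repro: show how much physical memory
-- the whole SQL Server process is using, for comparison with the buffer cache
-- totals above. Assumes SQL Server 2008 or later, where the
-- sys.dm_os_process_memory view is available.
SELECT
 CONVERT(numeric(38,2), physical_memory_in_use_kb / 1024.0) AS [Process MB in use]
FROM sys.dm_os_process_memory;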
GO&lt;/pre&gt;&lt;pre style="font-size:12px;line-height:18px;font-family:Consolas, Monaco, monospace;padding-left:30px;"&gt;-- Assess amount of tables resident in buffer cache
SELECT
 QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id)) + '.' +
 QUOTENAME(OBJECT_NAME(p.object_id)) AS [Object],
 CONVERT(numeric(38,2),(8.0 / 1024) * COUNT(*)) AS [MB In buffer cache] 
FROM sys.dm_os_buffer_descriptors AS d 
 INNER JOIN sys.allocation_units AS u ON d.allocation_unit_id = u.allocation_unit_id 
 INNER JOIN sys.partitions AS p ON (u.type IN (1,3) AND u.container_id = p.hobt_id) OR (u.type = 2 AND u.container_id = p.partition_id) 
WHERE d.database_id = DB_ID() 
GROUP BY QUOTENAME(OBJECT_SCHEMA_NAME(p.object_id)) + '.' + QUOTENAME(OBJECT_NAME(p.object_id))
ORDER BY 2 DESC;
GO&lt;/pre&gt;&lt;pre style="font-size:12px;line-height:18px;font-family:Consolas, Monaco, monospace;padding-left:30px;"&gt;-- Fill up Express Edition's buffer allocation
IF OBJECT_ID(N'dbo.test', N'U') IS NOT NULL
 DROP TABLE dbo.test;
GO&lt;/pre&gt;&lt;pre style="font-size:12px;line-height:18px;font-family:Consolas, Monaco, monospace;padding-left:30px;"&gt;CREATE TABLE dbo.test (col_a char(8000));
GO&lt;/pre&gt;&lt;pre style="font-size:12px;line-height:18px;font-family:Consolas, Monaco, monospace;padding-left:30px;"&gt;INSERT INTO dbo.test (col_a)
 SELECT REPLICATE('col_a', 8000)
 FROM sys.all_objects 
 WHERE is_ms_shipped = 1;&lt;/pre&gt;&lt;pre style="font-size:12px;line-height:18px;font-family:Consolas, Monaco, monospace;padding-left:30px;"&gt;CHECKPOINT; 
GO 100&lt;/pre&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;padding-left:30px;"&gt;&lt;em&gt;&amp;nbsp;The bottom line for the hard memory limit of SQL Server Express Edition is "Yes, it's limited. &amp;nbsp;But it's a squishy limit. Not a hard limit."&lt;/em&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="line-height:19px;"&gt;Although your mileage may vary, I'd bet a dollar that you'll find more than 1GB in the active working set for your instance of SQL Server Express Edition. &amp;nbsp;I am curious, however, if you're seeing much variation between versions and even service packs of SQL Server? &amp;nbsp;Let me know if you try this out on more than one version and/or service pack level of SQL Server. &amp;nbsp;Did it change much between versions? 
&amp;nbsp;Let me know!&lt;/span&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;Enjoy,&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;-Kevin&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;a href="http://twitter.com/kekline"&gt;-Follow me on Twitter!&lt;/a&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;a href="http://twitter.com/kekline"&gt;&lt;/a&gt;&lt;br&gt;&lt;a href="https://plus.google.com/u/1/113032055249023350257?rel=author"&gt;Google Author&lt;/a&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>Corruption case</title><link>http://www2.sqlblog.com/blogs/michael_zilberstein/archive/2013/03/21/48339.aspx</link><pubDate>Thu, 21 Mar 2013 23:49:32 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48339</guid><dc:creator>mz1313</dc:creator><description>&lt;p&gt;Recently I had to take care of the most interesting corruption case I’ve ever seen, so I decided to share the experience with you. We’re talking about a small accounting program which keeps its data in SQL Server Express – in this particular case in SQL Server 2005. 
The customer called today and sent me the following error screen (nice screenshot – taken with a cellular phone camera &lt;img style="border-bottom-style:none;border-left-style:none;border-top-style:none;border-right-style:none;" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://sqlblog.com/blogs/michael_zilberstein/wlEmoticon-smile_11915A7C.png" /&gt;):&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/michael_zilberstein/image_364271F3.png"&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/michael_zilberstein/image_thumb_0BA69116.png" width="412" height="114" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Upon connecting to the server I immediately noticed dumps with the same error. Here is the entire error message:&lt;/p&gt;  &lt;p&gt;&lt;font size="1"&gt;&lt;em&gt;A time-out occurred while waiting for buffer latch -- type 2, bp 04268450, page 1:804, stat 0xc00009, database id: 5, allocation unit Id: 72057594108248064, task 0x00A186B8 : 0, waittime 300, flags 0x1a, owning task 0x00A0A4D8. 
Not continuing to wait.&lt;/em&gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;a href="http://mssqlwiki.com/2012/09/07/latch-timeout-and-sql-server-latch/"&gt;This article&lt;/a&gt; was extremely helpful in analyzing this dump with WinDbg tool – flow and somewhat cryptic commands described there easily pinpointed the guilty thread and its call stack:&lt;/p&gt;  &lt;p&gt;&lt;font size="1"&gt;sqlservr!LatchBase::AcquireInternal      &lt;br /&gt;sqlservr!BUF::AcquireLatch       &lt;br /&gt;sqlservr!BPool::Get       &lt;br /&gt;sqlservr!PageRef::Fix       &lt;br /&gt;sqlservr!IndexPageManager::GetPageForLinkModification       &lt;br /&gt;sqlservr!RemoveBTreePageIfUnchangedInternal       &lt;br /&gt;sqlservr!RemoveBTreePageIfUnchanged       &lt;br /&gt;sqlservr!CleanVersionsOnBTreePage       &lt;br /&gt;sqlservr!IndexDataSetSession::CleanupVersionsOnPage       &lt;br /&gt;&lt;/font&gt;&lt;font size="1"&gt;&lt;font color="#ff0000"&gt;sqlservr!GhostExorciser::CleanupPage        &lt;br /&gt;sqlservr!TaskGhostCleanup::ProcessTskPkt         &lt;br /&gt;sqlservr!GhostRecordCleanupTask         &lt;br /&gt;sqlservr!CGhostCleanupTask::ProcessTskPkt         &lt;br /&gt;sqlservr!TaskReqPktTimer::ExecuteTask         &lt;br /&gt;sqlservr!OnDemandTaskContext::ProcessTskPkt         &lt;br /&gt;sqlservr!SystemTaskContext::ExecuteFunc         &lt;br /&gt;sqlservr!SystemTaskEntryPoint         &lt;br /&gt;sqlservr!OnDemandTaskContext::FuncEntryPoint         &lt;br /&gt;sqlservr!SOS_Task::Param::Execute         &lt;br /&gt;&lt;/font&gt;sqlservr!SOS_Scheduler::RunTask       &lt;br /&gt;sqlservr!SOS_Scheduler::ProcessTasks       &lt;br /&gt;sqlservr!SchedulerManager::WorkerEntryPoint       &lt;br /&gt;sqlservr!SystemThread::RunWorker       &lt;br /&gt;sqlservr!SystemThreadDispatcher::ProcessWorker       &lt;br /&gt;sqlservr!SchedulerManager::ThreadEntryPoint       &lt;br /&gt;msvcr80!_callthreadstartex       &lt;br /&gt;msvcr80!_threadstartex       &lt;br /&gt;&lt;/font&gt;&lt;/p&gt;  
&lt;p&gt;The highlighted part of the call stack indicates that the Ghost Cleanup process caused this failure. Actually, I had a similar encounter with a stuck ghost cleanup several years ago. The easiest way to verify that it is still stuck is to execute the DBCC CHECKDB command. Indeed, CHECKDB turned out to be blocked by the ghost cleanup session.&lt;/p&gt;  &lt;p&gt;What’s next? You can’t kill a system session. Indeed you can’t, but you can start SQL Server without it – using &lt;a href="http://support.microsoft.com/kb/920093"&gt;trace flag 661&lt;/a&gt; as a startup parameter (don’t forget to remove the flag and restart the service again afterwards). So, after restarting the service, rebuilding the index in question, removing the trace flag and restarting the service again, I already thought that &lt;strike&gt;I’ve earned my beer&lt;/strike&gt; the database was fixed. &lt;/p&gt;  &lt;p&gt;Not so fast. Now DBCC CHECKDB runs to completion, but the results are very, very red, including interesting messages like:&lt;/p&gt;  &lt;p&gt;&lt;font color="#ff0000" size="1"&gt;Msg 8992, Level 16, State 1, Line 1      &lt;br /&gt;Check Catalog Msg 3853, State 1: Attribute (object_id=1575689407) of row (object_id=1575689407,column_id=1) in sys.columns does not have a matching row (object_id=1575689407) in sys.objects.       &lt;br /&gt;Msg 8992, Level 16, State 1, Line 1       &lt;br /&gt;Check Catalog Msg 3853, State 1: Attribute (object_id=1575689407) of row (object_id=1575689407,column_id=2) in sys.columns does not have a matching row (object_id=1575689407) in sys.objects.       &lt;br /&gt;Msg 8992, Level 16, State 1, Line 1       &lt;br /&gt;Check Catalog Msg 3853, State 1: Attribute (object_id=1575689407) of row (object_id=1575689407,column_id=3) in sys.columns does not have a matching row (object_id=1575689407) in sys.objects.       
&lt;br /&gt;Msg 8992, Level 16, State 1, Line 1       &lt;br /&gt;Check Catalog Msg 3853, State 1: Attribute (object_id=1575689407) of row (object_id=1575689407,index_id=0) in sys.indexes does not have a matching row (object_id=1575689407) in sys.objects.       &lt;br /&gt;Msg 8992, Level 16, State 1, Line 1       &lt;br /&gt;Check Catalog Msg 3855, State 1: Attribute (data_space_id=1) exists without a row (object_id=1575689407,index_id=0) in sys.indexes.&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Nice, uh? After first wave of shock has passed, I checked and found out that indeed object with that id doesn’t exist. So it seems that all we need to do is to delete 3 rows from sys.columns and 1 row from sys.indexes. Ah, but those sys.something objects are views, aren’t they? And what are the real objects? The way to find real – internal – tables and columns is via execution plans:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/michael_zilberstein/image_043AAB9C.png"&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/michael_zilberstein/image_thumb_548C6742.png" width="736" height="606" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;So actually we need to delete rows from sys.syscolpars and from sys.sysidxstats (notice that column names are also different). How do we do it? Let’s try DAC (Dedicated Admin Connection)? No way – Express Edition &lt;a href="http://msdn.microsoft.com/en-us/library/ms189595(v=sql.90).aspx"&gt;doesn’t support DAC&lt;/a&gt;. 
Unless… unless we use &lt;a href="http://msdn.microsoft.com/en-us/library/ms188396(v=sql.90).aspx"&gt;trace flag 7806&lt;/a&gt; as a startup parameter.&lt;/p&gt;  &lt;p&gt;Restart the server again, connect using DAC, try to delete the rows… Oops,&lt;/p&gt;  &lt;p&gt;&lt;font color="#ff0000" size="1"&gt;&lt;em&gt;Msg 259, Level 16, State 1, Line 1        &lt;br /&gt;Ad hoc updates to system catalogs are not allowed.&lt;/em&gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;But for this we have &lt;a href="http://www.sqlskills.com/blogs/paul/using-the-dedicated-admin-connection-to-fix-msg-8992-corrupt-system-tables/"&gt;Paul Randal’s instructions&lt;/a&gt;. So: “&lt;em&gt;sqlservr.exe -sInstanceName -m -T661 -T7806&lt;/em&gt;”, then “&lt;em&gt;sqlcmd -S.\InstanceName -A&lt;/em&gt;” and finally…&lt;/p&gt;  &lt;div style="border-bottom:silver 1px solid;text-align:left;border-left:silver 1px solid;padding-bottom:4px;line-height:12pt;background-color:#f4f4f4;margin:20px 0px 10px;padding-left:4px;width:44.79%;padding-right:4px;font-family:'Courier New', courier, monospace;direction:ltr;height:127px;max-height:200px;font-size:8pt;overflow:auto;border-top:silver 1px solid;cursor:text;border-right:silver 1px solid;padding-top:4px;" id="codeSnippetWrapper"&gt;   &lt;div style="border-bottom-style:none;text-align:left;padding-bottom:0px;line-height:12pt;background-color:#f4f4f4;border-left-style:none;padding-left:0px;width:100%;padding-right:0px;font-family:'Courier New', courier, monospace;direction:ltr;border-top-style:none;color:black;border-right-style:none;font-size:8pt;overflow:visible;padding-top:0px;" id="codeSnippet"&gt;     &lt;pre style="border-bottom-style:none;text-align:left;padding-bottom:0px;line-height:12pt;background-color:white;margin:0em;border-left-style:none;padding-left:0px;width:100%;padding-right:0px;font-family:'Courier New', courier, 
monospace;direction:ltr;border-top-style:none;color:black;border-right-style:none;font-size:8pt;overflow:visible;padding-top:0px;"&gt;&lt;span style="color:#0000ff;"&gt;DELETE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; sys.syscolpars &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; id = 1575689407 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; number = 0&lt;/pre&gt;


    &lt;pre style="border-bottom-style:none;text-align:left;padding-bottom:0px;line-height:12pt;background-color:#f4f4f4;margin:0em;border-left-style:none;padding-left:0px;width:100%;padding-right:0px;font-family:'Courier New', courier, monospace;direction:ltr;border-top-style:none;color:black;border-right-style:none;font-size:8pt;overflow:visible;padding-top:0px;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-bottom-style:none;text-align:left;padding-bottom:0px;line-height:12pt;background-color:white;margin:0em;border-left-style:none;padding-left:0px;width:100%;padding-right:0px;font-family:'Courier New', courier, monospace;direction:ltr;border-top-style:none;color:black;border-right-style:none;font-size:8pt;overflow:visible;padding-top:0px;"&gt;&lt;span style="color:#0000ff;"&gt;DELETE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; sys.sysidxstats &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; id = 1575689407 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; indid = 0&lt;/pre&gt;


    &lt;pre style="border-bottom-style:none;text-align:left;padding-bottom:0px;line-height:12pt;background-color:#f4f4f4;margin:0em;border-left-style:none;padding-left:0px;width:106.08%;padding-right:0px;font-family:'Courier New', courier, monospace;direction:ltr;border-top-style:none;height:20px;color:black;border-right-style:none;font-size:8pt;overflow:visible;padding-top:0px;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
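
&lt;p&gt;Before deleting from internal tables like this, it can be worth checking exactly which rows the predicates match. The statements below are only a sketch and not part of the original fix: they reuse the table names, column names and object id from the DELETE statements above, and they assume the same DAC session, since the internal tables are not visible over a normal connection. Going by the three sys.columns rows and one sys.indexes row mentioned earlier, the two counts would be expected to be 3 and 1.&lt;/p&gt;

&lt;pre&gt;-- Sketch only: preview what the two DELETE statements would remove.
-- Table and column names are taken from the DELETE statements in the post.
SELECT COUNT(*) AS orphaned_column_rows FROM sys.syscolpars WHERE id = 1575689407 AND number = 0;
SELECT COUNT(*) AS orphaned_index_rows FROM sys.sysidxstats WHERE id = 1575689407 AND indid = 0;
-- OBJECT_NAME returns NULL when no object with this id exists in the current database
SELECT OBJECT_NAME(1575689407) AS owning_object;
GO&lt;/pre&gt;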

&lt;p&gt;Now stop the server, remove all trace flags, start the server normally and verify that DBCC CHECKDB reports no errors. Bingo! And… a well-deserved glass of my own home-brewed Scottish Ale! &lt;/p&gt;
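
&lt;p&gt;For that final check, one way to make a clean result literally return nothing is to suppress the informational messages. A minimal sketch, in which the database name AccountingDb is a placeholder rather than the real one:&lt;/p&gt;

&lt;pre&gt;-- AccountingDb is a hypothetical name; substitute the affected database.
-- NO_INFOMSGS hides informational output, so a healthy database produces no messages.
DBCC CHECKDB (N'AccountingDb') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO&lt;/pre&gt;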

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/michael_zilberstein/image_13D9F506.png"&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/michael_zilberstein/image_thumb_12190365.png" width="128" height="313" /&gt;&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Execution Plan Analysis: The Mystery Work Table</title><link>http://www2.sqlblog.com/blogs/paul_white/archive/2013/03/07/execution-plan-analysis-the-mystery-work-table.aspx</link><pubDate>Thu, 07 Mar 2013 19:42:04 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48117</guid><dc:creator>Paul White</dc:creator><description>&lt;p align="left"&gt;&lt;a title="SQL Intersection" href="http://www.sqlintersection.com/" target="_blank"&gt;&lt;img title="Ill_Be_There4" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;float:right;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Ill_Be_There4" align="right" src="http://sqlblog.com/blogs/paul_white/Ill_Be_There4_2CF1D80C.jpg" width="140" height="110" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I love SQL Server execution plans. It is often &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;easy to spot the cause of a performance problem just by looking at one. The task is considerably easier if the plan includes run-time information (a so-called ‘actual’ execution plan), but even a compiled plan can be very useful. &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;Nevertheless, there are still times where the execution plan does not tell the whole story, and we need to think more deeply about query execution to really understand a performance problem. 
This post looks at one such example, based on a recent &lt;a href="https://answers.sqlperformance.com/questions/392/there-are-2-identical-worksets-in-question-this-is.html" target="_blank"&gt;question&lt;/a&gt; posted on the &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;SQL Performance Q &amp;amp; A site.&lt;/font&gt;&lt;/p&gt;  &lt;h2&gt;The Execution Plan&lt;/h2&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_116CD609.png" target="_blank"&gt;&lt;img title="Original Query Plan" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Original Query Plan" src="http://sqlblog.com/blogs/paul_white/image_thumb_23D526BE.png" width="660" height="212" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;This plan is reasonably large (20MB cached plan size) but not massively complex once you break it down (click on the image above to view it full-size in a new window). The context of the question is that this query usually executes in less than a minute, but sometimes it runs for nearly twenty minutes – though the plan appears identical.&lt;/font&gt;&lt;/p&gt;  &lt;h3&gt;High-Cost Operators&lt;/h3&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;There are many different things to look for in execution plans. What you choose to look at first is as much a matter of personal preference as anything, but many people &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;are drawn to &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;high-cost operators, so I will start there. In this plan, the cost of one operator dominates all others, shown as being responsible for &lt;strong&gt;100% of the cost of the query&lt;/strong&gt;. 
It is highlighted in red in Plan Explorer; I have expanded the relevant plan section (the top right) below:&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_20738F16.png" target="_blank"&gt;&lt;img title="100% operator cost" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="100% operator cost" src="http://sqlblog.com/blogs/paul_white/image_thumb_10844A52.png" width="549" height="268" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="3" face="Calibri"&gt;&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;There is no doubt that this seek is a busy little thing. It is executed &lt;strong&gt;249,484 times&lt;/strong&gt;, though it only produces a grand total of 167,813 rows over all iterations of the loop join – an average of just 0.7 rows per seek. There are all sorts of interesting details in the plan about this seek – I could write a whole blog post about it – but &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;two details that stand out are the “&lt;em&gt;Force Seek: True&lt;/em&gt;” and “&lt;em&gt;Partitioned: True&lt;/em&gt;” attributes. &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;These tell us that the base table is partitioned, and the query writer had to use a FORCESEEK table hint to get this plan.&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Without this hint, the optimizer would almost certainly choose &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;a Hash Match or Merge Join rather than Nested Loops. This is understandable given the optimizer’s cost model and the simplifying assumptions it makes (such as assuming every query starts with a cold buffer cache). 
&lt;/font&gt;&lt;font size="3" face="Calibri"&gt;That’s fine, but we can see from the query plan that &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;the inner-side table has &lt;strong&gt;643 million rows&lt;/strong&gt;. Left to its own devices, the optimizer would estimate that it would be faster to perform a sequential scan of 643 million rows (with large-block read-ahead) than it would be to run a quarter-million randomly-distributed seeks driven by a Nested Loops join.&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I doubt that the optimizer’s reasoning here is sound (at least on any reasonably modern hardware) but there we go. The query author probably knows that a good fraction of this table is likely to be in cache, so &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;with all that in mind, I think we can reasonably assume at this stage that the FORCESEEK hint is genuinely needed here, and this part of the plan is at least reasonably optimal.&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Important note: The seek certainly does not account for 100% of the runtime cost of this query. &lt;strong&gt;Remember cost percentages are always estimates – even in ‘actual’ plans&lt;/strong&gt;. 
It can be useful to check the reasons for high-estimated-cost operators, but they should never be used as a primary tuning metric.&lt;/font&gt;&lt;/p&gt;  &lt;h3&gt;Execution Warnings&lt;/h3&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_353561C9.png"&gt;&lt;img title="Sort Warning" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Sort Warning" src="http://sqlblog.com/blogs/paul_white/image_thumb_0597433D.png" width="310" height="173" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;This query was executed on SQL Server 2012, so there is a handy warning triangle on the Sort operator indicating that one or more sort runs had to be spilled to physical &lt;em&gt;tempdb&lt;/em&gt; disk. The plan clearly shows this spilling is a result of an inaccurate cardinality estimate at the Filter operator (the estimates are not bad at all prior to this). The Sort expects &lt;strong&gt;9,217 rows&lt;/strong&gt; totalling approximately 5MB, but actually encountered &lt;strong&gt;61,846 rows&lt;/strong&gt; in 35MB. As you may know, memory for sorts and hashes is allocated before execution starts, and generally cannot expand dynamically at run time.&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The spilled sort is undesirable, of course, but it is unlikely to be a major cause of the occasional poor performance given the small size of the spilled data. Nevertheless, this might be a good place to split this query up. The idea would be to write the results of the query (up to and including the Filter) to a temporary heap table using SELECT INTO, and then create a clustered index with the same keys as the Sort operator. 
The temporary table would not be large, and may well perform better overall than the spilled sort, including the cost of creating the clustered index. Of course, creating this index will involve a sort, but it will be one based on the known cardinality of the temporary heap table. The part of the plan that could be replaced by a temporary table is shown below:&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_31D3CA21.png"&gt;&lt;img title="Plan subtree replaced with a temp table" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Plan subtree replaced with a temp table" src="http://sqlblog.com/blogs/paul_white/image_thumb_40BAF93B.png" width="640" height="298" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I am a big fan of simplifications like this. Smaller execution plans tend to optimize better for all sorts of reasons, and the source code usually becomes easier to maintain as well. I should mention t&lt;/font&gt;&lt;font size="3" face="Calibri"&gt;here is another warning triangle in the 2012 execution plan (shown on the root icon), which relates to some implicit conversions that I will mention later.&lt;/font&gt;&lt;/p&gt;  &lt;h3 align="left"&gt;I/O Information&lt;/h3&gt;  &lt;p&gt;&lt;font size="3" face="Calibri"&gt;The execution plan was captured with Plan Explorer, so we can also easily see I/O statistics for the two executions. 
The first is for a fast (sub-60-second) run:&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_3C80FBA9.png"&gt;&lt;img title="I/O data - fast" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="I/O data - fast" src="http://sqlblog.com/blogs/paul_white/image_thumb_41833958.png" width="664" height="260" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Overall, these I/O numbers show pretty much what we would expect: a decent number of logical reads associated with the seeks into the Trans table (but certainly not 100% of the total, ha ha), a very small number of physical reads, and a small amount of read-ahead activity on a couple of tables.&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="3" face="Calibri"&gt;The second set of I/O data is from a slow run (18 minutes or so):&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_3846FE17.png"&gt;&lt;img title="I/O data - slow" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="I/O data - slow" src="http://sqlblog.com/blogs/paul_white/image_thumb_2F0AC2D6.png" width="650" height="278" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The very obvious difference is the appearance of a work table, with &lt;strong&gt;178 million logical reads &lt;/strong&gt;and &lt;strong&gt;130 million LOB logical reads&lt;/strong&gt;. It seems very likely this work table, and its &lt;strong&gt;300 million logical reads&lt;/strong&gt;, is responsible for the dramatic decrease in query performance. 
But given that the execution plans are identical (right down to the XML), what is causing this?&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;My answer to that question (on the Q &amp;amp; A site) was that it is related to the increased level of read-ahead activity, but to see why that is the case, we will need to reproduce the issue and dig a bit deeper.&lt;/font&gt;&lt;/p&gt;  &lt;h2 align="left"&gt;Execution Outline&lt;/h2&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Before we really get going on this, it will be useful to take a look at what the execution plan is doing in outline. We saw the first part of the plan earlier when looking at the spilling sort. The data set at that point (which we would like to write to a temporary table, remember) essentially represents source data for a second query, which uses a series of Nested Loops Left Joins to look up information from other tables:&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_68AD5CC0.png"&gt;&lt;img title="Nested Loop Lookups" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Nested Loop Lookups" src="http://sqlblog.com/blogs/paul_white/image_thumb_4A7F9F0C.png" width="659" height="98" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The inner side of each join involves some reasonably involved logic, which is thankfully not important to the present discussion. What is important is that the result of each lookup is a LOB data type. 
This begins to shed some light on the LOB logical reads reported against the work table, but it does not explain why the work table (and the 300 million associated reads) does not appear when the query runs quickly (with the same execution plan).&lt;/font&gt;&lt;/p&gt;  &lt;h2 align="left"&gt;Reproducing the Problem&lt;/h2&gt;  &lt;h3 align="left"&gt;Table Creation&lt;/h3&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The first part of the repro involves creating six tables that represent the lookup tables in the original query plan. Each table will have 10,000 rows, consisting of a sequential reference number and a second column containing a 2,048-character single-byte string. The source table used to drive the lookups will be a regular Numbers table containing just a single integer column.&lt;/font&gt;&lt;/p&gt;  &lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;   &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;     &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, 
monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T1 (id &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;IDENTITY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;, d &lt;span style="color:#0000ff;"&gt;char&lt;/span&gt;(2048));&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T2 (id &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;IDENTITY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;, d &lt;span style="color:#0000ff;"&gt;char&lt;/span&gt;(2048));&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T3 (id &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;IDENTITY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;, d &lt;span style="color:#0000ff;"&gt;char&lt;/span&gt;(2048));&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T4 (id &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;IDENTITY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;, d &lt;span style="color:#0000ff;"&gt;char&lt;/span&gt;(2048));&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T5 (id &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;IDENTITY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;, d &lt;span style="color:#0000ff;"&gt;char&lt;/span&gt;(2048));&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T6 (id &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;IDENTITY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;, d &lt;span style="color:#0000ff;"&gt;char&lt;/span&gt;(2048));&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;INSERT dbo.T1 &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; REPLICATE(&lt;span style="color:#006080;"&gt;'A'&lt;/span&gt;, 2048)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;INSERT dbo.T2 &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; REPLICATE(&lt;span style="color:#006080;"&gt;'B'&lt;/span&gt;, 2048)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;INSERT dbo.T3 &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; REPLICATE(&lt;span style="color:#006080;"&gt;'C'&lt;/span&gt;, 2048)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;INSERT dbo.T4 &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; REPLICATE(&lt;span style="color:#006080;"&gt;'D'&lt;/span&gt;, 2048)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;INSERT dbo.T5 &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; REPLICATE(&lt;span style="color:#006080;"&gt;'E'&lt;/span&gt;, 2048)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;INSERT dbo.T6 &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; REPLICATE(&lt;span style="color:#006080;"&gt;'F'&lt;/span&gt;, 2048)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
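
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The population statements above assume that dbo.Numbers already exists. If it does not, a minimal sketch along the following lines (run before the INSERT statements) could create one. The column name n is taken from the repro scripts; the use of sys.all_columns as a row source and the 10,000-row size are assumptions made for illustration, not part of the original setup:&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#008000;"&gt;-- Assumption: a single integer column named n, as referenced by the repro scripts&lt;/span&gt;
&lt;span style="color:#0000ff;"&gt;CREATE&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.Numbers (n &lt;span style="color:#0000ff;"&gt;integer&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;NOT&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;NULL&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;PRIMARY&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;KEY&lt;/span&gt;);
&lt;span style="color:#0000ff;"&gt;GO&lt;/span&gt;
&lt;span style="color:#008000;"&gt;-- Number the rows of a cross join of a system view, keeping 1 to 10,000&lt;/span&gt;
INSERT dbo.Numbers &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (TABLOCKX) (n)
&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; x.rn
&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt;
(
    &lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; ROW_NUMBER() &lt;span style="color:#0000ff;"&gt;OVER&lt;/span&gt; (&lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;BY&lt;/span&gt; (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;NULL&lt;/span&gt;)) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; rn
    &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; sys.all_columns &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; c1
    &lt;span style="color:#0000ff;"&gt;CROSS&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;JOIN&lt;/span&gt; sys.all_columns &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; c2
) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; x
&lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; x.rn &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Any existing Numbers table with an integer column n covering at least 1 to 10,000 should serve equally well for loading the lookup tables.&lt;/font&gt;&lt;/p&gt;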

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The next step is to ensure that each lookup table is optimally organized for read-ahead:&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;ALTER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T1 REBUILD &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (MAXDOP = 1);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;ALTER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T2 REBUILD &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (MAXDOP = 1);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;ALTER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T3 REBUILD &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (MAXDOP = 1);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;ALTER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T4 REBUILD &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (MAXDOP = 1);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;ALTER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T5 REBUILD &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (MAXDOP = 1);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;ALTER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;TABLE&lt;/span&gt; dbo.T6 REBUILD &lt;span style="color:#0000ff;"&gt;WITH&lt;/span&gt; (MAXDOP = 1);&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
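
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;One way to check the result of the rebuilds is to query sys.dm_db_index_physical_stats for the six tables. The following is only a sketch: the choice of LIMITED scan mode and the filter on the table names used in this repro are assumptions, and passing NULL for the object scans every object in the current database, which is reasonable only on a small test database:&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#008000;"&gt;-- Sketch: leaf-level fragmentation for the six lookup tables&lt;/span&gt;
&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt;
    OBJECT_NAME(ips.[object_id]) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; table_name,
    ips.index_id,
    ips.page_count,
    ips.fragment_count,
    ips.avg_fragmentation_in_percent
&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; sys.dm_db_index_physical_stats(DB_ID(), &lt;span style="color:#0000ff;"&gt;NULL&lt;/span&gt;, &lt;span style="color:#0000ff;"&gt;NULL&lt;/span&gt;, &lt;span style="color:#0000ff;"&gt;NULL&lt;/span&gt;, &lt;span style="color:#006080;"&gt;'LIMITED'&lt;/span&gt;) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; ips
&lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; ips.[object_id] &lt;span style="color:#0000ff;"&gt;IN&lt;/span&gt;
(
    OBJECT_ID(N&lt;span style="color:#006080;"&gt;'dbo.T1'&lt;/span&gt;), OBJECT_ID(N&lt;span style="color:#006080;"&gt;'dbo.T2'&lt;/span&gt;), OBJECT_ID(N&lt;span style="color:#006080;"&gt;'dbo.T3'&lt;/span&gt;),
    OBJECT_ID(N&lt;span style="color:#006080;"&gt;'dbo.T4'&lt;/span&gt;), OBJECT_ID(N&lt;span style="color:#006080;"&gt;'dbo.T5'&lt;/span&gt;), OBJECT_ID(N&lt;span style="color:#006080;"&gt;'dbo.T6'&lt;/span&gt;)
);&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;A low avg_fragmentation_in_percent and a small fragment_count relative to page_count would suggest that the pages of each clustered index are largely contiguous, which is the layout that should suit read-ahead.&lt;/font&gt;&lt;/p&gt;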

&lt;h3 align="left"&gt;Test Query&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The original query translates into our simplified test rig as:&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;DECLARE&lt;/span&gt; @d nvarchar(&lt;span style="color:#0000ff;"&gt;max&lt;/span&gt;) = &lt;span style="color:#0000ff;"&gt;NCHAR&lt;/span&gt;(10000);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&amp;#160;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; n.n,&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;    DATALENGTH&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;    (&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;        CONCAT&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;        (&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;            (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; CONCAT(t.d, t.d, t.d, t.d, t.d, t.d, @d) &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T1 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;            (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; CONCAT(t.d, t.d, t.d, t.d, t.d, t.d, @d) &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T2 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;            (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; CONCAT(t.d, t.d, t.d, t.d, t.d, t.d, @d) &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T3 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;            (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; CONCAT(t.d, t.d, t.d, t.d, t.d, t.d, @d) &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T4 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;            (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; CONCAT(t.d, t.d, t.d, t.d, t.d, t.d, @d) &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T5 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;            (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; CONCAT(t.d, t.d, t.d, t.d, t.d, t.d, @d) &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T6 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;        )&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;    )&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n.n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;BY&lt;/span&gt; n.n&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;OPTION&lt;/span&gt; (LOOP &lt;span style="color:#0000ff;"&gt;JOIN&lt;/span&gt;, FORCE &lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt;, MAXDOP 1);&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The broad idea is to concatenate our 2048-character column to itself five times (six copies in all) and add a Unicode character, which the original query used as a delimiter because it could not appear in the source data. Each lookup performs the same basic operation against its target table, and the final output is the concatenation of all the intermediate results. The query hints are necessary to get the right plan shape, because my test rig tables are so much smaller than the real ones.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Note that the Unicode delimiter means the 2048-character single-byte data is implicitly converted to Unicode, doubling in size. It is not a crucial feature of the test, but it did appear in the original query and explains the type conversion warnings in the execution plan I mentioned earlier. The execution plan for the test query is (click to enlarge if necessary):&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_687DAD00.png" target="_blank"&gt;&lt;img title="Test query execution plan" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Test query execution plan" src="http://sqlblog.com/blogs/paul_white/image_thumb_50B64FCD.png" width="644" height="233" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I should also stress that the CONCAT function (new in SQL Server 2012) is not crucial either. If you are using an earlier version of SQL Server, an equivalent query (for present purposes) is shown below. I’m going to stick with CONCAT for the remainder of the post, however.&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;DECLARE&lt;/span&gt; @d nvarchar(&lt;span style="color:#0000ff;"&gt;max&lt;/span&gt;) = &lt;span style="color:#0000ff;"&gt;NCHAR&lt;/span&gt;(10000);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&amp;#160;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; n.n,&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;    DATALENGTH&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;    (&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;        (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; @d+t.d+t.d+t.d+t.d+t.d+t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T1 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) +&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;        (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; @d+t.d+t.d+t.d+t.d+t.d+t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T2 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) +&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;        (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; @d+t.d+t.d+t.d+t.d+t.d+t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T3 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) +&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;        (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; @d+t.d+t.d+t.d+t.d+t.d+t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T4 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) +&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;        (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; @d+t.d+t.d+t.d+t.d+t.d+t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T5 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) +&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;        (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; @d+t.d+t.d+t.d+t.d+t.d+t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T6 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;    )&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n.n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;BY&lt;/span&gt; n.n&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;OPTION&lt;/span&gt; (LOOP &lt;span style="color:#0000ff;"&gt;JOIN&lt;/span&gt;, FORCE &lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt;, MAXDOP 1);&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;h3 align="left"&gt;Warm cache results&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;With all data in memory, the test query (in either form) completes in about &lt;strong&gt;1.6 seconds&lt;/strong&gt; on my laptop. The result shows that each output row contains 147,468 bytes of Unicode character data. A typical set of I/O statistics follows:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_5B933115.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_25420906.png" width="534" height="164" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Nothing too exciting to see there, but this is just our baseline.&lt;/font&gt;&lt;/p&gt;
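
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;As a rough sanity check on the 147,468-byte figure, here is a minimal sketch. It assumes every lookup finds a matching row and that every d value is a full 2048 single-byte characters. The variable @s is a hypothetical stand-in for t.d and is not part of the test rig.&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;-- Same delimiter as the test query: one Unicode character
DECLARE @d nvarchar(max) = NCHAR(10000);

-- Hypothetical stand-in for one 2048-character t.d value
DECLARE @s varchar(2048) = REPLICATE('x', 2048);

SELECT
    -- One lookup: six copies of the column plus the delimiter, as Unicode
    bytes_per_lookup = DATALENGTH(CONCAT(@s, @s, @s, @s, @s, @s, @d)),
    -- Six lookups (T1 to T6) are concatenated for each output row
    bytes_per_row = 6 * DATALENGTH(CONCAT(@s, @s, @s, @s, @s, @s, @d));&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Each lookup contributes 6 × 2048 + 1 = 12,289 characters, which is 24,578 bytes once converted to Unicode. Six lookups per row give 147,468 bytes.&lt;/font&gt;&lt;/p&gt;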

&lt;h3 align="left"&gt;Cold cache results&lt;/h3&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;CHECKPOINT&lt;/span&gt;;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;DBCC&lt;/span&gt; DROPCLEANBUFFERS;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;With no data in memory, the test query now runs for &lt;strong&gt;18.6 seconds&lt;/strong&gt; – almost &lt;strong&gt;12x slower&lt;/strong&gt;. The I/O statistics show the expected (but still mysterious!) work table and its associated reads:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_23913D32.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_5680CD99.png" width="646" height="182" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The Extended Events wait statistics show SQL Server spent very little of that time waiting on my laptop’s slow hard drive – just 402 ms:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_54D001C5.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_4C6C2C6E.png" width="292" height="131" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2 align="left"&gt;Explanation&lt;/h2&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;There are a number of factors in play here that we will look at in turn.&lt;/font&gt;&lt;/p&gt;

&lt;h3 align="left"&gt;Nested Loops Prefetching&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;One of the reasons the optimizer prefers Hash Match and Merge Join for larger inputs is that the data access patterns tend to favour large sequential read-ahead. Both hash and merge tend to scan (range-scan in the case of a seek) their inputs, and the SQL Server Storage Engine automatically issues read-ahead when it detects this type of access. There is nothing in the execution plan to show that a base table will be read with read-ahead; it just happens.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;A very basic implementation of Nested Loops join would not benefit from read-ahead at all on its inner side. The outer (driving) side of the loops join might well be a scan or range-scan of an index, and so benefit from automatic read-ahead, of course. The inner side is executed once per outer row, resulting in a rapid succession of small index seeks for different values. These small seeks will typically not be large enough to trigger the automatic read-ahead mechanism. Indeed, in our test, each inner side seek is for precisely one value.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;SQL Server improves on this by implementing a second read-ahead mechanism especially for Nested Loops joins (not all Nested Loops joins; it is a cost-based decision the optimizer makes). The basic idea is to buffer extra rows from the outer side of the join, and to use the row values in the buffer to drive read-ahead for the inner side. The effect is that the Nested Loops join becomes a partly blocking operator as outer-side rows are read into the buffer and read-ahead is issued based on buffered index key values.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;This read-ahead may be either order-preserving or not, and is indicated in the execution plan by the Nested Loop attributes &lt;em&gt;With Ordered Prefetch&lt;/em&gt; and &lt;em&gt;With Unordered Prefetch,&lt;/em&gt; respectively. When unordered prefetch occurs, the inner side is processed in whatever order the asynchronous reads happen to complete. With ordered prefetching, the mechanism is careful to ensure that the order of rows entering the join is preserved on the output.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;In the test rig, the ORDER BY clause means there is a need to preserve row order, so &lt;em&gt;Ordered Prefetch&lt;/em&gt; is used:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_58F9D98A.png" target="_blank"&gt;&lt;img title="Ordered Prefetch" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Ordered Prefetch" src="http://sqlblog.com/blogs/paul_white/image_thumb_612DFF21.png" width="644" height="379" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The issue described in this post is &lt;strong&gt;not specific to ordered prefetching&lt;/strong&gt; – the same behaviour is just as likely with unordered prefetching. The point is that Nested Loops prefetching is one of the requirements.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;&lt;a href="http://support.microsoft.com/kb/920093" target="_blank"&gt;Documented&lt;/a&gt; trace flags 652 and 8744 may be used (with care, and after serious testing) to disable automatic read-ahead and Nested Loops prefetching respectively. This is sometimes beneficial where all data is expected to be in memory (in which case read-ahead processing consumes resources better used by query execution) or where the I/O subsystem is &lt;em&gt;extremely&lt;/em&gt; fast. In case you were wondering, there is no background thread for prefetching – all the work of checking whether the data is in memory, and issuing I/O if not, is performed by the worker thread executing the query.&lt;/font&gt;&lt;/p&gt;
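
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;For experimenting on a test instance, a minimal sketch of switching off Nested Loops prefetching for the current session follows. It assumes trace flag 8744 is honoured at session scope on your build, and that the plan is compiled while the flag is on. Neither assumption is verified here, so check the resulting plan for the prefetch attributes afterwards.&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;-- Test systems only: turn on trace flag 8744 for this session
-- (assumed here to disable Nested Loops prefetching at session scope)
DBCC TRACEON (8744);

-- Run the test query here. Adding RECOMPILE to its OPTION clause may be
-- needed so that a previously cached plan is not reused (assumption).

-- Restore the default behaviour for this session
DBCC TRACEOFF (8744);&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;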

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I should stress that read-ahead and Nested Loops prefetching are generally A Very Good Thing with typical storage solutions (e.g. SANs), and both work best (or at all) when indexes have low logical fragmentation.&lt;/font&gt;&lt;/p&gt;

&lt;h3 align="left"&gt;Manufactured LOBs&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The issue described here also requires that a large object data type is manufactured before prefetching. The Compute Scalar operators in the test execution plan perform that function:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_10BBF7E1.png"&gt;&lt;img title="Manufactured LOB" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Manufactured LOB" src="http://sqlblog.com/blogs/paul_white/image_thumb_00CCB31D.png" width="599" height="314" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;By ‘manufactured’, I mean that the source columns are not LOB types, but the expression output is – notice the implicit conversion to nvarchar(max). To be clear about it, the issue we are analysing here does &lt;strong&gt;not&lt;/strong&gt; occur when Nested Loops prefetching occurs with an expression that was a LOB type to begin with.&lt;/font&gt;&lt;/p&gt;

&lt;h3 align="left"&gt;The Outer Join&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The optimizer is quite good, generally speaking, at moving scalar expressions around. If the query had featured inner joins (whether by query design or through optimizer activities) the chances are quite good that the problematic expressions (the LOB manufacturers) would have moved beyond the prefetching, and so out of harm’s way. It is quite tricky to preserve NULL-extension and other outer-join semantics properly when moving expressions above an outer join, so the optimizer generally does not even try. In essence, the outer join represents an optimization barrier to the LOB-manufacturing expressions.&lt;/font&gt;&lt;/p&gt;

&lt;h3 align="left"&gt;Memory Allocation&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;When Nested Loops prefetching occurs with a manufactured LOB, the question arises of where to store the created LOBs when buffering rows for prefetch. If the source data were already a LOB type, the execution engine would already have memory structures in place to handle them. When prefetching encounters a manufactured LOB, it needs to store it somewhere, since the engine is no longer processing a stream of one row at a time. It turns out that there is a small memory buffer set aside for this eventuality, which empirical tests show to be 24KB.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;However, this 24KB (directly allocated, not via workspace memory grant) is shared across all concurrently executing prefetching joins in the query. With six such joins in the test rig plan and large manufactured LOBs, the buffer stands no chance. As a result, query execution engages a &lt;strong&gt;bail-out option&lt;/strong&gt;: a work table created in tempdb. Though the pages of the worktable may in fact remain memory-resident, overheads (including latching and using general-purpose code interfaces for access to the buffered rows) mean this is very much slower than using the direct-memory cache.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;As with most internal work tables, the logical reads reported on this work table indicate the number of &lt;strong&gt;rows&lt;/strong&gt; processed (not 8KB pages, as for regular I/O statistics). This fact, together with the large number of items processed via the worktable in our test, accounts for the millions of reads reported.&lt;/font&gt;&lt;/p&gt;
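
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;One common way to see these work table reads for yourself is SET STATISTICS IO. The sketch below is an assumption about how such figures can be gathered, not a record of how the screenshots in this post were produced. Run it only on a test instance, because it empties the buffer pool.&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;-- Start from a cold cache, as in the cold cache test
CHECKPOINT;              -- write dirty pages to disk
DBCC DROPCLEANBUFFERS;   -- remove clean pages from the buffer pool

-- Report per-table reads and elapsed time for each statement
SET STATISTICS IO, TIME ON;

-- Run the test query here, then look in the messages output for the
-- line reporting logical reads against 'Worktable'.

SET STATISTICS IO, TIME OFF;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;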

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The creation and use of the work table depends on run time conditions and timing. If execution finds the data it needs is already in memory, the prefetch checks are still performed, but no asynchronous read requests end up being posted. The 24KB buffer is never filled, so the need to create a work table never arises. The more prefetch that &lt;em&gt;actually occurs&lt;/em&gt;, the higher the chances that the buffer will fill. It is quite possible to experience a low level of prefetch with manufactured LOBs without the engine needing to bail out to a work table, especially if the LOBs are not very big and the I/O system is quite fast.&lt;/font&gt;&lt;/p&gt;

&lt;h2 align="left"&gt;Workaround&lt;/h2&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;We can rewrite the query to avoid feeding manufactured LOB data to the prefetch buffer. The idea is to use OUTER APPLY to return the data that &lt;em&gt;contributes&lt;/em&gt; to the concatenation, rather than the &lt;em&gt;result&lt;/em&gt; of the concatenation. We can then perform the &lt;a href="http://msdn.microsoft.com/en-us/library/hh231515.aspx" target="_blank"&gt;CONCAT&lt;/a&gt; operation (which handles NULLs nicely without extra work) after the join, avoiding the prefetch buffer issue completely. In SQL Server versions prior to 2012, we would need to use direct string concatenation, and handle rows that are NULL-extended explicitly using ISNULL or COALESCE.&lt;/font&gt;&lt;/p&gt;

&lt;div id="codeSnippetWrapper" style="overflow:auto;cursor:text;font-size:8pt;border-top:silver 1px solid;font-family:'Courier New', courier, monospace;border-right:silver 1px solid;border-bottom:silver 1px solid;padding-bottom:4px;direction:ltr;text-align:left;padding-top:4px;padding-left:4px;margin:20px 0px 10px;border-left:silver 1px solid;line-height:12pt;padding-right:4px;max-height:200px;width:97.5%;background-color:#f4f4f4;"&gt;
  &lt;div id="codeSnippet" style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;
    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;DECLARE&lt;/span&gt; @d nvarchar(&lt;span style="color:#0000ff;"&gt;max&lt;/span&gt;) = &lt;span style="color:#0000ff;"&gt;NCHAR&lt;/span&gt;(10000);&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&amp;#160;&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; &lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;    n.n,&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;    DATALENGTH&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;    (&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;        CONCAT&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;        (&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;            CONCAT(oa1.i0, oa1.i1, oa1.i2, oa1.i3, oa1.i4, oa1.i5, oa1.i6),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;            CONCAT(oa2.i0, oa2.i1, oa2.i2, oa2.i3, oa2.i4, oa2.i5, oa2.i6),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;            CONCAT(oa3.i0, oa3.i1, oa3.i2, oa3.i3, oa3.i4, oa3.i5, oa3.i6),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;            CONCAT(oa4.i0, oa4.i1, oa4.i2, oa4.i3, oa4.i4, oa4.i5, oa4.i6),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;            CONCAT(oa5.i0, oa5.i1, oa5.i2, oa5.i3, oa5.i4, oa5.i5, oa5.i6),&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;            CONCAT(oa6.i0, oa6.i1, oa6.i2, oa6.i3, oa6.i4, oa6.i5, oa6.i6)&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;        )&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;    )&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.Numbers &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; n&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;OUTER&lt;/span&gt; APPLY (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; i0 = @d, i1 = t.d, i2 = t.d, i3 = t.d, i4 = t.d, i5 = t.d, i6 = t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T1 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; oa1&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;OUTER&lt;/span&gt; APPLY (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; i0 = @d, i1 = t.d, i2 = t.d, i3 = t.d, i4 = t.d, i5 = t.d, i6 = t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T2 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; oa2&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;OUTER&lt;/span&gt; APPLY (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; i0 = @d, i1 = t.d, i2 = t.d, i3 = t.d, i4 = t.d, i5 = t.d, i6 = t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T3 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; oa3&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;OUTER&lt;/span&gt; APPLY (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; i0 = @d, i1 = t.d, i2 = t.d, i3 = t.d, i4 = t.d, i5 = t.d, i6 = t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T4 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; oa4&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;OUTER&lt;/span&gt; APPLY (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; i0 = @d, i1 = t.d, i2 = t.d, i3 = t.d, i4 = t.d, i5 = t.d, i6 = t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T5 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; oa5&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;OUTER&lt;/span&gt; APPLY (&lt;span style="color:#0000ff;"&gt;SELECT&lt;/span&gt; i0 = @d, i1 = t.d, i2 = t.d, i3 = t.d, i4 = t.d, i5 = t.d, i6 = t.d &lt;span style="color:#0000ff;"&gt;FROM&lt;/span&gt; dbo.T6 &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; t &lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; t.id = n.n) &lt;span style="color:#0000ff;"&gt;AS&lt;/span&gt; oa6&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;WHERE&lt;/span&gt; n.n &lt;span style="color:#0000ff;"&gt;BETWEEN&lt;/span&gt; 1 &lt;span style="color:#0000ff;"&gt;AND&lt;/span&gt; 10000&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:white;"&gt;&lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt; &lt;span style="color:#0000ff;"&gt;BY&lt;/span&gt; n.n&lt;/pre&gt;


    &lt;pre style="border-top-style:none;overflow:visible;font-size:8pt;border-left-style:none;font-family:'Courier New', courier, monospace;border-bottom-style:none;color:black;padding-bottom:0px;direction:ltr;text-align:left;padding-top:0px;border-right-style:none;padding-left:0px;margin:0em;line-height:12pt;padding-right:0px;width:100%;background-color:#f4f4f4;"&gt;&lt;span style="color:#0000ff;"&gt;OPTION&lt;/span&gt; (LOOP &lt;span style="color:#0000ff;"&gt;JOIN&lt;/span&gt;, FORCE &lt;span style="color:#0000ff;"&gt;ORDER&lt;/span&gt;, MAXDOP 1);&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The execution plan for the rewritten query looks visually similar to the problematic one:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_2C30D417.png" target="_blank"&gt;&lt;img title="Rewritten query plan" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Rewritten query plan" src="http://sqlblog.com/blogs/paul_white/image_thumb_401A40E0.png" width="644" height="245" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;However, the Compute Scalars no longer manufacture a LOB data type; they just emit column and variable references:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_444418A5.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_3B07DD64.png" width="389" height="388" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;All the concatenation work (and LOB manufacture) is performed by the final top-level Compute Scalar in a single monster expression [Expr1056]:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_31CBA223.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_2E6A0A7B.png" width="697" height="419" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 align="left"&gt;Warm cache results&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;With all data in memory, the new query completes in &lt;strong&gt;1.8 seconds&lt;/strong&gt; (very slightly up on 1.6 seconds before):&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_0B59990B.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_5ED06924.png" width="430" height="165" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3 align="left"&gt;Cold cache results&lt;/h3&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;When all data must be fetched from disk, the query issues optimal prefetching and completes in &lt;strong&gt;7.3 seconds&lt;/strong&gt; (down from 18.6 seconds) with &lt;strong&gt;no work table&lt;/strong&gt;:&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_0F173DDB.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_0D667207.png" width="536" height="165" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The Extended Events wait statistics now show 3.8 seconds spent waiting for my laptop’s slow spinning disk (which is a good thing!).&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_0BB5A633.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" src="http://sqlblog.com/blogs/paul_white/image_thumb_1843534F.png" width="291" height="111" /&gt;&lt;/a&gt;&lt;/p&gt;
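
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;For readers who want to repeat a cold cache test, one common approach (an assumption here, since the exact reset steps are not shown in this part of the post) is to flush dirty pages and then empty the buffer pool before each run. This should only be done on a test instance, because it discards cached data pages for every database on the server:&lt;/font&gt;&lt;/p&gt;

&lt;pre style="font-size:8pt;font-family:'Courier New', courier, monospace;"&gt;-- Sketch for test systems only; not taken from the original test script.
CHECKPOINT;             -- write dirty pages of the current database to disk
DBCC DROPCLEANBUFFERS;  -- remove all clean pages from the buffer pool&lt;/pre&gt;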

&lt;h2 align="left"&gt;Final Thoughts&lt;/h2&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Work tables can appear in STATISTICS IO output for a wide range of reasons, but if you encounter one with a very large number of reads – particularly LOB reads – you may be encountering this issue. The rewrite proposed above may not always be possible, but you should be able to refactor your query to avoid the issue now that you know it exists.&lt;/font&gt;&lt;/p&gt;
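
&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;As a minimal sketch of how to check for this (assuming you can run the suspect statement in a test session; the comment in the middle is a placeholder, not part of the original tests), enable STATISTICS IO around the query and look in the Messages output for a 'Worktable' entry whose lob logical reads are far higher than those reported for the base tables:&lt;/font&gt;&lt;/p&gt;

&lt;pre style="font-size:8pt;font-family:'Courier New', courier, monospace;"&gt;SET STATISTICS IO ON;

-- Placeholder: run the statement you are investigating here.

SET STATISTICS IO OFF;&lt;/pre&gt;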

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I am not a fan of doing large amounts of string manipulation in SQL Server. I am always particularly suspicious of the perceived need to split or concatenate large volumes of strings.&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I am, however, a fan of always using explicit data types (rather than relying on implicit conversions) and generating relatively small query plans that offer the query optimizer clear and obvious choices. By necessity, this often means writing small SQL queries in logical steps (and no, long chains of common table expressions do not count!)&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;The real world does not always make these things possible, of course, but it is good to have goals :)&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;© 2013 Paul White – All Rights Reserved 
    &lt;br /&gt;email: &lt;a href="mailto:SQLkiwi@gmail.com"&gt;SQLkiwi@gmail.com&lt;/a&gt; 

    &lt;br /&gt;twitter: &lt;a href="http://twitter.com/SQL_Kiwi"&gt;@SQL_Kiwi&lt;/a&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p align="left"&gt;Screenshots acquired using &lt;a href="http://www.techsmith.com/snagit-gslp.html" target="_blank"&gt;SnagIt by TechSmith&lt;/a&gt;

  &lt;br /&gt;Query plan details obtained using &lt;a href="http://www.sqlsentry.net/plan-explorer/sql-server-query-view.asp#features" target="_blank"&gt;Plan Explorer PRO by SQLSentry&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Public Release, SQL Server File Layout Viewer</title><link>http://www2.sqlblog.com/blogs/merrill_aldrich/archive/2013/03/01/public-release-sql-server-file-layout-viewer.aspx</link><pubDate>Fri, 01 Mar 2013 21:36:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47991</guid><dc:creator>merrillaldrich</dc:creator><description>&lt;h2&gt;Version 1.0 is Now Available!&lt;/h2&gt;  &lt;p&gt;I’ve been working off and on, as my real job permits, on this visualization tool for SQL Server data files. This is an educational or exploratory tool where you can more readily &lt;i&gt;see&lt;/i&gt; how the individual data pages in MDF/NDF files are organized, where your tables and indexes live, what effect operations like index rebuild or index reorganize have on the physical layout of the data pages.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewerR1_6399E49C.png"&gt;&lt;img title="FileLayoutViewerR1" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="FileLayoutViewerR1" width="1028" height="494" src="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewerR1_thumb_228B6538.png"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;The viewer will scan a whole database, using only SQL and DBCC commands, and will render a color-coded representation of all the data pages represented in colored bands. Each partition of each index or heap in the database is assigned a color, so that you can see where all the bits and pieces of an object are located in the files. 
Above the colored bands there are grayscale or white pixels that show the page type in SQL Server (most are white, which are data pages. Unused/empty regions of the file show as gray). In the image above, for example, all the bright green areas are one index, all the purple areas are one index, and so on.&lt;/p&gt;  &lt;p&gt;There is mouse-over functionality. If you move the mouse cursor over the graph, then details about each page populate the text fields at right, including the object and index the page belongs to, the page type, whether the page represents a fragment, where the previous and next pages are for the same object, etc.&lt;/p&gt;  &lt;h2&gt;Why?&lt;/h2&gt;  &lt;p&gt;Why create something like this? I am a visual person, and I have a theory that many issues we have in computing come down to not being able to see what’s going on. This is especially true as we learn about unfamiliar technology – we have to develop a mental model of structures like B-trees or linked lists or files in order to understand what’s happening. I hope this tool, combined with other knowledge, will help people form an accurate understanding of how data file internals work in SQL Server, faster than working purely in the abstract with tools like DBCC Page or DBCC Ind.&lt;/p&gt;  &lt;h2&gt;Instructions&lt;/h2&gt;  &lt;ol&gt;   &lt;li&gt;Download the tool and unzip it. The package includes both an executable and the source code. If you don’t want the source, the .exe file is a standalone program and will run all on its own, so you are welcome to discard the source folder.&lt;/li&gt;    &lt;li&gt;Validate you have the required prerequisites from the Prereq’s section below.&lt;/li&gt;    &lt;li&gt;Locate a non-production/test database to analyze. The database can be local or on a remote server. I suggest something of a reasonable size, because scanning a really huge data set can take quite a long time.&lt;/li&gt;    &lt;li&gt;Run SQLFileLayoutViewer.exe and select a database to scan. 
If the database is on a remote server, type the SQL Server name/instance name into the dialog.&lt;/li&gt;    &lt;li&gt;Click Analyze.&lt;/li&gt;    &lt;li&gt;Examine the resulting graph, and mouse over it with the cursor to view detailed information about each page.&lt;/li&gt; &lt;/ol&gt;  &lt;h2&gt;Disclaimer&lt;/h2&gt;  &lt;p&gt;This is a freeware tool provided for your fun, education and entertainment. However, there is no warranty of any kind and you use it at your sole risk. The tool is free but offered under the GNU General Public License 3. If successful, and people are interested, I’ll move this work to some sort of open source project.&lt;/p&gt;  &lt;h2&gt;Prerequisites&lt;/h2&gt;  &lt;p&gt;The app requires .NET Framework 4.0 and the SQL Server management tools. I’ve tested it on Windows 7, Windows Server 2008 R2 and Windows 8. It can be run against a database on a local or remote SQL instance. I believe it will work on any database in SQL Server 2005 or later, but have not tested every possible scenario.&lt;/p&gt;  &lt;h2&gt;Risks?&lt;/h2&gt;  &lt;p&gt;I believe this tool to be relatively risk free, but I would avoid running it against live production data. The tool’s data collection is simple: it will issue a few system table selects to get things like object names, and then it will execute a DBCC PAGE statement against every page in the database. All other processing after that is done locally in the application itself. It does not modify the database.&lt;/p&gt;  &lt;h2&gt;Bugs?&lt;/h2&gt;  &lt;p&gt;I would love to hear about bugs you come across, or additional features you think would be valuable. Please contact me through this site. 
Note that I am a DBA first, and an amateur .NET developer a distant second, so please be gentle.&lt;/p&gt;  &lt;p&gt;Enjoy!&lt;/p&gt;</description></item><item><title>Why does SQL Server not compress data on LOB pages?</title><link>http://www2.sqlblog.com/blogs/hugo_kornelis/archive/2013/01/31/why-does-sql-server-not-compress-data-on-lob-pages.aspx</link><pubDate>Thu, 31 Jan 2013 09:08:03 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47406</guid><dc:creator>Hugo Kornelis</dc:creator><description>&lt;p&gt;Enabling compression on your database can save you a lot of space – but when you have a lot of varchar(max) or nvarchar(max) data, you may find the savings to be limited. This is because only data stored on the data and index pages is compressed, and data for the (max) data types is generally stored on other, special-purpose pages – either text/image pages, or row overflow data pages. (See &lt;a href="http://msdn.microsoft.com/en-us/library/ms190969%28SQL.105%29.aspx"&gt;Understanding Pages and Extents&lt;/a&gt; in Books Online. This is from the SQL Server 2008 R2 Books Online, but it is still valid in SQL Server 2012, although this page has apparently been removed from newer Books Online editions.)&lt;/p&gt;  &lt;p&gt;So why does SQL Server not compress the data that, perhaps, would benefit most from compression? Here’s the answer.&lt;/p&gt;  &lt;p&gt;SQL Server currently supports two compression methods for data in the database (backup compression is out of scope for this post).&lt;/p&gt;  &lt;p&gt;* Row compression: This is a simple algorithm to save storage space for individual rows. It has two elements. The first is a more efficient way to store the per-row metadata, saving a few bytes per row regardless of layout and content. The second element is storing almost all data types, even those that have a fixed length, as variable length. 
This mainly has benefits for the larger numerical types (e.g. a bigint with a value of 1,000 is stored in two bytes instead of eight – only values that actually &lt;i&gt;need&lt;/i&gt; all eight bytes do not gain from this, and will instead take up more space because the actual length has to be stored somewhere) and for fixed-length string types with lots of trailing spaces. For Unicode data, the SCSU algorithm is used, which saves 15% to 50% depending on the actual content of the column. (According to &lt;a href="http://en.wikipedia.org/wiki/Standard_Compression_Scheme_for_Unicode"&gt;Wikipedia&lt;/a&gt;, the SCSU standard has gained very little adoption because it is not as effective as other compression schemes).&lt;/p&gt;  &lt;p&gt;See &lt;a href="http://msdn.microsoft.com/en-us/library/cc280576.aspx"&gt;Row Compression Implementation&lt;/a&gt; and &lt;a href="http://msdn.microsoft.com/en-us/library/ee240835.aspx"&gt;Unicode Compression Implementation&lt;/a&gt; in Books Online.&lt;/p&gt;  &lt;p&gt;* Page compression: When enabled, page compression is done *after* row compression. As the name implies, it's done on a per-page basis. It consists of two steps:&lt;/p&gt;  &lt;p&gt;1. Prefix compression. Within each column, the longest common prefix is used to build the &amp;quot;anchor record&amp;quot;. All columns then only indicate how many characters of the anchor value they use as prefix. So for example, if we have a first name column with the values Roger / Hugo / Hugh, the anchor value could be Hugh, and the data values would be stored as {0}Roger / {3}o / {4}. (Here, {3} is stored as a single byte, and {3}o means: first three characters of Hugh, followed by an o).&lt;/p&gt;  &lt;p&gt;2. Dictionary compression. Across the entire page, columns that are now stored with the same bit pattern are replaced with a single value that points to the dictionary entry. 
Let's assume that the same page I used above also has a Lastname column, with values Plowman / Kornelis / Ploo. Here, Plowman would be the anchor value, and the data after prefix compression would be {7} / {0}Kornelis / {3}o. The dictionary encoding would then see that there is a {3}o in the population of the Firstname column and a {3}o in the population of the Lastname column. It would place {3}o as the first entry in the dictionary and replace both {3}o values with the reference [1].&lt;/p&gt;  &lt;p&gt;See &lt;a href="http://msdn.microsoft.com/en-us/library/cc280464.aspx"&gt;Page Compression Implementation&lt;/a&gt; in Books Online.&lt;/p&gt;  &lt;p&gt;All elements of page compression save space by eliminating repeated data between different column values, so they will only work when multiple values are stored on a page. For all LOB pages, the reverse is the case: a single value spans multiple pages. So by definition, page compression can never yield any benefits.&lt;/p&gt;  &lt;p&gt;For row compression, the more efficient storage of per-row metadata naturally only affects pages that have per-row metadata stored – data and index pages, but not LOB pages. And the conversion of fixed length to variable length data types also doesn’t affect LOB pages, since these can only be used for varying length data.&lt;/p&gt;  &lt;p&gt;Based on the above, it is obvious why SQL Server does not compress varchar and varbinary data stored on LOB pages – there would be zero benefit from any of the implemented compression methods. But how about Unicode compression for nvarchar(max) and overflowing nvarchar(&lt;i&gt;nnn&lt;/i&gt;) data? Wouldn’t that save some space?&lt;/p&gt;  &lt;p&gt;To answer that, I now have to go into speculation mode. And I see two possible theories:&lt;/p&gt;  &lt;p&gt;1. 
Because the SCSU standard saves less space than other algorithms, the SQL Server team deliberately made this choice in order to encourage people to compress these large values in the client before sending them to the server, thereby reducing not only storage space (by more than SCSU would have yielded), but also network traffic. The down side of this is that cool features such as &lt;a href="http://msdn.microsoft.com/en-us/library/ms142571.aspx"&gt;Full-Text Search&lt;/a&gt; and &lt;a href="http://msdn.microsoft.com/en-us/library/gg492075.aspx"&gt;Semantic Search&lt;/a&gt; don’t understand data that was compressed at the client – at least not without &lt;a href="http://blogs.msdn.com/b/sqlfts/archive/2011/07/13/getting-a-custom-ifilter-working-with-sql-server-2008-r2-ifiltersample.aspx"&gt;a lot of extra effort&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;2. Since all compression algorithms work on a per-page basis, they had a choice between either first breaking the LOB data into pages and then compressing (which makes no sense, as the rest of the page would remain empty and the amount of space actually used remains the same) or creating a separate algorithm for LOB data to first compress it and then split it over multiple pages. That would of course have cost a lot of extra engineering hours, and if my understanding of SCSU is correct, it would also have a big adverse side effect on operations that affect only a part of an nvarchar(max) value (like &lt;a href="http://msdn.microsoft.com/en-us/library/ms187748%28SQL.105%29.aspx"&gt;SUBSTRING&lt;/a&gt; or the &lt;a href="http://msdn.microsoft.com/en-us/library/ms177523%28SQL.105%29.aspx"&gt;.WRITE method of the UPDATE statement&lt;/a&gt;). 
That is because SCSU works by traversing the entire string from left to right and can’t handle operating on only a subset of the string.&lt;/p&gt;  &lt;p&gt;Bottom line: When you have to store large values and you want to save on storage size, your best course of action is probably to compress and decompress the values on the client side. But do beware the consequences this has for Full Text Search and Semantic Search!&lt;/p&gt;  &lt;p&gt;Final note: I didn’t spend as much time on this blog post as I normally do. That’s because this actually started as a reply to a question on an internet forum, but when I was busy I realized that the reply was long enough to be promoted to a blog post.&lt;/p&gt;</description></item><item><title>Visualizing Data File Layout III</title><link>http://www2.sqlblog.com/blogs/merrill_aldrich/archive/2013/01/29/visualizing-data-file-layout-iii.aspx</link><pubDate>Tue, 29 Jan 2013 05:45:50 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47372</guid><dc:creator>merrillaldrich</dc:creator><description>&lt;p&gt;This is part three of a blog series illustrating a method to render the file structure of a SQL Server database into a graphic visualization.&lt;/p&gt;  &lt;p&gt;Previous Installments:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/archive/2013/01/22/visualizing-data-file-layout-i.aspx"&gt;Part 1&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/archive/2013/01/23/visualizing-data-file-layout-ii.aspx"&gt;Part 2&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Those that have been reading this series might be thinking, “Is he going to go there?” Well, the answer is “Yes.” This is the &lt;strong&gt;GUID clustered index post&lt;/strong&gt; that had to be. 
It’s inevitable with this tool.&lt;/p&gt;  &lt;p&gt;If you follow SQL Server at all, you are probably aware of the &lt;a href="http://www.sqlskills.com/blogs/kimberly/guids-as-primary-keys-andor-the-clustering-key/"&gt;long-standing&lt;/a&gt; &lt;a href="http://www.codinghorror.com/blog/2007/03/primary-keys-ids-versus-guids.html"&gt;debate&lt;/a&gt; about whether it is wise, desirable, smart, useful, or what have you, to identify rows using GUIDs. I won’t take a position on that, but I will show here, I hope objectively, a few things that the visualizer shows about file layout vs. distributed inserts, distributed inserts being one of the main challenges around using GUIDs as clustering keys. Just to recap the argument very, very briefly:&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Advantages&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;GUID keys can be generated at the client, which saves a round-trip to the database server to create a collection of related rows.&lt;/p&gt;  &lt;p&gt;GUID keys can make certain architectures like sharding, or peer to peer replication, or merging multiple source databases, simpler.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Disadvantages&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;GUID keys are wider, therefore they take more space in memory and on disk. The additional space is multiplied by their presence in both clustered and non-clustered indexes if they are a clustering key.&lt;/p&gt;  &lt;p&gt;GUID keys don’t only take more space in RAM and on disk because of their width. They also cause &lt;em&gt;distributed inserts&lt;/em&gt; into the clustered index – that is, new rows are added to any and all pages in the index. Each time a row has to be added, the target page must be read into memory, and at a checkpoint, the &lt;em&gt;whole&lt;/em&gt; changed page (both existing and new rows) must be written to disk. 
This has two effects: &lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;The amount of RAM and disk IO required for inserts is probably much higher, as pages with &lt;em&gt;existing&lt;/em&gt; data must come into cache, get changed, and then be written back out again. Essentially, large parts of the table have to be &lt;em&gt;rewritten&lt;/em&gt; to disk to append rows to pages that have data already.&lt;/li&gt;    &lt;li&gt;The pages that store the index will individually fill up, and have to split such that half the existing rows are written back out to the “old” page and half written out to a “new” page in a different location on disk. This causes the pages to be less full, the same number of rows to require more space on disk and in RAM, and the resulting index to be massively fragmented on disk.&lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;I am not writing to argue these points, which have I think been established by both sides of the debate, only to see if the visualizer shows these effects clearly. Most of the argument isn’t actually about these facts (they are all true, as far as I know) but rather which are more important, and I think that is the main source of debate on the issue.&lt;/p&gt;  &lt;h2&gt;Visual Example of Distributed Inserts&lt;/h2&gt;  &lt;p&gt;It’s very easy to create an example of this with a small sample database. I created one called “VizDemo2.” VizDemo2 has a slightly modified structure to illustrate what’s going on here – I need two tables that are stored separately on disk, so that they cannot interfere with one another. The simplest way to do that is with a couple of file groups containing one file each. 
So here’s the structure:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;I created the database with a 50MB, single file, Primary file group&lt;/li&gt;    &lt;li&gt;I added a file group FG1 with one 75MB file&lt;/li&gt;    &lt;li&gt;I added a second file group FG2 with one 75MB file&lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;When the database is empty, the visualizer shows only the system pages at the start of each file, as shown here:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_01_5BB678D6.png"&gt;&lt;img title="VizDemo2_01" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="VizDemo2_01" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_01_thumb_18DA345C.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;To that database I added two sample tables identical in structure but with different clustering keys:&lt;/p&gt;  &lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;USE &lt;/span&gt;VizDemo2
&lt;span style="color:blue;"&gt;GO

CREATE TABLE &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomersInt  &lt;span style="color:gray;"&gt;( 
    &lt;/span&gt;id &lt;span style="color:blue;"&gt;int identity&lt;/span&gt;&lt;span style="color:gray;"&gt;( &lt;/span&gt;1&lt;span style="color:gray;"&gt;, &lt;/span&gt;1 &lt;span style="color:gray;"&gt;) NOT NULL &lt;/span&gt;&lt;span style="color:blue;"&gt;PRIMARY KEY CLUSTERED&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;buncha &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'A'&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;big &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'B'&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;vals &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;)  &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'C'
&lt;/span&gt;&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;ON &lt;/span&gt;[FG1]&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO&lt;/span&gt;&lt;span style="color:blue;"&gt;
CREATE TABLE &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomersGuid  &lt;span style="color:gray;"&gt;( 
    &lt;/span&gt;id &lt;span style="color:blue;"&gt;uniqueidentifier &lt;/span&gt;&lt;span style="color:gray;"&gt;NOT NULL &lt;/span&gt;&lt;span style="color:blue;"&gt;PRIMARY KEY CLUSTERED DEFAULT &lt;/span&gt;&lt;span style="color:magenta;"&gt;NEWID&lt;/span&gt;&lt;span style="color:gray;"&gt;(), 
    &lt;/span&gt;buncha &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'A'&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;big &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'B'&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;vals &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;)  &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'C'
&lt;/span&gt;&lt;span style="color:gray;"&gt;)&lt;/span&gt;&lt;span style="color:blue;"&gt;ON &lt;/span&gt;[FG2]&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


&lt;p&gt;I’ll populate the two tables and we can see what the file layout looks like afterward:&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;INSERT &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomersInt &lt;span style="color:blue;"&gt;DEFAULT VALUES&lt;/span&gt;&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;INSERT &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomersGuid &lt;span style="color:blue;"&gt;DEFAULT VALUES&lt;/span&gt;&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO &lt;/span&gt;20000&lt;/pre&gt;


&lt;h2&gt;Compare&lt;/h2&gt;

&lt;p&gt;After inserts, the resulting graphic does show some facts we know to be true:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_02_6193AD20.png"&gt;&lt;img title="VizDemo2_02" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="VizDemo2_02" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_02_thumb_7A25DB21.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First, the data in the integer-clustered index takes about eight bands of the diagram, while storing the same data in a GUID clustered index has required about twelve bands of data pages. The database itself supports that impression with space allocation – it reports these figures:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_03_50B199E1.png"&gt;&lt;img title="VizDemo2_03" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="VizDemo2_03" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_03_thumb_49262A74.png" width="737" height="75" /&gt;&lt;/a&gt;&lt;/p&gt;





&lt;p&gt;Part of the extra space required is the width of the key, but part of it is the empty space on each page resulting from page splits. If a page that needs a new row is too full, then half the rows from that page are moved to a net-new page, half left in place, and the new row added to one or the other of the resulting pages. Afterward, they are often both partly empty.&lt;/p&gt;
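
&lt;p&gt;One way to pull comparable space figures for the two sample tables is sketched below. It is not necessarily the query behind the screenshot above; it assumes the demo table names used in this post and reads the documented view sys.dm_db_partition_stats, counting only the clustered index of each table.&lt;/p&gt;

&lt;pre class="code"&gt;USE VizDemo2;
GO

-- Sketch: pages used by the clustered index of each sample table
SELECT  OBJECT_NAME(ps.object_id)          AS table_name,
        SUM(ps.row_count)                  AS row_count,
        SUM(ps.used_page_count)            AS used_pages,
        SUM(ps.used_page_count) * 8 / 1024 AS used_mb     -- pages are 8 KB each
FROM    sys.dm_db_partition_stats AS ps
WHERE   ps.object_id IN (OBJECT_ID(N'dbo.SampleCustomersInt'),
                         OBJECT_ID(N'dbo.SampleCustomersGuid'))
  AND   ps.index_id = 1                                   -- clustered index only
GROUP BY ps.object_id;
GO&lt;/pre&gt;

&lt;p&gt;With the same number of rows in both tables, a higher used_pages value for the GUID table would reflect the partly empty pages left behind by splits.&lt;/p&gt;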

&lt;p&gt;Second, the whole graphic in the GUID clustered index area is a dark blue that the visualizer uses to show fragmentation – in fact, the object is almost perfectly fragmented, with practically no contiguous pages at all. The sequence of pages in the leaf level of the index is still a linked list, as always, but it is physically stored in essentially random order on disk.&lt;/p&gt;
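
&lt;p&gt;The same impression can be cross-checked without the visualizer. The following is a minimal sketch, assuming the demo database and table names from this post, that asks sys.dm_db_index_physical_stats for the fragmentation and page density of the leaf level of the GUID clustered index. SAMPLED mode is used because LIMITED mode leaves the page-density column NULL.&lt;/p&gt;

&lt;pre class="code"&gt;-- Sketch: fragmentation and page density for the GUID clustered index
SELECT  OBJECT_NAME(ips.object_id, ips.database_id) AS table_name,
        ips.avg_fragmentation_in_percent,
        ips.fragment_count,
        ips.avg_page_space_used_in_percent,
        ips.page_count
FROM    sys.dm_db_index_physical_stats(
            DB_ID(N'VizDemo2'),
            OBJECT_ID(N'VizDemo2.dbo.SampleCustomersGuid'),
            1,           -- index_id 1 = the clustered index
            NULL,        -- all partitions
            N'SAMPLED'   -- LIMITED would not report page density
        ) AS ips
WHERE   ips.index_level = 0;   -- leaf level only
GO&lt;/pre&gt;

&lt;p&gt;Running the same query against dbo.SampleCustomersInt gives a baseline to compare against.&lt;/p&gt;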

&lt;h2&gt;Does Re-Indexing Help?&lt;/h2&gt;

&lt;p&gt;The next question is whether we can combat these problems by doing huge amounts of index maintenance – if we rewrite the GUID index, will that make it take less space, or make it more efficient? The answer is, “well, sort of, temporarily.”&lt;/p&gt;

&lt;p&gt;First, re-indexing will put the table in “GUID” order. Whether that really helps performance is debatable, though it would at least enable read-ahead for the index, which is otherwise clobbered by the fragmentation. Second, re-indexing will make the pages denser, or less dense, depending on the fill factor applied. For the sake of demonstration, let’s re-index with the default fill factor (which fills the leaf pages completely), because I think that happens a lot out in the world, and it may tell us something:&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;ALTER INDEX &lt;/span&gt;&lt;span style="color:gray;"&gt;ALL &lt;/span&gt;&lt;span style="color:blue;"&gt;ON &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomersGuid &lt;span style="color:blue;"&gt;REBUILD&lt;/span&gt;&lt;span style="color:gray;"&gt;;&lt;/span&gt;&lt;/pre&gt;


&lt;p&gt;After re-indexing, this is a view just of the second file group with the GUID clustered table (note that I scrolled down in the display):&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_04_3266E2EB.png"&gt;&lt;img title="VizDemo2_04" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="VizDemo2_04" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_04_thumb_1B3B686D.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The arrow shows where the data was moved from the old data pages into a new region of the file. And, sure enough, it’s not fragmented (note the lighter color) and it takes less space in the file.&lt;/p&gt;

&lt;p&gt;That might sound good, but if this is a real database, inserts probably will continue. In the int clustered case, as we know, new data will be appended to the end of the page sequence, but in this case, new data will have to be inserted into most of the existing pages on disk. Those are all full now, and will have to be split 50/50 to create new pages for the new data. Both the old and new pages will have to be written out, and the new pages by definition can’t be in index order with the existing pages.&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;INSERT &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomersGuid &lt;span style="color:blue;"&gt;DEFAULT VALUES&lt;/span&gt;&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO &lt;/span&gt;20000&lt;/pre&gt;


&lt;p&gt;What we get after more rows are added to the table is what a layperson might call a “hot mess”:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_05_1D0BBE34.png"&gt;&lt;img title="VizDemo2_05" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="VizDemo2_05" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo2_05_thumb_7A4A8676.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here everything is fragmented – back to that dark blue – even the pages we just re-indexed a moment ago, because they all split. The table has &lt;em&gt;more than&lt;/em&gt; doubled in size, even though we only doubled the number of rows, because the individual pages contain less data.&lt;/p&gt;

&lt;p&gt;Would an appropriate fill factor be a workaround? In some measure, yes, but it only mitigates the issue rather than solving it. The write activity on the table, even with a low fill factor, will still be higher as more existing pages have to be flushed at checkpoints. The pages will still be less dense, and therefore take up more space on disk and in cache. In short – maybe helpful but no silver bullet.&lt;/p&gt;
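
&lt;p&gt;For reference, a rebuild with an explicit fill factor looks like the sketch below. The value 70 is purely illustrative, not a recommendation; the right number depends on the insert rate and on how often the index is rebuilt.&lt;/p&gt;

&lt;pre class="code"&gt;-- Sketch: leave roughly 30% free space on each leaf page at rebuild time
-- (70 is an assumed, illustrative value)
ALTER INDEX ALL ON dbo.SampleCustomersGuid
REBUILD WITH (FILLFACTOR = 70);
GO&lt;/pre&gt;

&lt;p&gt;The free space delays page splits, but it is paid for up front: the freshly rebuilt index occupies more pages than it would at the default setting.&lt;/p&gt;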

&lt;p&gt;What about Sequential GUIDs? Here I will venture my opinion. Sequential GUIDs have never made sense to me. They solve one part of this problem – the distributed insert part – but &lt;em&gt;at the expense of the very things GUIDs might be good for&lt;/em&gt;, namely not demanding a visit to the database to generate an identifier. If you have to come to the database, you already lost this whole argument. Use an integer and solve the rest of the problem at the same time. I can only see it as a sort of band-aid for existing systems that could not be refactored, but, like a bad SUV that combines the worst properties of a car and a truck, it feels like a really poor compromise to me.&lt;/p&gt;
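
&lt;p&gt;To make the trade-off concrete, here is a sketch of what a sequential GUID key looks like in T-SQL. The table name SampleCustomersSeqGuid and the constraint names are hypothetical additions for illustration. NEWSEQUENTIALID() can only be used in a DEFAULT constraint, so the value is always produced at the server, which is exactly the round trip that client-generated GUIDs were supposed to avoid.&lt;/p&gt;

&lt;pre class="code"&gt;-- Sketch: hypothetical third sample table keyed on a sequential GUID
CREATE TABLE dbo.SampleCustomersSeqGuid (
    id uniqueidentifier NOT NULL
        CONSTRAINT DF_SampleCustomersSeqGuid_id DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_SampleCustomersSeqGuid PRIMARY KEY CLUSTERED,
    buncha char(500) DEFAULT 'A',
    big char(500) DEFAULT 'B',
    vals char(500) DEFAULT 'C'
) ON [FG2];
GO&lt;/pre&gt;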

&lt;p&gt;I hope this helps to illustrate some of the physical database design challenges that surround the use of GUID cluster keys. In the next installment I’m planning to demonstrate the interleaving of objects, which is one argument for multiple file groups.&lt;/p&gt;</description></item><item><title>Visualizing Data File Layout II</title><link>http://www2.sqlblog.com/blogs/merrill_aldrich/archive/2013/01/23/visualizing-data-file-layout-ii.aspx</link><pubDate>Wed, 23 Jan 2013 23:40:02 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47268</guid><dc:creator>merrillaldrich</dc:creator><description>&lt;p&gt;Part 2 of a blog series visually demonstrating the layout of objects on data pages in SQL Server&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/archive/2013/01/22/visualizing-data-file-layout-i.aspx"&gt;Part 1&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;In Part 1 of this series, I introduced a little demo app that renders the layout of pages in SQL Server files by object. Today I’ll put that app through its paces to show, in vivid color (well, teal, anyway) the destructive power of the famous&lt;strong&gt; Re-Index Then Shrink&lt;/strong&gt; anti-pattern for index maintenance.&lt;/p&gt;  &lt;p&gt;This one is very easy to demo, so let’s go!&lt;/p&gt;  &lt;p&gt;First, I created a demo database &lt;strong&gt;VizDemo1&lt;/strong&gt;, with a single 200 MB data file. Into that database I placed a canonical table – highly simplified for this example – clustered on an ever-increasing integer, using identity():&lt;/p&gt;  &lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;USE &lt;/span&gt;VizDemo1
&lt;span style="color:blue;"&gt;GO

CREATE TABLE &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomers  &lt;span style="color:gray;"&gt;( 
    &lt;/span&gt;id &lt;span style="color:blue;"&gt;int identity&lt;/span&gt;&lt;span style="color:gray;"&gt;( &lt;/span&gt;1&lt;span style="color:gray;"&gt;, &lt;/span&gt;1 &lt;span style="color:gray;"&gt;) NOT NULL &lt;/span&gt;&lt;span style="color:blue;"&gt;PRIMARY KEY CLUSTERED&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;buncha &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'A'&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;big &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;) &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'B'&lt;/span&gt;&lt;span style="color:gray;"&gt;, 
    &lt;/span&gt;vals &lt;span style="color:blue;"&gt;char&lt;/span&gt;&lt;span style="color:gray;"&gt;(&lt;/span&gt;500&lt;span style="color:gray;"&gt;)  &lt;/span&gt;&lt;span style="color:blue;"&gt;default &lt;/span&gt;&lt;span style="color:red;"&gt;'C'
&lt;/span&gt;&lt;span style="color:gray;"&gt;);
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO
&lt;/span&gt;&lt;/pre&gt;


&lt;p&gt;Then we populate that table with some dummy data:&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;INSERT &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomers &lt;span style="color:blue;"&gt;DEFAULT VALUES&lt;/span&gt;&lt;span style="color:gray;"&gt;;

&lt;/span&gt;&lt;span style="color:blue;"&gt;GO &lt;/span&gt;40000&lt;/pre&gt;


&lt;p&gt;And finally, fire up the little visualizer app and process the database:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_01_4683F23D.png"&gt;&lt;img title="VizDemo1_01" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;float:none;padding-top:0px;padding-left:0px;margin-left:auto;border-left:0px;display:block;padding-right:0px;margin-right:auto;" border="0" alt="VizDemo1_01" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_01_thumb_53EA0543.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The small color bands at the top left corner of the image are the system tables and such that are in every “empty” database to make it run. The blue/green/teal area is the new table we created and populated with sample data, and the gray area represents empty regions in the file.&lt;/p&gt;

&lt;p&gt;As expected, the table started writing into the first available space, and, because the cluster key is increasing, pages were allocated to the end of the page sequence in order, and we end up with a crazy-perfect, contiguous linked list on disk.&lt;/p&gt;

&lt;p&gt;You can see small darker bars at intervals within the table – most of the pages in the index are “type 1” pages, which are the leaf-level/rows in the clustered index. Those bars are “type 2” index pages that have the upper level(s) of the index. The reason they are darker is that those are a disruption in the leaf level linked list, and the app shades such disruptions as a way to see fragmentation. The list has to “hop over” those pages and then continue on the other side. It’s technically fragmentation, but at this point not harmful at all. Remember that darker color, though: it shows a break in the page order.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A side note: in the midst of the gray area you can see one orange line (and another in the sea of teal). Those are “type 11” PFS pages, which occur at a fixed interval (every 8,088 pages) in every file. I don’t think they ever move – they track file allocation and free space metadata. They are like rocks in the stream…&lt;/p&gt;
&lt;/blockquote&gt;
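
&lt;p&gt;To confirm a page type by hand, the page header can be dumped with DBCC PAGE. This is a minimal sketch only: DBCC PAGE is undocumented and unsupported, so treat it as a tool for a test system. Page 1 of file 1 is used here because the first PFS page of a data file sits there; the m_type field in the header output should read 11.&lt;/p&gt;

&lt;pre class="code"&gt;-- Sketch: dump the header of the first PFS page in VizDemo1
DBCC TRACEON (3604);                 -- route DBCC output to the client session
GO
DBCC PAGE (N'VizDemo1', 1, 1, 0);    -- database, file id, page id, 0 = header only
GO&lt;/pre&gt;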

&lt;p&gt;Now, what happens if we re-index this bad boy? Well, a re-index operation has to write all the pages for the object into new, blank pages in the file, and then abandon the old pages. I run:&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:green;"&gt;-- This &amp;quot;moves&amp;quot; all the data toward the end of the file, into free areas
&lt;/span&gt;&lt;span style="color:blue;"&gt;ALTER INDEX &lt;/span&gt;&lt;span style="color:gray;"&gt;ALL &lt;/span&gt;&lt;span style="color:blue;"&gt;ON &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomers &lt;span style="color:blue;"&gt;REBUILD&lt;/span&gt;&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO&lt;/span&gt;&lt;/pre&gt;


&lt;p&gt;Then re-analyze the file. As expected, the table has “moved” toward the end of the file, and left free space toward the beginning. It’s still not fragmented, because we had enough room, and it was written in order into that new area by the rebuild:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_02_61501849.png"&gt;&lt;img title="VizDemo1_02" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;float:none;padding-top:0px;padding-left:0px;margin-left:auto;border-left:0px;display:block;padding-right:0px;margin-right:auto;" border="0" alt="VizDemo1_02" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_02_thumb_3EFB1381.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see the gray area near the top is all the “abandoned” pages where the index was, and the data has all moved down into the free area. Ah, but that seems wasteful to some people, am I right? All that empty space – the file could be smaller!&lt;/p&gt;

&lt;p&gt;Let’s see the damage that Shrink File does. Imagine that I do this:&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;DBCC &lt;/span&gt;SHRINKFILE &lt;span style="color:gray;"&gt;(&lt;/span&gt;&lt;span style="color:red;"&gt;N'VizDemo1' &lt;/span&gt;&lt;span style="color:gray;"&gt;, &lt;/span&gt;70&lt;span style="color:gray;"&gt;)
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO
&lt;/span&gt;&lt;/pre&gt;
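
&lt;p&gt;The first argument to DBCC SHRINKFILE is the logical file name and the second is the target size in MB. A quick sketch for looking up the logical name and checking how full the file is before choosing a target, assuming the demo database from this post:&lt;/p&gt;

&lt;pre class="code"&gt;USE VizDemo1;
GO

-- Sketch: logical file names with allocated and used size
SELECT  name                                 AS logical_name,
        type_desc,
        size / 128                           AS size_mb,   -- size is counted in 8 KB pages
        FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
FROM    sys.database_files;
GO&lt;/pre&gt;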


&lt;p&gt;First, before we shrink, let’s just scroll down and look at the end of the file:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_03_4E9DAF43.png"&gt;&lt;img title="VizDemo1_03" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;float:none;padding-top:0px;padding-left:0px;margin-left:auto;border-left:0px;display:block;padding-right:0px;margin-right:auto;" border="0" alt="VizDemo1_03" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_03_thumb_10A41E85.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We have two conditions – the gray part of the file is sort of OK to shrink. There’s just a lone PFS page out there, and removing that does no harm. But once we get into that blue area, the data has to be moved back up into the beginning of the file. Here’s where the problem lies, as I learned from Mr. Paul Randal – the shrink routine will move a page at a time back into that free space, starting from the end, going backward. That makes the pages land in approximately &lt;em&gt;reverse order&lt;/em&gt; from the correct index order. Perfect fragmentation. Let’s see if this tool proves him right. Shrink, then re-analyze:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_04_595D9749.png"&gt;&lt;img title="VizDemo1_04" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;float:none;padding-top:0px;padding-left:0px;margin-left:auto;border-left:0px;display:block;padding-right:0px;margin-right:auto;" border="0" alt="VizDemo1_04" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_04_thumb_500462C6.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yep, it’s not immediately apparent, perhaps, but that teal color is a darker shade that indicates every page is a fragment boundary in most of the index – perfect fragmentation! Here’s a better view:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_04a_51D4B88D.png"&gt;&lt;img title="VizDemo1_04a" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;float:none;padding-top:0px;padding-left:0px;margin-left:auto;border-left:0px;display:block;padding-right:0px;margin-right:auto;" border="0" alt="VizDemo1_04a" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_04a_thumb_665A080B.png" width="408" height="284" /&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;So, how can we clean that up? Well, with a rebuild. But … we need that bigger file. In fact, practically any database in production needs this overhead of available space to be able to perform index maintenance. It’s not “wasted” space at all.&lt;/p&gt;

&lt;pre class="code"&gt;&lt;span style="color:blue;"&gt;ALTER INDEX &lt;/span&gt;&lt;span style="color:gray;"&gt;ALL &lt;/span&gt;&lt;span style="color:blue;"&gt;ON &lt;/span&gt;dbo&lt;span style="color:gray;"&gt;.&lt;/span&gt;SampleCustomers &lt;span style="color:blue;"&gt;REBUILD&lt;/span&gt;&lt;span style="color:gray;"&gt;;
&lt;/span&gt;&lt;span style="color:blue;"&gt;GO&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;After the rebuild, the index is back toward the end of the file, but it’s also back in order:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_05_13DB27CF.png"&gt;&lt;img title="VizDemo1_05" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;float:none;padding-top:0px;padding-left:0px;margin-left:auto;border-left:0px;display:block;padding-right:0px;margin-right:auto;" border="0" alt="VizDemo1_05" src="http://sqlblog.com/blogs/merrill_aldrich/VizDemo1_05_thumb_1C5E8719.png" width="1028" height="494" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, in light of this information, imagine nightly re-indexing on a database with … &lt;strong&gt;AutoShrink!&lt;/strong&gt; &amp;lt;shudder&amp;gt;&lt;/p&gt;</description></item><item><title>Visualizing Data File Layout I</title><link>http://www2.sqlblog.com/blogs/merrill_aldrich/archive/2013/01/22/visualizing-data-file-layout-i.aspx</link><pubDate>Tue, 22 Jan 2013 22:50:34 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47250</guid><dc:creator>merrillaldrich</dc:creator><description>&lt;p&gt;Part 1 of a blog series visually demonstrating the layout of objects on data pages in SQL Server&lt;/p&gt;  &lt;p&gt;Some years ago a gentleman called &lt;a href="http://sqlblogcasts.com/blogs/danny/default.aspx"&gt;Danny Gould&lt;/a&gt; created a free tool called &lt;a href="http://internalsviewer.codeplex.com/"&gt;Internals Viewer for SQL Server&lt;/a&gt;. I’m a visual sort of guy, and I always thought it would be fun and educational to make a simple visualizer, like the one he created, in order to view how objects are laid out in SQL Server files, and to use it to demonstrate how operations like re-index and shrink affect the layout of files.&lt;/p&gt;  &lt;p&gt;To that end, and a little bit reinventing the wheel truth be told, I spent this past holiday creating a simple .NET app that renders the file layout of a database into a color-coded bitmap:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer01_48CAE32A.png"&gt;&lt;img title="FileLayoutViewer01" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="FileLayoutViewer01" src="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer01_thumb_586D7EEC.png" width="824" height="376" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Fig 1&lt;/p&gt;  &lt;p&gt;The app can scan the pages in a database, grab the header 
output from DBCC PAGE, parse that, and create a structure with a few key bits of information about every page. It then renders a bitmap from those structures showing a few things (Fig 1):&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;Each data object (index or table) partition is identified with a unique partition ID in SQL Server. Those IDs are used in this tool to color-code the output by object, from a color lookup table. Each color in the example screenshot represents the pages that are dedicated to a single partition of an object. This screenshot shows AdventureWorks, which doesn’t use the Enterprise Edition partitioning feature, so for this case each color represents one object – every object having exactly one partition in Standard Edition (or in databases that don’t use partitioning).&lt;/p&gt;    &lt;p&gt;Unallocated pages are shown as gray gaps. These are regions that are part of the physical file(s), but not used to store anything.&lt;/p&gt;    &lt;p&gt;The app flags pages at the end of any fragment of an object using a darker colored band, so it will reveal any non-contiguous structures in the data file(s). Sometimes these happen at the end of a region of the file where one object is stored, but, interestingly, sometimes these can happen in the middle – as shown in the image above where a dark band interrupts a continuous region of the same color.&lt;/p&gt;    &lt;p&gt;The app has some very basic mouse-over capability where you can run the mouse over the image and the text fields at right will reveal information about the pages, including the object schema.table.index and partition, and also whether the page represents a fragmentation boundary.&lt;/p&gt;    &lt;p&gt;Finally, the app shows what page types are located where in the file using the narrower white/gray/black bands. 
White represents data or index pages, while other shades of gray or black indicate other kinds of system pages, per Paul Randal’s excellent blog post &lt;a href="http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-page/"&gt;here&lt;/a&gt;.&lt;/p&gt; &lt;/blockquote&gt;  &lt;h2&gt;The Pixels Already Tell a Story&lt;/h2&gt;  &lt;p&gt;So, what can we learn about the sample database in this image? Here are a few things:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;The part of the file shown in the bitmap is fairly dense. There aren’t big regions of unallocated space in the file. A gap in the allocated pages looks like this (enlarged):      &lt;br /&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer02_6C869B75.png"&gt;&lt;img title="FileLayoutViewer02" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="FileLayoutViewer02" src="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer02_thumb_1354B1B6.png" width="206" height="194" /&gt;&lt;/a&gt;       &lt;br /&gt;Empty Region       &lt;br /&gt;&amp;#160; &lt;br /&gt;&lt;/li&gt;    &lt;li&gt;Objects in the file are not contiguous, and may “hop around.” That is, if you follow the linked list of pages that compose an index, a bunch of them will be in a row, and then there will be a page that links to the next page composing the index but it’ll be in a different location in the file. I’ve called these “frag boundaries” – pages that do link to another page, but where that next logical page isn’t the next physical page in the file. In the graphic the frag boundary pages are colored with a darker dithered pattern. You can mouse over these and look in the text fields at the right in the app, and see the page they link to.      
&lt;br /&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer03_495BB6DC.png"&gt;&lt;img title="FileLayoutViewer03" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="FileLayoutViewer03" src="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer03_thumb_7029CD1C.png" width="238" height="188" /&gt;&lt;/a&gt;       &lt;br /&gt;Fragment Boundaries       &lt;br /&gt;&amp;#160; &lt;br /&gt;      &lt;br /&gt;Sometimes the end of a fragment will be adjacent to pages from another object, but it can be the case that there’s a fragment boundary in the middle of the pages for one object – it’s just that the linked list goes up to that point in the file, but then the next page in the index (in index order) isn’t the next page in the file, even though the next page in the file is part of the same object. Imagine a page split in the “middle” of an index – the existing page with half the rows stays in place, and a new page with the other half of the rows gets created in the middle of the logical index order but possibly stored in some other location in the physical file.       &lt;br /&gt;&lt;/li&gt;    &lt;li&gt;Right at the very beginning of the file there’s a special sequence of metadata pages that describe the database, allocations, and so on (again, well documented by Paul Randal). 
In our diagram this shows up as a series of pages at top left with varying page-type indicators (the gray and white bands):      &lt;br /&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer04_020660EA.png"&gt;&lt;img title="FileLayoutViewer04" style="border-top:0px;border-right:0px;background-image:none;border-bottom:0px;padding-top:0px;padding-left:0px;border-left:0px;display:inline;padding-right:0px;" border="0" alt="FileLayoutViewer04" src="http://sqlblog.com/blogs/merrill_aldrich/FileLayoutViewer04_thumb_28D4772A.png" width="268" height="206" /&gt;&lt;/a&gt;       &lt;br /&gt;Database and file metadata pages &lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;In the next installment, I’ll run some test databases through this and we can see what more severe fragmentation looks like, the effect of GUID cluster keys, shrink, and how the data moves around in a re-index operation.&lt;/p&gt;  &lt;p&gt;Here’s a short demo video of the mouse-over working (quality is limited by YouTube):&lt;/p&gt;  &lt;div id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:a5c43597-8d84-4a1a-b897-db2f19356e72" class="wlWriterEditableSmartContent" style="float:none;padding-bottom:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;"&gt;&lt;div&gt;&lt;a href="http://www.youtube.com/watch?v=q47a0L0_9Ws" target="_new"&gt;&lt;img src="http://sqlblog.com/blogs/merrill_aldrich/videof83e47a6713d_7D8FE022.jpg" style="border-style:none;" alt=""&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="width:448px;clear:both;font-size:.8em;"&gt;Animated Screen Cap of Mouse-over&lt;/div&gt;&lt;/div&gt;</description></item><item><title>Geek City: Accessing Distribution Statistics</title><link>http://www2.sqlblog.com/blogs/kalen_delaney/archive/2013/01/18/accessing-distribution-statistics.aspx</link><pubDate>Fri, 18 Jan 2013 20:08:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47218</guid><dc:creator>Kalen Delaney</dc:creator><description>&lt;p&gt;Distribution 
statistics are one of the most important sources of information that the Query Optimizer uses to determine a good query plan. In this post, I’m not going to tell you everything about distribution statistics. I’m just going to show you a few tricks for getting access to the statistics.&lt;/p&gt;  &lt;p&gt;If you want a deeper understanding of what the statistics keep track of, and you don’t have any of my &lt;strong&gt;&lt;em&gt;SQL Server Internals&lt;/em&gt;&lt;/strong&gt; books handy, check out this whitepaper: &lt;a href="http://msdn.microsoft.com/en-us/library/dd535534(v=SQL.100).aspx"&gt;Statistics Used by the Query Optimizer in Microsoft SQL Server 2008&lt;/a&gt;&amp;nbsp; &lt;/p&gt;  &lt;p&gt;Microsoft does provide us with a tool called DBCC SHOW_STATISTICS for examining the distribution statistics. &lt;/p&gt;  &lt;p&gt;Microsoft has gradually been making more of the old DBCC commands available as DMVs, even some undocumented ones. For example, one of my favorites, DBCC IND, has now been replaced in SQL Server 2012 by &lt;em&gt;sys.dm_db_database_page_allocations&lt;/em&gt;.&lt;/p&gt;  &lt;p&gt;I have been wishing for several versions that Microsoft would make the DBCC SHOW_STATISTICS information available as a DMV. But it hasn’t happened yet, and I’m tired of waiting, so I decided to do something about it. &lt;/p&gt;  &lt;p&gt;My solution is not quite as easy to use as a DMV might be, but it allows you to get the information that DBCC SHOW_STATISTICS provides into a set of three tables that can then be saved into a more permanent location of your choice, and/or queried as desired. &lt;/p&gt;  &lt;p&gt;DBCC SHOW_STATISTICS returns three sets of information, with different columns in the output, so three different tables are needed. The DBCC SHOW_STATISTICS command can be called with an argument that specifies that you just want one of the three sets returned. 
The options are&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;strong&gt;WITH STAT_HEADER&lt;/strong&gt; – returns basic info such as last update date, and number of rows in the table/index. Also reports number of steps returned for HISTOGRAM section.&lt;/p&gt;    &lt;p&gt;&lt;strong&gt;WITH DENSITY_VECTOR&lt;/strong&gt; – returns density info for each left-based subset of columns in the index. For example, an index on (lastname, firstname, city) would have a density value for (lastname), for (lastname, firstname), and for (lastname, firstname, city). Each density value is a single number, equal to 1 divided by the number of distinct values for that column combination. For example, if there are only 2 distinct values in the column, the density would be 0.5. Multiplying density by the number of rows in the STAT_HEADER section would give the average expected rowcount if a query was executed looking for an equality on the specified column(s). &lt;/p&gt;    &lt;p&gt;&lt;strong&gt;WITH HISTOGRAM&lt;/strong&gt; – returns a set of ordered values from the first column of the index, creating a histogram. This histogram provides the optimizer with selectivity information for specific values or ranges of values in the first column of the index. &lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;To collect this info, I will use one of my favorite tricks, which is to create a table in the &lt;em&gt;master&lt;/em&gt; database with a name starting with sp_. (I’ve written about this trick several times, including in this &lt;a href="http://sqlblog.com/blogs/kalen_delaney/archive/2008/08/11/geek-city-system-objects.aspx"&gt;earlier blog post&lt;/a&gt;.) Once I have the table(s) created, I can access them from any database. 
So here are the three tables:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;USE master;        &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;IF&amp;nbsp; (SELECT object_id('sp_stat_header')) IS NOT NULL        &lt;br&gt;&amp;nbsp; DROP TABLE sp_stat_header;         &lt;br&gt;GO         &lt;br&gt;CREATE TABLE sp_stat_header         &lt;br&gt;(&amp;nbsp;&amp;nbsp; Name sysname,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Updated datetime,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Rows bigint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Rows_sampled bigint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Steps smallint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Density numeric (10,9),         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Average_key_length smallint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; String_index char(3),         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Filter_expression nvarchar(1000),         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Unfiltered_rows bigint);         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;     &lt;br&gt;&lt;font face="Consolas"&gt;IF&amp;nbsp; (SELECT object_id('sp_density_vector')) IS NOT NULL        &lt;br&gt;&amp;nbsp; DROP TABLE sp_density_vector;         &lt;br&gt;GO         &lt;br&gt;CREATE TABLE sp_density_vector         &lt;br&gt;(&amp;nbsp; all_density numeric(10,8),         &lt;br&gt;&amp;nbsp;&amp;nbsp; average_length smallint,         &lt;br&gt;&amp;nbsp;&amp;nbsp; columns nvarchar(2126) );         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;IF&amp;nbsp; (SELECT object_id('sp_histogram')) IS NOT NULL        &lt;br&gt;&amp;nbsp; DROP TABLE sp_histogram;         &lt;br&gt;GO         &lt;br&gt;CREATE TABLE sp_histogram         &lt;br&gt;(&amp;nbsp;&amp;nbsp; RANGE_HI_KEY sql_variant,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; RANGE_ROWS bigint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
EQ_ROWS bigint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; DISTINCT_RANGE_ROWS bigint,         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; AVG_RANGE_ROWS real);&amp;nbsp; -- an average, so it can be fractional         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;The second trick is to use INSERT … EXEC to execute a DBCC statement and populate the tables. I will build the DBCC command dynamically, after capturing the schema, table and index names in variables. You of course could take this code and turn it into a stored procedure, for which the schema, table and index names are passed as parameters. I’ll use as an example a table in the &lt;em&gt;AdventureWorks2008&lt;/em&gt; sample database, just so you can try running the code, and I can verify that it actually works!&lt;/p&gt;  &lt;p&gt;I will use the table &lt;em&gt;Sales.SalesOrderDetail&lt;/em&gt; and the index&lt;em&gt; IX_SalesOrderDetail_ProductID&lt;/em&gt;. So the object name (@oname) is &lt;em&gt;SalesOrderDetail&lt;/em&gt;, the schema name (@sname) is &lt;em&gt;Sales&lt;/em&gt;, and the index name (@iname) is &lt;em&gt;IX_SalesOrderDetail_ProductID.&lt;/em&gt;&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SET NOCOUNT ON;        &lt;br&gt;USE AdventureWorks2008;         &lt;br&gt;GO         &lt;br&gt;DECLARE @oname sysname,&amp;nbsp; @iname sysname, @sname sysname;&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;SELECT @oname = 'SalesOrderDetail',&amp;nbsp; @sname = 'Sales', @iname = 'IX_SalesOrderDetail_ProductID';        &lt;br&gt;&amp;nbsp; &lt;br&gt;-- Update the object name to include the schema name, because that is the format the DBCC command expects         &lt;br&gt;SELECT @oname = @sname +'.' 
+ @oname;         &lt;br&gt;&lt;/font&gt;&lt;/p&gt;   &lt;font face="Consolas"&gt;TRUNCATE TABLE sp_stat_header;      &lt;br&gt;INSERT INTO sp_stat_header       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; EXEC ('DBCC SHOW_STATISTICS(['+ @oname + '],' + @iname +') WITH STAT_HEADER'); &lt;/font&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;TRUNCATE TABLE sp_density_vector;        &lt;br&gt;INSERT INTO sp_density_vector         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; EXEC ('DBCC SHOW_STATISTICS(['+ @oname + '],' + @iname +') WITH DENSITY_VECTOR');&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;TRUNCATE TABLE sp_histogram;        &lt;br&gt;INSERT INTO sp_histogram         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; EXEC ('DBCC SHOW_STATISTICS(['+ @oname + '],' + @iname +') WITH HISTOGRAM');&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;      &lt;p&gt;So now you can look at the values collected and filter or query in any way, or use SELECT INTO to save them into another table, so the &lt;em&gt;sp_&lt;/em&gt; tables can be used the next time you want to capture distribution statistics information. 
&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT * FROM sp_stat_header;        &lt;br&gt;        &lt;br&gt;SELECT * FROM sp_density_vector;         &lt;br&gt;        &lt;br&gt;SELECT * FROM sp_histogram;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;    &lt;p&gt;Let me know if you find this useful, and especially if you embellish it to create a procedure or an automated process of your own!&lt;/p&gt;  &lt;p&gt;Thanks!&lt;/p&gt;  &lt;p&gt;&lt;font color="#ff0080" size="4"&gt;&lt;strong&gt;~Kalen&lt;/strong&gt;&lt;/font&gt;&lt;/p&gt;</description></item><item><title>Geek City: What Triggered This Post?</title><link>http://www2.sqlblog.com/blogs/kalen_delaney/archive/2012/12/31/what-triggered-this-post.aspx</link><pubDate>Tue, 01 Jan 2013 00:00:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46911</guid><dc:creator>Kalen Delaney</dc:creator><description>&lt;p&gt;I’d really like to get another post up onto my much-neglected blog before the end of 2012. This will also start one of my New Year’s resolutions, which is to write at least one blog post a month. I’m going to tell you about a change in SQL Server that wasn’t announced in any “What’s New” list that I ever saw, perhaps because it was just a change in internal behavior, and nothing that required any change in user applications. &lt;/p&gt;  &lt;p&gt;Do you retest what you know is true for every new version? When I update my books, I do test all the scripts, but if there isn’t a script, I don’t retest every ‘fact’ that I have known for years is true. And sometimes, things change. And sometimes my reviewers notice those unreported changes, and sometimes they don’t. &lt;/p&gt;  &lt;p&gt;You might be aware of the fact that SQL Server can perform UPDATE operations in two different ways. 
The UPDATE can be performed as a two-step process: delete the old row and then insert a whole new row; or the UPDATE can be performed (much more efficiently) as an update-in-place.&amp;nbsp; When the two-step UPDATE is performed, it is a LOT more work. Not only does SQL Server have to log the entire old row and the entire new row, but each nonclustered index is also modified twice, and each of those index changes also has to be logged. So it’s nice when an update-in-place is done, because only the bytes changed are logged, and only indexes on the updated columns are affected. &lt;/p&gt;  &lt;p&gt;Prior to SQL Server 7, there were actually four different ways that UPDATE could be done. The two-step UPDATE had some variations that could make it even slower in some cases! But that was a long time ago, so I’m not going to go into the details now. But I will say that back then, in order to get an update-in-place to occur, there was a big long list of prerequisites that had to be met and if you missed just one, you’d get one of the slower UPDATE operations. &lt;/p&gt;  &lt;p&gt;As of SQL Server 7, update-in-place became the default. The only time it doesn’t happen is when the row can’t stay in the same location (such as when you update a clustered index key column) or when SQL Server really needs the old and new versions of the row. &lt;/p&gt;  &lt;p&gt;In SQL Server 7, one of the places that SQL Server needed the old and new versions of the updated rows was when processing triggers. Triggers need the transaction log to get the contents for the DELETED and INSERTED pseudo-tables. And because triggers needed the entire old and new versions of the updated rows, the UPDATE was performed as a two-step operation: DELETE the old row, log the entire old row, and then INSERT the new row with the new values, and log the entire new row. &lt;/p&gt;  &lt;p&gt;But as of SQL Server 2005, we now have the version store, primarily used for SNAPSHOT isolation, but available for other uses as well. 
In SNAPSHOT isolation, the version store stores ‘old versions’ of rows that have been updated or deleted.&amp;nbsp; I knew that the version store was also used for triggers, but it only occurred to me just recently that maybe, because the old and new versions of the row were not needed from the log, perhaps UPDATEs did not always need to be performed internally as a two-step UPDATE. &lt;/p&gt;  &lt;p&gt;So I decided to test it out.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;-- DEMO: If there is an UPDATE trigger, are updates logged as DELETE + INSERT?        &lt;br&gt;-- First build a new database.&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;USE master;        &lt;br&gt;GO         &lt;br&gt;IF (SELECT db_id('TestTrigger')) IS NOT NULL         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; DROP DATABASE TestTrigger;         &lt;br&gt;GO         &lt;br&gt;CREATE DATABASE TestTrigger;         &lt;br&gt;GO         &lt;br&gt;ALTER DATABASE TestTrigger SET RECOVERY SIMPLE;         &lt;br&gt;GO         &lt;br&gt;SELECT db_id('TestTrigger');         &lt;br&gt;GO &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;USE TestTrigger;        &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;-- Just for a warmup, look at the function fn_dblog, which works in the current database&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;SELECT * FROM fn_dblog(null, null);        &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- Create a new table to work with        &lt;br&gt;IF (SELECT object_id('objects')) IS NOT NULL         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; DROP TABLE objects;         &lt;br&gt;GO         &lt;br&gt;SELECT TOP 100 * INTO objects FROM sys.objects;         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- Create a clustered index on the table        &lt;br&gt;CREATE CLUSTERED 
INDEX objects_clustered on objects(name);         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- First examine an update we know is NOT done in place,        &lt;br&gt;-- i.e. updating a clustered key value&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;UPDATE objects SET name = 'newrowsets' WHERE name = 'sysrowsets';        &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- Look at last 10 rows; notice a LOP_DELETE_ROWS and LOP_INSERT_ROWS        &lt;br&gt;-- The AllocUnitName column shows the object affected is the clustered index on dbo.objects         &lt;br&gt;SELECT Operation, [Transaction ID], AllocUnitName FROM fn_dblog(null, null);         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;-- Now examine an update we know is done in place,        &lt;br&gt;-- i.e. updating an unindexed column on a table with no triggers         &lt;br&gt;UPDATE objects SET parent_object_id = 1 WHERE name = 'sysfiles1';         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- Look at last 3 rows; notice a LOP_MODIFY_ROW on the dbo.objects allocation unit        &lt;br&gt;SELECT Operation, [Transaction ID], AllocUnitName FROM fn_dblog(null, null);         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- Create an update trigger        &lt;br&gt;-- Will the update be done with the simple LOP_MODIFY_ROW or with the LOP_DELETE_ROWS and LOP_INSERT_ROWS?         &lt;br&gt;CREATE TRIGGER trg_update_objects ON objects FOR UPDATE         &lt;br&gt;AS         &lt;br&gt;SELECT * FROM DELETED; SELECT * FROM INSERTED;         &lt;br&gt;RETURN;         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;-- Now perform update again        &lt;br&gt;UPDATE objects SET parent_object_id = 10 WHERE name = 'sysfiles1';         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    
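&lt;p&gt;&lt;font face="Consolas"&gt;-- Optional sketch, not part of the original script: narrow the log output to the        &lt;br&gt;-- row-level operations on dbo.objects so the newest records are easier to spot.         &lt;br&gt;-- Assumes the allocation unit names start with 'dbo.objects', as created above.         &lt;br&gt;SELECT Operation, [Transaction ID], AllocUnitName         &lt;br&gt;FROM fn_dblog(null, null)         &lt;br&gt;WHERE AllocUnitName LIKE 'dbo.objects%'         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND Operation IN ('LOP_MODIFY_ROW', 'LOP_DELETE_ROWS', 'LOP_INSERT_ROWS');         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt;    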
&lt;p&gt;&lt;font face="Consolas"&gt;-- Look at last 3 rows; notice a LOP_MODIFY_ROW        &lt;br&gt;SELECT * FROM fn_dblog(null, null);         &lt;br&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Since the database is in the SIMPLE recovery model, you can issue a CHECKPOINT before each UPDATE if you want to reduce the number of rows in the log to make it easier to examine. &lt;/p&gt;  &lt;p&gt;So it seems that I need to update my course and some of my writings. There might also be special cases that still require that a two-step UPDATE be performed in the presence of triggers, but it seems like a two-step UPDATE is not ALWAYS required anymore. That is very good news!&lt;/p&gt;  &lt;p&gt;I hope you all have a wonder-filled and joyous New Year!&lt;/p&gt;  &lt;p&gt;&lt;font color="#800040" size="4"&gt;~Kalen&lt;/font&gt;&lt;/p&gt;</description></item></channel></rss>