THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Merrill Aldrich

File Layout Viewer vs. Drop Clustered Index

I had a very educational exchange the other day on Twitter (also known as “getting schooled”) by my SQL compatriots Argenis Fernandez (@DBArgenis) and Hugo Kornelis (@Hugo_Kornelis). A guy called Chris Switalski (@crswit) asked a really interesting question on #sqlhelp, which I proceeded to answer incorrectly, which led to a correction by my friends, which in turn took me back into the SQL File Layout Viewer, which quickly uncovered a bug there.

So, learn from your friends on Twitter!

Anyway, here’s the question, and corrected demo using an updated SQL File Layout Viewer build.

Question (paraphrasing):

When you create a clustered index on a heap, the data from the heap is rewritten and ordered, and the upper levels of the index are added. If you drop that clustered index, what happens? Does the set of leaf-level pages stay in place, and the index pages just get unlinked and deallocated? Or is the data rewritten again? Does it stay in the same logical or physical order, even though the ordering requirement is removed?

I had said that I imagine the data is rewritten in both cases, but that we could look and see to be sure. I was totally wrong – the leaf level of a clustered index is made of data pages, and a heap is made of data pages too, as Argenis pointed out. It turns out that, since they are not different structures down at that level, in order to drop a clustered index, the server will indeed just remove the upper levels of the index and leave all the leaf level data pages in place. Cool optimization! It turns out that my original release of the File Layout Viewer doesn’t always display this situation correctly, so I had to make some changes. This demo, though, I believe shows the behavior correctly.

Answer:

First, we make a test database and create a heap, and populate it with some dummy data:

USE master
IF EXISTS ( SELECT 1 FROM sys.databases WHERE name = 'HeapTest' ) DROP DATABASE HeapTest;
CREATE DATABASE HeapTest;
GO

ALTER DATABASE HeapTest SET RECOVERY SIMPLE;
GO
ALTER DATABASE HeapTest MODIFY FILE ( NAME = N'HeapTest', SIZE = 102400KB );
GO
ALTER DATABASE HeapTest MODIFY FILE ( NAME = N'HeapTest_log', SIZE = 10240KB );
GO

USE HeapTest;

CREATE TABLE SampleData ( 
    Data uniqueidentifier DEFAULT NEWID(), 
    Padding char(100) DEFAULT ( CAST( CHECKSUM( NEWID() ) as CHAR(100) ) ) 
);
GO

INSERT SampleData DEFAULT VALUES;
GO 100000

If we run the File Layout Viewer, we see a nice, contiguous set of data pages (type 1) in teal:

FLVDropClust01

Next, we’ll add a NON clustered index, just so we can track what happens to that structure too:

CREATE NONCLUSTERED INDEX Nonclust ON SampleData ( Padding );

That shows up in the next part of the data file, in bright blue:

FLVDropClust02

Now if we make a clustered index on the heap, it will have to be rewritten into another region of the file, so that it can be stored in the right logical order. We should expect it to “move” in the visualization down into the empty part of the file. The non clustered index will also be rewritten, because it will switch from using physical row locations to point to data in the heap to using the clustering key to point to data in the new clustered index. Let’s see what it looks like:

CREATE CLUSTERED INDEX Clust ON SampleData ( Data );

The new clustered index and new version of the NON clustered index are both created in another region of the data file:

FLVDropClust03

You can see, in lighter gray, the regions that were emptied where the two structures had been stored, and the two new structures now in the center of the graphic.

Now for the part of the question I got wrong: what happens when we drop that clustered index?

DROP INDEX SampleData.Clust;

This is really neat: the leaf level of the clustered index (remember data pages are data pages) stays right where it is, and the upper level index pages are just deallocated/unlinked.

The NON clustered index does have to be rebuilt yet again to change those clustered index keys out for row location pointers that point to the rows in the heap, so those pages do get rewritten into yet another region of the file:

FLVDropClust04

So, to summarize:

When you add or remove a clustered index from a table, any NON clustered indexes always have to be rebuilt - and will be automatically - to exchange row pointers for clustering keys in the non clustered index itself. This generally takes care of itself, but it can be a performance issue, or cause longer running transactions or a large amount of logging.

When you CREATE a clustered index on a heap, the data has to be rewritten to enforce the logical order of the index.

When you DROP a clustered index, however, the server can keep all the leaf level pages in place and just strip away the index levels above.

Published Monday, June 03, 2013 12:41 AM by merrillaldrich
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Patrick said:

Very interesting post. Thanks

June 11, 2013 8:46 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement