THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Davide Mauri

A place for my thoughts and experiences the Microsoft Data Platform

GUID, fragmentation and performance: the eternal flame.

In the last months a lot of discussion has been done around the topic of GUIDs and their usage as Clustered Keys. We had a very “hot” discussion on that several years in Italy, and looking on the web you can see that this kind of question surface gain periodically. Recently I discussed it also with my friend Thomas. Discussion that gave me some food for thought, as usual. So with this post I’d like to share my findings, hoping to write a blog post that can help to better understand this phenomenon that is always an hot topic, in any DBA and DEV discussion.

Actually, the question is only a part of the more broader topic of “fragmentation and performance”, that is – IMHO – much  more interesting. It is well known that if you try to insert rows in a table as fast as possible, and you’re using an IDENTITY as your clustered index, this, instead of helping, will pose a barrier to the performance you can reach. A nice whitepaper explain the problem in detail:

Diagnosing and Resolving Latch Contention on SQL Server

Not only that, ORM like Hibernate or the Entity Framework loves to work with GUID since in that case you can created values without having to do a round-trip to SQL Server, allowing developers to be more free and independent from SQL Server, build a cleaner and easier to maintain code. (At least this is the common idea, of course I don’t agree with it at all, since it cover only one side of the problem: the developer point of view. But this is another story.)

Also Microsoft uses it *a lot*. I you ever worked with Dynamics CRM you know that GUID simply pours out of the server. They are everywhere.

So, are they a good thing or a bad one?

As usual, forget to have simple, catch-all, answer. Computer science is so wide that it’s almost impossible to have a black or white approach. But luckily we work in a field where numbers and measurement still have a lot of importance, so I did some test to really understand how much fragmentation can impact the performance of a query.

Here’s the test I did and the results they show.

Test Setup

First of all I created two tables


and loaded them with ostress:

ostress -Slocalhost -E -dIndexTest -Q"insert into dbo.IndexTest3F default values" -oC:\FT\xxx\TMP\ -q -n70 -r50000

After that I reindex the table IndexTest3N so that I can do some test with a badly fragmented (IndexTest3F) and a total unfragmented (IndexTest3N) table using the following command:

alter index pk__3N on dbo.IndexTest3N rebuild with (fillfactor = 50)

I used fillfactor 50 to try to keep the page count for both table at the same level. As you can notice, and as quite obvious from the row size, this is not possible, since exactly two rows can fit in any page. For some reason, without reindexing the table, the fill factor is lower than expected.


Unfragmented Table Tests

To make sure that Buffer Cache was completely empty, I ran the following commands:


And than I searched for a value I know it wasn’t in the table, using a non-indexed column, so that a table scan would be the only possible plan. Here’s the results, with I/O and TIME statistics set to on:


This is a Fast Track machine (32 Cores, no HT, 256 GB Ram, 4 P2000 SANs) where I expect 6GB/Sec of transfer rate. Doing the math, this is the exact throughput I got


Monitoring the Avg.Disk Bytes/Reads counter, I noticed that SQL Server was able to do I/O up to 512 Kb, as I would expect in this situation: SQL Server is doing an Index-Order Scan, and pages are all in the correct physical order.

Forcing an Allocation-Order Scan, using a TABLOCK hint, shows the very same results.

If table is all in memory, so we’re not doing any physical I/O, performance are phenomenal, as one would expect. Scanning of all 13GBs of data, happens to take little more than 100msec, with a CPU a little over 3000 msec.


So far so good. Now let’s see what happens if we’re using an heavily fragmented table

Fragmented Table Tests

As soon as we run the same reference query again, we have some very different results:


The query took more than 10 seconds instead of the original 2! This means being 5 time slower, with a throughput of less than 2GB/Sec. Due to such high fragmentation, and since it’s using the Index-Order scan strategy, SQL Server was never able to do I/O bigger than 64Kb, bringing the performance bar down a lot.

Helping SQL Server to do an Allocation-Order scan with a TABLOCK hint helps a bit:


but still performance a much lower than the one obtained with a non-fragmented table.

Now, let’s see if something changes also with warm cache. We all know that memory is so fast, when compared to spindles, that fragmentation should not give any problem. Well, the tests shows something very different:


Of course performance is great here if compared with cold-cache results, after all we’re still getting an answer only after 500msec. But look at CPU time. It’s 16 SECONDS! 16 seconds instead of 3, again, 5 time (and more) more expensive than the non-fragmented table, even when in memory.

I must admit that I wasn’t expecting such difference. Differences that disappear when going for the Allocation-Order scan, where fragmentation is not key factor anymore, since SQL Server doesn’t follow the double-linked list found at the leaf leave of index pages:


Other Tests

I ran the very same test also on a much cheaper machine (the Fast Track machine used here has a cost, in hardware only, of more the 100.000$), not optimized for DWH workload and using SSD as a storage, and without any SAN, but using DAS.

The test machine was a VM on my notebook, with 2 vCore, 8GB of RAM and Samsung 840 SSD for the storage.

Results showed the very same behavior.


Well, it’s pretty clear that fragmentation has a non-trivial impact on table scan performances. So is a non-sequential GUID such a bad thing? Here’s my opinion, give the test I ran:

Do It When
if you need a table where you only need to persist data and you need to have a very high insert rate (persist = insert row and then access only by PK, and only for singleton lookups), then GUID or any other better way to avoid contention in the last page is good. In my test on the Fast Track I could only reach 15.000 Batch/Sec with the usual IDENTITY on the ID column, while I went up to 50.000 Batch/sec with the GUID. In both cases I had 140 concurrent connections (using ostress.exe) doing a very simple INSERT…DEFAULT VALUES and CPU usage were around 20%. With GUID bottleneck was then the PFS page. I only have one filegroup (in addition to the PRIMARY), since this is a DWH machine and was configured to optimize that specific workload, so I’m sure that on a proper OLTP configured machine, the saturation point is much after that value.

Don’t Do It When
if you are doing anything else where you cannot foretold and be 100% sure that you’re never going to do a table scan, I wouldn’t recommend the usage of a value (GUID or not) that brings your table to a 99% fragmentation guaranteed. For example if you’re developing and ERP or accounting system, or a website or anything else where the optimization of insert-rate is not your primary goal…well, think twice before adopting GUID blindly.

I’m sure this won’t be the last post on that subject, but at least there are now some numbers that we can use to discuss on it. And I would really love to see the SQL family act as scientists would do (aren’t we Data Scientists? Winking smile): do that same tests on your machines, post the results and let’s discuss them!

Published Sunday, April 06, 2014 7:48 PM by Davide Mauri

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS



John Paul Cook said:

How about guid performance in an in memory database, which is latch free?

April 6, 2014 2:29 PM

Jeremiah Peschka said:

How about a realistic row size instead of engineering a workload that has limited resemblance to reality?

It's easy to prove that a Ferrari is bad at hauling bricks. It's also easy to prove that a truck is a bad race cat. A well engineered benchmark is far note important than an arbitrary one that sends to exist only to prove your point.

April 6, 2014 9:22 PM

Jeremiah Peschka said:

And thanks to auto complete, I talked about "race cats" instead of "race cars".

April 6, 2014 9:37 PM

Davide Mauri said:

@John: yeah, the test are done on classic b+tree clustered indexes. With Hekaton the story is totally different since it is optimised for singleton access and for really heavy write intensive workload. If using a Hash Index, fragmentation, if we can call it so, is strictly related to the dimension of your hash bucket, as already showed by Klaus here:

It would be interesting to test the Range Index on Hekaton, to check if an how fragmentation also happen here and which impact has on performances

April 7, 2014 2:33 AM

Davide Mauri said:

@Jeremia: I don't think that changing the row size will change the results. All that matters in fragmentation is page order, which is not affected at all by row size. Even with a smaller row size (say, 300bytes) GUID will generate 99% of fragmentation.

It would be interesting to understand what you mean by "well engineered benchmark" and I ask you to provide one, done by you, that shows the reality is opposite, as you are implying by your comment.

Follow a scientific method, please, and show us the numbers. Otherwise there is not basis for discussion, do you think so?

PS I decided to use fat rows just to be able to show (in a session I delivered recently in Italy) that GUID + wide rows table can bring problems faster than one would expect, even on SSD drives that, on the contrary of what a lot of people think, are not the solution to all I/O problems. Design still matter. A lot.

April 7, 2014 2:42 AM

Thomas Kejser said:

Nice test Davide. Did you try to run the test without steaming the result to the client? Having to do the logical/physical lookup to send the row if of cores more expensive - even in memory.

Also - as you mention, allocation order scans are about the same when in memory. In other words: building hash tables (the typical case for table scan in OLTP) is unaffected by frag.

How about concurrency? Try the unfragmented scans with many queries in parallel (the typical case)

April 7, 2014 4:36 AM

Greg Linwood said:

Discussions like this about insert fragmentation resulting from the randomness of GUIDs often conveniently leave out the equally important discussion about randomness of NCIX keys, which is something that cannot be engineered out of a database design by simply changing types.

You might be able to change the insert behaviour of the table's physical storage but this isn't so easy with NCIX keys which have widely varying degrees of insertion randomness depending on the individual DB

And if you really want fast row inserts, try using a HEAP which limits  insert fragmentation to NCIXs which can't be avoided anyhow

April 7, 2014 5:04 AM

Davide Mauri said:

@Thomas: the result set was always an empty set...for a very simple SELECT statement as the one used in the test, it should be the same of discarding results (in term of performance), isn't it?

@Greg you're right.

I'll do the test you both suggest as soon as I can work again on the test machine.

April 7, 2014 7:46 AM

Andy Yun said:

Did you consider the additional I/O cost of Page Splits & Transaction Log, when using a GUID as a Clustering Key?  Depending on your activity pattern, the cost of bad page splits will be a far worse tradeoff than fighting latch contention.  

April 7, 2014 11:50 AM

Greg Linwood said:

Andy - GUIDs are random & therefore distributes evenly over the free space in pages allocated by fillfactor.

This means that GUIDs can actually cause LESS Page Splits than other clustering key choices & fillfactor can be managed more easily when insert rates are well known & index maintenance can be scheduled.

For example, if you know that 30% fillfactor will cater for 3 months of inserts, you can rebuild your CIX with 30% & be fairly confident that no page splits will occur from inserts in that period

The randomness of GUIDs can actually be used to minimise Page Splits in this way

April 7, 2014 12:36 PM

Andy Warren said:

And I was just about to look up race cat

April 16, 2014 9:43 AM

DWalker said:

Do these discussions ever mention sequential GUIDs?  That will make a difference in page fragmentation.  At least they should be considered.

April 23, 2014 11:33 AM

DWalkerf said:

What are NCIX keys?  I couldn't find out for sure using a web search.  Numeric clustered index?

March 11, 2015 5:18 PM

Davide Mauri said:

NCIX is just the way I named the NonClustered Index :)

March 12, 2015 7:31 AM

Leave a Comment


About Davide Mauri

Director of Software Development & Cloud Infrastructure @ Sensoria, an innovative smart garments and wearable company. After more than 15 year playing with the Microsoft Data Platform, with a specific focus on High Performance databases, Business Intelligence, Data Science and Data Architectures, he's now applying all his skills to IoT, defining architectures to crunch numbers, create nice user experiences and provide meaningful insights, all leveraging Microsoft Azure cloud. MVP on Data Platform since 2006 he has a very strong background development and love both the ER model and OO principles. He is also a fan of Agile Methodology and Automation, which he tries to apply everywhere he can, to make sure that "people think, machines do".

This Blog


Privacy Statement