In the last months a lot of discussion has been done around the topic of GUIDs and their usage as Clustered Keys. We had a very “hot” discussion on that several years in Italy, and looking on the web you can see that this kind of question surface gain periodically. Recently I discussed it also with my friend Thomas. Discussion that gave me some food for thought, as usual. So with this post I’d like to share my findings, hoping to write a blog post that can help to better understand this phenomenon that is always an hot topic, in any DBA and DEV discussion.
Actually, the question is only a part of the more broader topic of “fragmentation and performance”, that is – IMHO – much more interesting. It is well known that if you try to insert rows in a table as fast as possible, and you’re using an IDENTITY as your clustered index, this, instead of helping, will pose a barrier to the performance you can reach. A nice whitepaper explain the problem in detail:
Diagnosing and Resolving Latch Contention on SQL Server
Not only that, ORM like Hibernate or the Entity Framework loves to work with GUID since in that case you can created values without having to do a round-trip to SQL Server, allowing developers to be more free and independent from SQL Server, build a cleaner and easier to maintain code. (At least this is the common idea, of course I don’t agree with it at all, since it cover only one side of the problem: the developer point of view. But this is another story.)
Also Microsoft uses it *a lot*. I you ever worked with Dynamics CRM you know that GUID simply pours out of the server. They are everywhere.
So, are they a good thing or a bad one?
As usual, forget to have simple, catch-all, answer. Computer science is so wide that it’s almost impossible to have a black or white approach. But luckily we work in a field where numbers and measurement still have a lot of importance, so I did some test to really understand how much fragmentation can impact the performance of a query.
Here’s the test I did and the results they show.
First of all I created two tables
and loaded them with ostress:
ostress -Slocalhost -E -dIndexTest -Q"insert into dbo.IndexTest3F default values" -oC:\FT\xxx\TMP\ -q -n70 -r50000
After that I reindex the table IndexTest3N so that I can do some test with a badly fragmented (IndexTest3F) and a total unfragmented (IndexTest3N) table using the following command:
alter index pk__3N on dbo.IndexTest3N rebuild with (fillfactor = 50)
I used fillfactor 50 to try to keep the page count for both table at the same level. As you can notice, and as quite obvious from the row size, this is not possible, since exactly two rows can fit in any page. For some reason, without reindexing the table, the fill factor is lower than expected.
Unfragmented Table Tests
To make sure that Buffer Cache was completely empty, I ran the following commands:
And than I searched for a value I know it wasn’t in the table, using a non-indexed column, so that a table scan would be the only possible plan. Here’s the results, with I/O and TIME statistics set to on:
This is a Fast Track machine (32 Cores, no HT, 256 GB Ram, 4 P2000 SANs) where I expect 6GB/Sec of transfer rate. Doing the math, this is the exact throughput I got
Monitoring the Avg.Disk Bytes/Reads counter, I noticed that SQL Server was able to do I/O up to 512 Kb, as I would expect in this situation: SQL Server is doing an Index-Order Scan, and pages are all in the correct physical order.
Forcing an Allocation-Order Scan, using a TABLOCK hint, shows the very same results.
If table is all in memory, so we’re not doing any physical I/O, performance are phenomenal, as one would expect. Scanning of all 13GBs of data, happens to take little more than 100msec, with a CPU a little over 3000 msec.
So far so good. Now let’s see what happens if we’re using an heavily fragmented table
Fragmented Table Tests
As soon as we run the same reference query again, we have some very different results:
The query took more than 10 seconds instead of the original 2! This means being 5 time slower, with a throughput of less than 2GB/Sec. Due to such high fragmentation, and since it’s using the Index-Order scan strategy, SQL Server was never able to do I/O bigger than 64Kb, bringing the performance bar down a lot.
Helping SQL Server to do an Allocation-Order scan with a TABLOCK hint helps a bit:
but still performance a much lower than the one obtained with a non-fragmented table.
Now, let’s see if something changes also with warm cache. We all know that memory is so fast, when compared to spindles, that fragmentation should not give any problem. Well, the tests shows something very different:
Of course performance is great here if compared with cold-cache results, after all we’re still getting an answer only after 500msec. But look at CPU time. It’s 16 SECONDS! 16 seconds instead of 3, again, 5 time (and more) more expensive than the non-fragmented table, even when in memory.
I must admit that I wasn’t expecting such difference. Differences that disappear when going for the Allocation-Order scan, where fragmentation is not key factor anymore, since SQL Server doesn’t follow the double-linked list found at the leaf leave of index pages:
I ran the very same test also on a much cheaper machine (the Fast Track machine used here has a cost, in hardware only, of more the 100.000$), not optimized for DWH workload and using SSD as a storage, and without any SAN, but using DAS.
The test machine was a VM on my notebook, with 2 vCore, 8GB of RAM and Samsung 840 SSD for the storage.
Results showed the very same behavior.
Well, it’s pretty clear that fragmentation has a non-trivial impact on table scan performances. So is a non-sequential GUID such a bad thing? Here’s my opinion, give the test I ran:
Do It When
if you need a table where you only need to persist data and you need to have a very high insert rate (persist = insert row and then access only by PK, and only for singleton lookups), then GUID or any other better way to avoid contention in the last page is good. In my test on the Fast Track I could only reach 15.000 Batch/Sec with the usual IDENTITY on the ID column, while I went up to 50.000 Batch/sec with the GUID. In both cases I had 140 concurrent connections (using ostress.exe) doing a very simple INSERT…DEFAULT VALUES and CPU usage were around 20%. With GUID bottleneck was then the PFS page. I only have one filegroup (in addition to the PRIMARY), since this is a DWH machine and was configured to optimize that specific workload, so I’m sure that on a proper OLTP configured machine, the saturation point is much after that value.
Don’t Do It When
if you are doing anything else where you cannot foretold and be 100% sure that you’re never going to do a table scan, I wouldn’t recommend the usage of a value (GUID or not) that brings your table to a 99% fragmentation guaranteed. For example if you’re developing and ERP or accounting system, or a website or anything else where the optimization of insert-rate is not your primary goal…well, think twice before adopting GUID blindly.
I’m sure this won’t be the last post on that subject, but at least there are now some numbers that we can use to discuss on it. And I would really love to see the SQL family act as scientists would do (aren’t we Data Scientists? ): do that same tests on your machines, post the results and let’s discuss them!