We all know that putting too many indexes (I’m talking about non-clustered indexes only, of course) on a table may produce performance problems, due to the overhead that each index brings to all insert/update/delete operations on that table.
But how much? I mean, we all agree – I think – that, generally speaking, having many indexes on a table is “bad”. But how bad can it be? How much will performance degrade? And on a concurrent system, how much can this situation also hurt SELECT performance? If SQL Server takes more time to update a row in a table because of the number of indexes it also has to update, it also means that locks will be held longer, slowing down the perceived performance of all queries involved.
I was quite curious to measure this, also because when teaching it’s by far more impressive and effective to show attendees a chart with the measured impact, so that they can really “feel” what it means!
To do the tests, I’ve created a script that creates a table (with a clustered index on the primary key, which is an identity column), loads 1,000 rows into the table (using a single insert of 1,000 rows instead of issuing 1,000 single-row inserts, in order to minimize the transaction-handling overhead that would otherwise be added), and measures the time taken to do it.
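To give an idea of what such a harness looks like, here’s a minimal sketch of the timed-insert part; the table name, column names and column types are illustrative assumptions of mine, not the ones used in the downloadable script.

```sql
-- Hypothetical test table: clustered index on an identity primary key,
-- plus a few fixed-width columns used to control the row size.
CREATE TABLE dbo.TestTable
(
    Id   INT IDENTITY(1,1) NOT NULL,
    Col1 CHAR(20) NOT NULL,
    Col2 CHAR(20) NOT NULL,
    Col3 CHAR(20) NOT NULL,
    Col4 CHAR(20) NOT NULL,
    CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED (Id)
);
GO

-- Insert 1,000 rows with a single statement and measure the elapsed time.
DECLARE @start DATETIME2 = SYSDATETIME();

INSERT INTO dbo.TestTable (Col1, Col2, Col3, Col4)
SELECT TOP (1000) 'a', 'b', 'c', 'd'
FROM sys.all_objects AS o1
CROSS JOIN sys.all_objects AS o2;

SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS ElapsedMs;
```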
The process is then repeated 16 times, each time adding a new non-clustered index on the table, using columns from the table in a round-robin fashion (see the sketch below). Tests are done against different row sizes, so that it’s possible to check whether performance changes depending on row size.
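The index-adding loop could look something like the following sketch; again, the object names are hypothetical and the real script may build its indexes differently.

```sql
-- After each timed insert batch, add one more non-clustered index, cycling
-- through the table's columns in a round-robin fashion (4 columns assumed here).
DECLARE @i INT = 1, @col SYSNAME, @sql NVARCHAR(MAX);

WHILE @i <= 16
BEGIN
    -- Pick the next column in round-robin order.
    SET @col = N'Col' + CAST(((@i - 1) % 4) + 1 AS NVARCHAR(10));

    SET @sql = N'CREATE NONCLUSTERED INDEX IX_TestTable_'
             + CAST(@i AS NVARCHAR(10))
             + N' ON dbo.TestTable (' + QUOTENAME(@col) + N');';
    EXEC sys.sp_executesql @sql;

    -- ...re-run the timed 1,000-row insert here and record the result...

    SET @i += 1;
END;
```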
The results are interesting, although expected. This is the chart showing how much time it takes to insert 1,000 rows into a table that has from 0 to 16 non-clustered indexes.
Each test has been run 20 times in order to get an average value. Outlier values, caused by unpredictable performance fluctuations due to other machine activity, have been removed.
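One possible way to clean the measurements, assuming the raw timings are saved into a results table (the TestResults table below is my assumption, not part of the original script), is to drop the fastest and slowest run for each configuration before averaging:

```sql
-- Average the 20 runs per index count after discarding the single fastest
-- and slowest measurement, to reduce the effect of outliers.
WITH Ranked AS
(
    SELECT IndexCount,
           ElapsedMs,
           ROW_NUMBER() OVER (PARTITION BY IndexCount ORDER BY ElapsedMs ASC)  AS rnAsc,
           ROW_NUMBER() OVER (PARTITION BY IndexCount ORDER BY ElapsedMs DESC) AS rnDesc
    FROM dbo.TestResults
)
SELECT IndexCount,
       AVG(CAST(ElapsedMs AS DECIMAL(10, 2))) AS AvgElapsedMs
FROM Ranked
WHERE rnAsc > 1
  AND rnDesc > 1
GROUP BY IndexCount
ORDER BY IndexCount;
```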
The test shows that in a table with a row size of 80 bytes, 1,000 rows can be inserted in 9.05 msec if no indexes are present on the table, and that value grows to 88 (!!!) msec when you have 16 indexes on it.
This means a performance impact of 975%: inserts take almost ten times as long. That’s *huge*!
Now, what happens if we have a bigger row size? Say that we have a table with a row size of 1,520 bytes. Here’s the data, from 0 to 16 indexes on that table:
In this case we need nearly 22 msec to insert 1,000 rows into a table with no indexes, but more than 500 msec if the table has 16 active indexes! Now we’re talking about a 2,410% impact on performance: roughly 24 times slower!
Now we have a tangible idea of the impact of having (too?) many indexes on a table, and of how row size also affects performance. That’s why the golden rule of OLTP databases, “few indexes, but good ones”, is so true! (And in fact, last week I saw a database with tables with a 1,700-byte row size and 23 (!!!) indexes on them!)
This also means that overly heavy denormalization is really not a good idea (we’re always talking about OLTP systems, keep that in mind), since performance gets worse as the row size increases.
So, be careful out there, and keep in mind that “equilibrium” is the key word for a database professional: equilibrium between read and write performance, between normalization and denormalization, between too few and too many indexes.
Tests were done on a VMware Workstation 7 VM with 2 CPUs and 4 GB of memory. The host machine is a Dell Precision M6500 with an i7 Extreme X920 quad-core HT 2.0 GHz CPU and 16 GB of RAM. The database is stored on an Intel X25-E SSD, uses the Simple recovery model, and runs on SQL Server 2008 R2. If you want to run the tests on your own, you can download the test script here: Open TestIndexPerformance.sql