Columnstore index in SQL Server 2014 contains 2 new important features: it can be clustered and it is updateable. So I decided to play with both. As a “control group” I’ve taken my old columnstore index demo from one of the ISUG (Israeli SQL Server Usergroup) sessions. The script itself isn’t important – it creates partition function with 7 partitions (actually 8 but one remains empty), table on it and populates the table with 63 million rows – 9 million in each partition. So I used the same script for SQL 2014 but this time created clustered columnstore index on empty table. So population should have updated the index. I had 2 issues to check:
2) According to the following article, columnstore index is impossible to update “in-place” due to its highly compressed structure. So “deltastores” are used. What I wanted to check is whether metadata accuracy is affected.
Regarding sizing, as expected, columnstore index provides better compression than anything else. Table in SQL 2012 contained clustered index that used page compression and non-clustered columnstore index. Clustered index size was 3.8GB, non-clustered columnstore index “weighted” 3.04GB. Clustered columnstore index in SQL 2014 “weighted” 2.87GB only. So instead of 6.84GB in SQL 2012, in SQL 2014 we’ve used only 2.87GB of diskspace.
Now to the metadata part. While script for table population had been running in SQL 2012, both sp_spaceused and rows column in sys.partitions showed pretty accurate picture. Not so with SQL 2014 when clustered columnstore index is being updated. Here is the output of sp_spaceused and from sys.partitions after population script finished to run and inserted 63 million rows into a table (7 partition, 9 million rows in each one):
Not even close, especially sp_spaceused. By the way, you can notice that I even tried to use second parameter of sp_spaceused: @updateusage. Nothing happened. As anticipated, what fixed the problem was index rebuild. That’s what we indeed have in the table:
So beware that the deltastore algorithm that is used for updating columnstore index leaves metadata very inaccurate, especially sp_spaceused. So don’t rely on it unless Microsoft finds a way to fix it.
Next question, much more important for SQL Server proper functioning is whether statistics also enter such a weird condition after massive columnstore index update. If so that’s not a “no go”, it just means that manual update of statistics is required after every massive columnstore index update.