A post I’ve been meaning to write for a while, good that it fits with this month’s T-SQL Tuesday, hosted by Joey D’Antoni (@jdanton)
Ever since I got into databases, I’ve been a fan. I studied Pure Maths at university (as well as Computer Science), and am very comfortable with Set Theory, which undergirds relational database concepts. But I’ve also spent a long time as a developer, and appreciate that that databases don’t exactly fit within the stuff I learned in my first year of uni, particularly the “Algorithms and Data Structures” subject, in which we studied concepts like linked lists. Writing in languages like C, we used pointers to quickly move around data, without a database in sight. Of course, if we had a power failure all this data was lost, as it was only persisted in RAM. Perhaps it’s why I’m a fan of database internals, of indexes, latches, execution plans, and so on – the developer in me wants to be reassured that we’re getting to the data as efficiently as possible.
Back when SQL Server 2005 was approaching, one of the big stories was around CLR. Many were saying that T-SQL stored procedures would be a thing of the past because we now had CLR, and that obviously going to be much faster than using the abstracted T-SQL. Around the same time, we were seeing technologies like Linq-to-SQL produce poor T-SQL equivalents, and developers had had a gutful. They wanted to move away from T-SQL, having lost trust in it. I was never one of those developers, because I’d looked under the covers and knew that despite being abstracted, T-SQL was still a good way of getting to data. It worked for me, appealing to both my Set Theory side and my Developer side.
CLR hasn’t exactly become the default option for stored procedures, although there are plenty of situations where it can be useful for getting faster performance.
SQL Server 2014 is different though, through Hekaton – its In-Memory OLTP environment.
When you create a table using Hekaton (that is, a memory-optimized one), the table you create is the kind of thing you’d’ve made as a developer. It creates code in C leveraging structs and pointers and arrays, which it compiles into fast code. When you insert data into it, it creates a new instance of a struct in memory, and adds it to an array. When the insert is committed, a small write is made to the transaction to make sure it’s durable, but none of the locking and latching behaviour that typifies transactional systems is needed. Indexes are done using hashes and using bw-trees (which avoid locking through the use of pointers) and by handling each updates as a delete-and-insert.
This is data the way that developers do it when they’re coding for performance – the way I was taught at university before I learned about databases. Being done in C, it compiles to very quick code, and although these tables don’t support every feature that regular SQL tables do, this is still an excellent direction that has been taken.