What happens when NUMA affinity meets query parallelism? It gets no respect!
SQL Server allows you to affinitize a TCP port to a specific NUMA node or a group of NUMA nodes. Books Online has an article on How to: Map TCP/IP ports to NUMA Nodes. And this BOL article discusses various NUMA affinity scenarios.
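For reference, port-to-node affinity is not set with T-SQL but in SQL Server Configuration Manager, by appending a node bitmask in brackets to the TCP Port value. As a sketch of the syntax described in that BOL article (node 0 is the least significant bit, so node 1 corresponds to mask 0x2), affinitizing port 50001 to node 1 would look like:

```text
TCP Port:  50001[0x2]
```

A mask covering several nodes works the same way; for example, 0x6 would span nodes 1 and 2.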
Recently, I have been playing with NUMA affinity on various servers with hardware NUMA, such as those with Intel X5690 and Intel E7-4870 processors, running SQL Server 2008 R2 RTM (10.50.1600) and SQL Server 2008 R2 SP1 (10.50.2500), both Enterprise X64 Edition. Although the feature largely behaved as advertised, I did run into some surprises! In this post and a future post, I'd like to document what surprised me. Perhaps it's just a misunderstanding on my part, but I want to throw it out there in case someone has a good explanation.
When I was running OLTP workloads with maxdop set to 1, everything behaved exactly as documented. In other words, when I affinitized port 50001 to NUMA node 1 and connected all the clients to the SQL Server 2008 R2 instance on port 50001, I was able to verify that only the processors on node 1 were kept busy. In addition, I loaded all the data from the test table via the connection to port 50001 on the server, and was able to confirm that all the pages were loaded into memory local to node 1. This was expected, as the NUMA node had 64GB of physical memory, far more than enough to hold the entire table, which was about 10GB in size. Moreover, I was pleased to see only local memory access. Again, this was expected because all the connections from the OLTP workload driver were made on port 50001, which was affinitized to node 1.
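If you want to check this kind of node-local caching yourself, one way is to group the buffer descriptors by NUMA node. A minimal sketch, assuming SQL Server 2008 R2 (where `sys.dm_os_buffer_descriptors` exposes a `numa_node` column) and a hypothetical test database named `TestDB`:

```sql
-- Count the cached pages of the test database per NUMA node.
-- If affinity worked as expected, (nearly) all pages should sit on one node.
SELECT numa_node,
       COUNT(*)            AS cached_pages,
       COUNT(*) * 8 / 1024 AS cached_mb    -- pages are 8KB each
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID('TestDB')
GROUP BY numa_node
ORDER BY numa_node;
```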
Then, I ran into two surprises. I’ll report a big surprise in this post and report the other one in a follow-up post.
When I set the reporting query maxdop to 2, 10, or 20 (or anything other than 1), things started to go beyond what I could find in SQL Server Books Online. For instance, with the query maxdop set to 2, I found that SQL Server appeared to avoid the affinitized NUMA node rather than favor it. In all my tests, I consistently observed that the query submitted to port 50001 was processed by nodes other than the affinitized node (in this case, node 1). The following charts offer some samples of the supporting evidence.
In the above chart, option(maxdop 2) was set, and the processors on node 2 were observed to be busy. The Remote node page lookups/sec counter shows that all the page lookups were remote page lookups. That’s because the pages were cached in memory local to node 1, but the processors on node 2 were doing the work.
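Beyond the perfmon charts, you can also confirm from T-SQL which node a query's workers landed on, by joining the task and scheduler DMVs. A sketch, with session id 53 as a placeholder for the session running the reporting query:

```sql
-- Show which scheduler, and hence which NUMA node, each task of a
-- given session is running on (parent_node_id is the node of the scheduler).
SELECT t.session_id,
       t.scheduler_id,
       s.parent_node_id
FROM sys.dm_os_tasks AS t
JOIN sys.dm_os_schedulers AS s
  ON t.scheduler_id = s.scheduler_id
WHERE t.session_id = 53;
```

With maxdop 2 you would expect to see the parallel workers on two schedulers, and the surprise described above is that their `parent_node_id` is not the affinitized node.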
The next chart shows a second run of the same query still with option (maxdop 2). In this case, SQL Server decided to process the query with the processors on node 3.
Similarly, the next two charts show that the query with option (maxdop 2) was processed on a node other than node 1.
The point is that although the query was submitted via port 50001, and port 50001 was affinitized to NUMA node 1, SQL Server consistently chose not to use node 1 to process the query. This does not seem to be the expected behavior: nowhere in BOL can I find any statement that the NUMA affinity feature applies only to non-parallel plans. If you spot it in BOL, please let me know.
I also tested other degrees of parallelism, and the story is the same. For instance, with maxdop set to 10, the following chart shows that all 10 cores on node 2 were used when I'd expect SQL Server to use the 10 cores on node 1.
With maxdop set to 20, most of the cores, including the cores on node 1, were used, as shown in the following chart. Note that on node 1 it was local node page lookups, whereas on the other nodes it was all remote page lookups.
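The same local-versus-remote page lookup counters shown in the charts can be pulled from T-SQL as well, which is handy if you don't have perfmon running. A sketch for SQL Server 2008 R2, where these counters live under the Buffer Node object and `instance_name` identifies the NUMA node (note the DMV reports cumulative values, so you'd sample twice and take the difference to get a rate):

```sql
-- Cumulative local vs. remote page lookups per NUMA node.
SELECT instance_name AS numa_node,
       counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Node%'
  AND counter_name IN ('Local node page lookups/sec',
                       'Remote node page lookups/sec')
ORDER BY instance_name, counter_name;
```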
This behavior came as a big surprise. To me, this is a serious bug because it defeats the very purpose of the NUMA affinity feature. But hey, you never know. After all, how many times have you run into a bug only to be told it’s a feature by design? In this case, I have no idea. Hopefully, someone in the know can shed some light.