THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Benjamin Nevarez

How the Query Optimizer Uses Statistics – Part II

I got a question from a reader of my post How the Query Optimizer Uses Statistics ( and I thought that it would be a good idea to post my answer here. Basically the request was to expand the previous example for a predicate with two columns.


First, a reminder that the histogram only shows the values of the first column of the statistics object. So, how the Query Optimizer does this?


One way to find the answer to this is, well, just to run an example and see what the Query Optimizer does. Let us run a query, see which statistics are automatically created and inspect those statistics.


This is the code sent by the reader. Run this to create a table and populate it with some data.


create table MyStatsTest (

      id int identity(1, 1),

      ProductGroupID int,

      ProductID int



declare @i int

set nocount on

set @i = 1

while @i < 100000


insert MyStatsTest (ProductGroupID, ProductID)

select datepart(millisecond, getdate()) % 2, datepart(millisecond, getdate())

      set @i = @i + 1



Make sure you start with no statistics objects in the table. You can run this to verify that there are no statistics


select * from sys.stats

where object_id = object_id('MyStatsTest')


If there are some statistics, perhaps after running a query, you can drop them using a command like this (the name of your statistics objects may be different)


drop statistics MyStatsTest._WA_Sys_00000002_7D78A4E7


Run the first query to see the statistics created automatically by the Query Optimizer.


select * from MyStatsTest

where ProductId < 17 and ProductGroupId = 1


Now you can run the previous query again and notice that two statistics objects were created. Use DBCC SHOW_STATISTICS to display both histograms like in the next example


dbcc show_statistics('MyStatsTest', _WA_Sys_00000003_014935CB)


You can use the first histogram to estimate the number of records for the ProductId < 17 predicate using the method I described in my previous post. I got 1,744 rows which is the sum of the values 392, 302, 313, 419 and 318 (Your table has different data so you will get a different value).




Using the second histogram you can notice that 60.0316003% of the records have ProductGroupid = 1 (This is 60,031 divided by the total number of records 99,999).






Since this is using the AND operator you need to obtain the 60.0316003% of 1,744. This is 1,046.95 which is the estimated number of rows shown on the execution plan of the query.




You can do the same calculation for ProductGroupId = 0 and you will get an estimated number of rows of 697.049 which are also shown on the execution plan. Notice that you may need to run DBCC FREEPROCCACHE between tests to clear the plan cache, otherwise it may show a cached execution plan.


Finally, let us change the query to use OR operator (instead of AND)


select * from MyStatsTest

where ProductId < 17 or ProductGroupId = 1


The plan shows 60,728 estimated rows. Since this is using an OR operator this value could be obtained as the union of the following two queries


select * from MyStatsTest

where ProductId < 17   




select * from MyStatsTest

where ProductId >= 17 and ProductGroupId = 1


The first query estimates 1,744 rows and the second one 58,984, for a total of 60,720 rows. You can use the previous method to estimate these values.


The same value could also be obtained as the union of these two queries


select * from MyStatsTest

where ProductId < 17 and ProductGroupId = 0




select * from MyStatsTest

where ProductGroupId = 1


The first one shows 697.049 estimated rows and the second one 60,031, for a total of 60,728.049 estimated rows.

Published Friday, September 11, 2009 11:32 PM by Ben Nevarez
New Comments to this post are disabled

About Ben Nevarez

Benjamin Nevarez is a database professional based in Los Angeles, CA, and author of "Inside the SQL Server Query Optimizer". He has also contributed to other SQL Server books including "SQL Server 2012 Internals". Benjamin has 20 years of experience with relational databases and has been working with SQL Server since version 6.5. He holds a Master’s Degree in Computer Science and has been a speaker at many SQL Server conferences, including the PASS Summit and SQL Server Connections. Benjamin's blog is at and can be reached by e-mail at admin at benjaminnevarez dot com and on twitter at @BenjaminNevarez.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement