
Elisabeth Redei

Lies, Damned Lies And Statistics – Part I

One of the biggest call generators at Microsoft SQL Server Support when SQL Server 7.0 was released was issues with statistics. In the beginning, they tended to be about the Autoupdate statistics feature interfering with user activity, but when SQL Server 2000 arrived and systems running SQL Server grew larger, issues with skewed statistics/data, and problems related to the sample rate used when gathering statistics, emerged. In SQL Server 2005 the challenges are the same, and although there are some new statistics features in 2008, overall this post applies to 2008 as well.

First, a few words on the symptoms that are typical of statistics-related issues. What they all have in common is seemingly strange variations in duration when the server has no resource problems and you have confirmed that there is no blocking:

- The same query (with the same values in the search argument or WHERE clause) varies in duration without any related blocking or resource problem

- Some queries degrade in performance as time passes since the last Index rebuild

- The same procedure performs differently depending on what values are passed into it

- Intermittent performance problem on some tables without any related blocking or resource problem

Although all of these symptoms may be related to other issues, I always look into the quality of the statistics before I spend time on more time-consuming troubleshooting.

Background - Statistics Default Settings

Even though the default settings (which are Autoupdate and Autocreate Statistics ON for all tables and indexes) will take you a long way, it is absolutely necessary to take a more granular approach as your databases and your workload grow. There is no tuning that can work around a lack of accurate statistics. The problems related to statistics (or the lack thereof) are either about:

- How often the statistics are updated

- The sample rate used when the statistics are built
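
Before digging into either, it is worth confirming how a given database is actually configured. A minimal check against sys.databases (SQL Server 2005 and later; on SQL Server 2000 you would query DATABASEPROPERTYEX instead):

SELECT name,
       is_auto_create_stats_on,        -- Autocreate Statistics
       is_auto_update_stats_on,        -- Autoupdate Statistics
       is_auto_update_stats_async_on   -- asynchronous updates (discussed in the comments below)
FROM sys.databases
WHERE name = DB_NAME();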

One thing that you might not have thought about is when Autoupdate statistics will be triggered. A lot of people will say it is when a certain threshold for changes is reached (more details below), but that's not the whole truth – it does not automatically start when the threshold is reached. It is part of the compilation/optimization process and will only be triggered when/if the threshold has been reached AND the index/column is touched by a query. The thresholds are really Recompilation thresholds and not Statistics thresholds.

Simplified, it works like this:

[Figure: flow of query compilation checking the Recompilation Threshold and triggering an automatic statistics update; based on a diagram in http://technet.microsoft.com/en-us/library/cc966425.aspx]

There are some other processes that will trigger an update of the statistics as well. And not only that: they will trigger an update of the statistics WITH FULLSCAN (i.e. using all rows in the index to build the statistics, as opposed to Autoupdate statistics, which uses some sample rate):

- DBCC DBREINDEX (not DBCC INDEXDEFRAG)

- CREATE INDEX …. WITH DROP_EXISTING

If you do any of the above on an index, you will get the best statistics possible, and it is not too uncommon to see cases where switching from an index rebuild to UPDATE STATISTICS … WITH FULLSCAN gives the same performance boost. It is also important to know that once the Autoupdate statistics process runs, it will bring the sample rate of the statistics back down to the default.

If your maintenance window is getting crowded, and index rebuilds are part of your maintenance plans, you might want to experiment with a combination of index rebuilds and statistics updates (needless to say, updating statistics is not going to do anything for fragmentation, should that be an issue).
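
As a sketch of what that combination could look like (the table and index names here are hypothetical placeholders):

-- Rebuild only the indexes where fragmentation demands it; a rebuild gives
-- fullscan statistics as a side effect (SQL Server 2005 syntax; DBCC DBREINDEX on 2000):
ALTER INDEX IX_BigTable_Col1 ON dbo.BigTable REBUILD;

-- Elsewhere, refresh the statistics only: the same statistics quality for far
-- less work, but it does nothing for fragmentation:
UPDATE STATISTICS dbo.BigTable IX_BigTable_Col2 WITH FULLSCAN;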

Now to the reasons why Autoupdate statistics might not work well in all situations:

1. Statistics get out-of-date, or stale

This is a problem related to large tables: it takes more and more modifications for the Recompilation Threshold to be met. The formula for tables with more than 500 rows is:

rowmodctr >= 500 + 20% of total rows (rowcnt)

(the two columns, rowmodctr and rowcnt, can be found in the sys.sysindexes compatibility view).
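
To get a feel for how close each table is to its threshold, a quick sketch against the same view (the threshold is different for tables with fewer than 500 rows, which are ignored here):

SELECT object_name(id) AS tblName,
       rowcnt,
       rowmodctr,
       500 + 0.20 * rowcnt AS RecompileThreshold,
       CASE WHEN rowmodctr >= 500 + 0.20 * rowcnt THEN 'yes' ELSE 'no' END AS ThresholdReached
FROM sys.sysindexes
WHERE id > 100          -- skip system objects
  AND indid IN (0, 1)   -- heap or clustered index, i.e. the table itself
  AND rowcnt > 500;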

The problem that follows is that the Optimizer will have somewhat misleading information about the cardinality of tables and indexes (i.e. a WHERE or JOIN clause returns more or fewer rows than expected).

2. Again for large tables, the default sample rate (used by the Autoupdate statistics feature) may not be enough, because it gets relatively lower as the tables grow larger (to keep the Autoupdate statistics process from taking too long).

3. Data in the column is naturally "skewed" (e.g. a LastName column: Smith vs. Redei)

4. The Autoupdate Statistics process interrupts your user activity

How do you diagnose and address these issues? In this post I will explain how to do that for the first two items in the list above: stale statistics and problems with the default sample rate.

Stale statistics
===========
To diagnose, you would typically look at two things:

- When is the last time the statistics were updated?

- What is the ratio rowmodctr/rowcnt (for tables/indexes with rowcnt > some threshold)?

The solution to this is quite simple; you just need to schedule manual UPDATE STATISTICS jobs on the tables when a threshold that you deem reasonable is met:

SELECT
      ssi.id
    , object_name(ssi.id) AS tblName
    , stats_date(ssi.id, ssi.indid) AS StatsDate
    , ssi.indid
    , ssi.rowcnt
    , ssi.rowmodctr
    , cast(ssi.rowmodctr AS decimal) / cast(ssi.rowcnt AS decimal) AS ChangedRowsRatio
    , ss.no_recompute AS IsAutoUpdateOff
FROM sys.sysindexes ssi
LEFT JOIN sys.stats ss ON ssi.name = ss.name
WHERE ssi.id > 100
AND   ssi.indid > 0
AND   ssi.rowcnt > 500
-- cast to avoid integer division, which would truncate any ratio below 1 to 0
AND   (cast(ssi.rowmodctr AS decimal) / cast(ssi.rowcnt AS decimal)) > 0.15 -- enter a relevant number
ORDER BY 3

You would probably want to change the ssi.rowcnt and the ratio to something that is relevant for your database.

The output will look something like this:

[Figure: sample output of the query above, showing StatsDate, rowcnt, rowmodctr, ChangedRowsRatio and IsAutoUpdateOff per index]

In this case, I have statistics on a non-clustered index (indid > 1) which have not been updated for 3 years. IsAutoUpdateOff = 0 tells me that the Autoupdate statistics feature is ON for this particular index (this can also be viewed, or changed, with the sp_autostats procedure).

Once you have confirmed that stale statistics is an issue, you have two options:

- Update the statistics for all indexes in the database using sp_updatestats (if Autoupdate statistics is turned on, there is no point in using the 'resample' option of sp_updatestats)

- Update statistics only for the tables you have identified, using UPDATE STATISTICS (use UPDATE STATISTICS <tablename> WITH ALL if you want to update all statistics on a table)
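
In T-SQL, the two options look like this (dbo.BigTable is a placeholder for a table you identified with the query above):

-- Option 1: all statistics in the database
EXEC sp_updatestats;

-- Option 2: all index and column statistics on one identified table
UPDATE STATISTICS dbo.BigTable WITH ALL;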

Low Sample Rate
==============

This is slightly trickier than the problem with stale statistics, both to diagnose and to solve. The way you would know that sample rate is an issue is if a query that you think has suspiciously high duration runs faster after you have run UPDATE STATISTICS ... WITH FULLSCAN on the tables involved (or after an index rebuild, which gives you statistics with a full scan).

It involves finding the point when the sample rate is high enough to create useful statistics but low enough for the process of updating statistics to fit into your maintenance window. You also need to decide how often the job should run.

Your starting point should be the current sample rate which you will find with DBCC SHOW_STATISTICS:

DBCC SHOW_STATISTICS ([schemaname.tablename], indexname) WITH STAT_HEADER

Since this output cannot be queried directly as table results, I have attached a script that will present the output from DBCC SHOW_STATISTICS for all indexes on all tables and schemas in the current database. The output will look something like this:

[Figure: STAT_HEADER output collected per index, including the Rows and Rows Sampled columns]

What you want to look for is large discrepancies between Rows and RowsSampled (in the output above, a full scan was used to create the statistics, so the numbers are equal).
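
If you just want to check a single index, a minimal sketch of the same idea as the attached script follows. Note that the STAT_HEADER column list varies between versions (SQL Server 2008 adds Filter Expression and Unfiltered Rows, which also comes up in the comments below), so adjust the temp table to your build; the table and index names are placeholders:

-- Capture STAT_HEADER into a temp table so Rows vs. Rows Sampled can be queried
CREATE TABLE #stat_header
(
    [Name] sysname,
    [Updated] datetime,
    [Rows] bigint,
    [Rows Sampled] bigint,
    [Steps] int,
    [Density] float,
    [Average key length] float,
    [String Index] varchar(10)
);

INSERT #stat_header
    EXEC ('DBCC SHOW_STATISTICS ([dbo.BigTable], IX_BigTable_Col1) WITH STAT_HEADER');

SELECT [Name], [Rows], [Rows Sampled],
       cast(100.0 * [Rows Sampled] / [Rows] AS decimal(5, 1)) AS SamplePct
FROM #stat_header;

DROP TABLE #stat_header;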

As mentioned, for the large tables where you believe this might be an issue, you should start with some number above the default sample rate (RowsSampled/Rows) to get an idea of how long it takes to update the statistics with a given sample rate. Then move the sample rate up until the execution time of UPDATE STATISTICS hits the limit given by your maintenance window.
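
One way to time a candidate rate (placeholder names again; SAMPLE can be expressed in PERCENT or ROWS):

DECLARE @start datetime;
SET @start = GETDATE();

UPDATE STATISTICS dbo.BigTable IX_BigTable_Col1 WITH SAMPLE 25 PERCENT;

SELECT DATEDIFF(ss, @start, GETDATE()) AS DurationInSeconds;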

Once you have found your ideal sample rate, you then need to set up jobs to run UPDATE STATISTICS ... WITH SAMPLE xx PERCENT, NO_RECOMPUTE and turn off the Autoupdate feature with:

EXEC sp_autostats 'tablename', 'OFF'

If the NO_RECOMPUTE clause is omitted, Autoupdate statistics will be switched back on, and the next time that particular index passes the Recompilation Threshold, Autoupdate statistics will run with the default sample rate (which you have just concluded is not enough to provide the optimizer with accurate statistics).
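
Put together, the body of such a job could look like this sketch (30 percent stands in for whatever rate you arrived at; the names are placeholders):

UPDATE STATISTICS dbo.BigTable IX_BigTable_Col1
    WITH SAMPLE 30 PERCENT, NO_RECOMPUTE;

-- Belt and braces: also switch Autoupdate statistics off at the table level
EXEC sp_autostats 'dbo.BigTable', 'OFF';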

In my next post, I will show you how to diagnose issues with "skewed" data/statistics.

Published Sunday, March 01, 2009 11:51 PM by Elisabeth Redei

Attachment(s): CheckSampleRate.zip


Comments

 

jchang said:

why not come out and say it: SQL Server 7 and 2000 do not randomly sample rows, they sample the entire set of rows in given pages (which are probably randomly selected)

so if there is any correlation with page location, the sample is seriously deficient.

to account for the statblob/statstream storage format that allows a max of 200 steps, in certain cases I have created separate tables for generating identities in certain ranges, so large customers get one range of ids, medium another, and small a third. the identities from the special tables are then used in the actual Customer table. MS might help by formalizing this.

March 1, 2009 6:42 PM
 

Elisabeth Redei said:

It's the same in later versions. If you have a sample rate less than fullscan, SQL Server will actually try to find an access path that is NOT sorted on the first column of the statistics, in order to get more accurate statistics.

It's a compromise between getting sufficiently representative data and performance of the statistics update process in itself.

March 2, 2009 1:33 AM
 

Chris Wood said:

Elisabeth,

We set Auto Update Statistics Asynchronously to True. Does this change anything of your analysis?

Thanks

Chris

March 3, 2009 9:15 AM
 

Chris Wood said:

Elisabeth,

We have Auto Update Statistics Asynchronously set to TRUE. Can you comment on this please?

Thanks

Chris

March 3, 2009 10:04 AM
 

Elisabeth Redei said:

Hi Chris,

So that would actually address the 4th issue in the list, which is when the Autoupdate statistics process is intruding on your user transactions. If that setting is not on, any query that causes a recompilation (due to a statistics update) has to wait for the Autoupdate process to finish. With it on, that instance of the query will instead run with outdated statistics (which may or may not have a big impact).

I have never seen cases where this setting has had any negative side effects but others may be able to comment on that.
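
For reference, the setting Chris is describing is enabled per database like this (the database name is a placeholder):

ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;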

March 3, 2009 2:49 PM
 

Chris Wood said:

Thanks Elisabeth.

We do not have auto create stats set, so the automatic updating would only apply to tables/indexes that actually have stats. I have a few that return NULL from the stats_date function even though I am using a sysadmin account. What would make this show as NULL?

Chris

March 4, 2009 1:53 PM
 

Chris Wood said:

Elisabeth,

I used your query to look at our tables and indexes and saw some NULL dates. Looking in BOL, it does say that the stats_date function may not return valid data because of permissions, but I was running with an account with sysadmin authority, so it cannot be that. Any ideas?

Thanks

Chris

PS. I posted this question yesterday but it has not appeared. I had the same problem with my previous post.

March 5, 2009 9:58 AM
 

Chris Wood said:

Elisabeth,

What would cause the stats_date function to return NULL when I am running the query as a sysadmin? We have auto update stats but do not have auto create stats set.

Thanks

Chris

March 9, 2009 10:24 AM
 

Jeff Banschbach said:

Elisabeth,

The symptoms you listed match up almost exactly with what I'm seeing in my production environment.  It appears that running DBCC FREEPROCCACHE relieves the symptoms for a short period of time.  Can you confirm if running this would temporarily relieve the symptoms you listed above?

Thanks,

March 9, 2009 11:35 AM
 

Elisabeth Redei said:

Hi,

If stats_date returns NULL and assuming no permission issues (which there shouldn't be for sysadmin), it means there are no statistics on that particular column/index. If autocreate statistics is turned OFF, you need to explicitly create the statistics with sp_createstats or CREATE STATISTICS.
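
Both routes, sketched here (the table and column names in the second statement are hypothetical):

-- Create single-column statistics for all eligible columns in the database:
EXEC sp_createstats;

-- ...or create a specific statistics object explicitly:
CREATE STATISTICS st_Customer_LastName ON dbo.Customer (LastName) WITH FULLSCAN;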

I have yet to see issues with Autocreate statistics turned ON, although plenty where it has been OFF.

March 9, 2009 11:43 AM
 

Chris Wood said:

Thanks Elisabeth. I will try updating the ones with NULL values and see what the select shows me.

Chris

March 9, 2009 12:16 PM
 

Elisabeth Redei said:

Hi Jeff,

If DBCC FREEPROCCACHE *consistently* (and by consistently I mean that all issued queries perform better after you have flushed the cache) gives you better performance for all queries issued after the DBCC FREEPROCCACHE, I would be more inclined to think memory pressure. I have worked on issues where the procedure cache was flooded with ad hoc statements that were not being re-used, and we would run DBCC FREEPROCCACHE to work around it, but that was on SQL Server 2000. However, I kind of squirm at the thought of using it in a production environment, because keeping things in cache is conceptually a necessity for performance. Which version are you on?

A quick test could be to issue:

dbcc freesystemcache ('sql plans') -- to clear ad hoc plans

.. and see what happens. And then:

dbcc freesystemcache ('object plans')

The one thing that happens when you flush the procedure cache is obviously that all incoming queries have to be compiled again. If it were not for the fact that you get consistently better performance, I would say you have a problem with parameter sniffing.

There is a wealth of good information about this phenomenon in this blog post:

http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx

.. and Bart Duncan lists a few workarounds in this post:

http://blogs.msdn.com/bartd/archive/2006/07/27/680518.aspx

Statistics are part of the picture, because the problem stems from the fact that you might have a plan in cache that was compiled for an "atypical" value, or vice versa: you run a stored procedure and pass it an atypical value, but the plan in cache was created when it was run with a typical value.
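
As one example from Bart's list of workarounds, sketched here with hypothetical object names: the OPTIMIZE FOR hint (SQL Server 2005 and later) pins compilation to a typical value regardless of what is actually passed in:

CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId int
AS
BEGIN
    SELECT OrderId, OrderDate
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId
    -- Always compile the plan as if a typical customer id were passed in,
    -- instead of whatever value happens to be sniffed at first execution:
    OPTION (OPTIMIZE FOR (@CustomerId = 42));
END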

March 9, 2009 3:45 PM
 

Efelito said:

Elisabeth,

Thank you very much for the response. I am running SQL Server 2005 Enterprise x64 Edition SP3 + CU1. I too squirm when I think about having to run it on a scheduled basis, but right now it's the only way I can keep the system healthy. I will try switching to dropping the plans and see how it goes. I have also opened a ticket with Microsoft to get a little more help with the issue. We do run a pretty large number of ad-hoc queries that are not re-used. Do you know of any options on the connection or command objects which would tell SQL Server not to bother caching a query plan?

Thanks again for the info and the links.

Jeff Banschbach

March 9, 2009 6:15 PM
 

SeanElliott said:

Hi Elisabeth. Your blog was very timely. I needed to update statistics for tables that had changed enough to upset performance but not enough to make auto stats update the statistics. I suspect the indexes are poor too, but that is another story. Anyway, I took some of your ideas, improved them and made them work for SQL 2000. Perhaps it would have been better to just run sp_updatestats, but not as educational. The code below is "self documenting" and seems to work in general. I did get a divide by zero for one database, but it worked in about 30 others OK. You have to run it twice, because when the clustered indexes have rowmodctr reset by update statistics it makes other indexes meet the criteria.

/***************************************************************************************************
Description:   SQL Server 2000 script to identify indexes that might need a manual update stats
               and execute it. This is to avoid the situation where auto stats fails to update,
               resulting in performance degradation.
               This script relies on sysindexes.rowcnt having the correct value, which is not
               generally true unless you routinely run:
                  dbcc updateusage('database name') with count_rows
               Inspired by:
http://sqlblog.com/blogs/elisabeth_redei/archive/2009/03/01/lies-damned-lies-and-statistics-part-i.aspx

Change History:
Ver Date         Developer        Change
1.0 19-Mar-2009  Sean Elliott     Created
***************************************************************************************************/
set nocount on

declare @nrc_mask int
declare @Row int
declare @MaxRow int
declare @sql varchar(255)

create table #all_indexes
(
   MyId         int   identity,
   OwnerName    sysname,
   TableName    sysname,
   IndexName    sysname
)

insert #all_indexes (OwnerName, TableName, IndexName)
select sus.name, sob.name, sid.name
from sysindexes sid
join sysobjects sob on sob.id = sid.id
join sysusers sus on sus.uid = sob.uid
where sid.id > 100
and   sid.indid > 0
and   sid.indid < 255
and   (sid.status & 64) = 0   -- filters autostats for columns i.e. _WA_xxxxx stats - see sp_helpindex
order by sob.name, sid.name

set @MaxRow = (select count(*) from #all_indexes)
set @Row = 1

-- Run dbcc updateusage for each index separately to minimize blocking
while @Row <= @MaxRow
begin
   set @sql = (select 'dbcc updateusage(0, [' + OwnerName + '.' + TableName + '], [' + IndexName + ']) with count_rows' from #all_indexes where MyId = @Row)
   print @sql
   exec(@sql)
   set @Row = @Row + 1
end

drop table #all_indexes

-- see code for sp_autostats
set @nrc_mask = 16777216

create table #candidates
(
   MyId              int identity,
   OwnerName         sysname,
   TableName         sysname,
   IndexName         sysname,
   StatsDate         datetime,
   AutoUpdateStats   char(3),
   RealRowCount      bigint,
   RealRowModCtr     int,
   ChangedRowsRatio  float
)

insert #candidates
(
   OwnerName, TableName, IndexName, StatsDate, AutoUpdateStats,
   RealRowCount, RealRowModCtr, ChangedRowsRatio
)
select sus.name as OwnerName,
       sob.name as tblName,
       sid.name as IdxName,
       stats_date(sid.id, sid.indid) as StatsDate,
       case (sid.status & @nrc_mask)
          when @nrc_mask then 'OFF'
          else 'ON'
       end as AutoUpdate,
       case
          when sid.indid <= 1 then sid.rowcnt
          else (select mix.rowcnt from sysindexes mix where mix.id = sid.id and mix.indid <= 1)
       end as real_rowcnt,
       case
          when sid.indid <= 1 then sid.rowmodctr
          -- Real rowmodctr for non clustered index is sum of that for clustered plus non clustered
          else sid.rowmodctr + (select mix.rowmodctr from sysindexes mix where mix.id = sid.id and mix.indid <= 1)
       end as real_rowmodctr,
       case
          when sid.indid <= 1 then cast(sid.rowmodctr as float) / cast(sid.rowcnt as float)
          else cast((select mix.rowmodctr from sysindexes mix where mix.id = sid.id and mix.indid <= 1) + sid.rowmodctr as float) /
               cast((select mix.rowcnt from sysindexes mix where mix.id = sid.id and mix.indid <= 1) as float)
       end as ChangedRowsRatio
from sysindexes sid
join sysobjects sob on sob.id = sid.id
join sysusers sus on sus.uid = sob.uid
where sid.id > 100
and   sid.indid > 0
and   sid.indid < 255
-- and sid.rowcnt >= 500   -- different algorithm used when <500 rows in table but "so what"
and   sid.rowcnt > 0       -- avoid divide by zero but also filters autostats for columns i.e. _WA_xxxxx stats
-- 5% - tune as required
and   cast(sid.rowmodctr as float) / cast(sid.rowcnt as float) > 0.05
order by tblName, IdxName

select * from #candidates where AutoUpdateStats = 'OFF'

if @@rowcount > 0
begin
   print '************************************************************'
   print '***** WARNING ***** Some tables have auto stats switched off'
   print '************************************************************'
end

select * from #candidates

select @MaxRow = count(*) from #candidates

if @MaxRow > 0
begin
   set @Row = 1
   while @Row <= @MaxRow
   begin
      set @sql = (select 'update statistics [' + OwnerName + '].[' + TableName + '] [' + IndexName + ']' from #candidates where MyId = @Row)
      print @sql
      exec(@sql)
      set @Row = @Row + 1
   end
end

drop table #candidates

March 30, 2009 9:45 AM
 

Elisabeth Redei said:

Lies damned lies and statistics - Part II In previous post I listed situations where your statistics

August 10, 2009 8:44 AM
 

Daniel Adeniji said:

Very nice, well covered topic.

Brings out the best of what science & blogging is about.

Upon returning to the article, I found that duplicate entries were being returned. Corrected by amending the join clause:

SELECT
      ssi.id
    , object_name(ssi.id) AS objectName
    , ssi.name AS indexName  -- 2012-01-05 added by Daniel Adeniji
    , stats_date(ssi.id, ssi.indid) AS StatsDate
    , ssi.indid
    , ssi.rowcnt
    , ssi.rowmodctr
    , cast(ssi.rowmodctr AS decimal) / cast(ssi.rowcnt AS decimal) AS ChangedRowsRatio
    , ss.no_recompute AS IsAutoUpdateOff
FROM sys.sysindexes ssi
LEFT JOIN sys.stats ss
    ON  ssi.id = ss.object_id
    AND ssi.name = ss.name
WHERE ssi.id > 100
AND   indid > 0
AND   ssi.rowcnt > 500
AND   ssi.rowcnt > 0   -- 2012-01-05 added by Daniel Adeniji
AND   (ssi.rowmodctr/ssi.rowcnt) > 0.15 -- enter a relevant number
ORDER BY stats_date(ssi.id, ssi.indid) ASC

January 5, 2012 2:05 PM
 

mordechai danielov said:

Hi Elizabeth,

tried running your script and found that #dbccStat table was missing two columns:

, FilterExp nvarchar(200)
, UnFilteredRows int

were they recently added to the output of DBCC SHOW_STATISTICS?

June 26, 2012 4:06 AM
