A question we are often asked by both our Virtual-DBA and Consulting customers is: Why do I have to update statistics if auto-update statistics is turned on? The short answer is that your data changes, and even if you have auto-update statistics turned on the change in your data can result in a bad execution plan before the threshold for updating statistics is reached.
When does auto-update do its work? In SQL 2005 we could look at the value of rowmodctr in sysindexes to see how many modifications had been made to an object. The following is a somewhat over-simplified example of when statistics auto-update
If a tables has less than 6 records then statistics auto-update after 6 modifications
If the number of records is > 6 and <=500 then statistics auto-update after every 500 modifications
If the number of records >500 then statistics auto-update after ( 500 + 20%) modifications
In Sql 2000 the 20% refers to number of records and in SQL 2005 and greater it refers to 20% of the data changing
In SQL 2008 and SQL 2012 things get a bit more complicated. The modifications are tracked in a hidden system table sys.sysrscols via the rcmodified field and also in the non-hidden system view sys.system_internals_partition_columns in the modified_count field.
What are statistics
Statistics are information about the distribution of data in an object. The primary parts of a statistic are:
The Histogram – This essentially divides the data into as many as 200 steps. It is a frequency analysis of the data in the object.
Density – The density is 1/(number of distinct value) and is a measure of the number of duplicate values.
Header – The header contains general information.
String Statistics – This is information stored in a B-Tree like structure that can substantially improve the performance of queries that use operators such as ‘LIKE’
How are statistics used
Statistics allow the SQL Server query optimizer to make the best decision possible on how to execute a query. The query optimizer is cost-based and statistics are used to determine the costs of a step in an execution plan. When auto-create statistics is enabled the Query can also create a new statistic if it requires one.
Depending on the composition of your data and the way it changes you may have incorrect information in your statistics before auto-update statistics can fix things. Manually updating statistics can help improve performance in these cases.
We do not recommend that you use the maintenance plan step to update statistics – it is a ‘brute-force’ approach that doesn’t check on whether stats have recently been updated before updating. Updating statistics can be a very IO-intensive process. We also recommend you choose the sample size carefully when updating statistics. The sample size determines how much of the data is examined to update the statistics. You’ll want to find a balance between improved performance and excessive system IO. We suggest you use the stored procedure sp_updatestats in place of the maintenance plan step since it is selective about what statistics get updated. This proc only updates statistics where an update is required.