update all statistics

Description

Updates all statistics information for a given table, including histograms on all columns, regardless of whether they are indexed. You can run update all statistics on a single data partition.

Syntax

update all statistics
	table_name [partition data_partition_name]
	[using step values]
	[with consumers = consumers] 
	[, sampling=N [percent]]
	[, no_hashing | partial_hashing | hashing] 
		[, max_resource_granularity = N [percent]]
		[, histogram_tuning_factor = int ]
		[, print_progress = int]

Parameters

table_name: is the name of the table for which statistics are being updated.

data_partition_name: is the name of the partition to be updated. Statistics for each local index partition on the data partition is updated. Does not update statistics for global indexes.

using step values: specifies the number of histogram steps. The default value is 20, for columns where no statistics exist. To change the default, use sp_configure to modify the number of histogram steps parameter. If statistics for a column already exist in sysstatistics, the default value is the current number of steps.

The steps are applied to each partition of a partitioned table—for example, update index statistics uses the default value of 20 steps for each data and index partition involved in the scan for updating statistics. If global statistics are generated through an index scan of a global index, then 20 steps are applied by default. If partition statistics are generated, either through a data scan or local index scan, then by default, 20 steps are applied for each partition.

If the histogram steps specified through using step values is M, and the histogram_tuning_factor parameter is N, then update index statistics uses between 0 and M*N steps, depending on the number of frequency cells that update index statistics isolates and whether any range cells exist.

with consumers = consumers: specifies the number of consumer processes to be used for a sort when column_list is provided and parallel query processing is enabled. The consumers option specifies the degree of parallelism applied to a sort performed for statistics update on a single data partition. For example, if update statistics with a column list is applied to a table with three data partitions, data from each of the three partitions is sorted separately and the consumers option is applied during each of the sort. The three sorts themselves are not performed in parallel.

The value for the max parallel degree configuration parameter must be greater than the value for with consumers. For example, if with consumers is set to 2, then max parallel degree must be at least 3.

with sampling = N percent: specifies the percentage of the column to be randomly sampled to gather statistics. The value for N is any number between 1 and 100.

[no_hashing | partial_hashing | hashing: indicates the level of hash-based statistics update all statistics gathers. One of:

no_hashing – updates all statistics uses the algorithm from Adaptive Server versions earlier than 15.7, gathering sort-based statistics.

partial_hashing – updates all statistics uses the algorithm for columns with fewer than 65536 unique values. If updates all statistics encounters unique column counts greater or equal to the 65536 threshold, it uses an extra scan with a sort.

hashing – updates all statistics uses low-domain and high-domain hashing to create the histograms.

The default for these parameters is the configured value for update statistics hashing.

max_resource_granularity = N percent: limits the amount of tempdb buffer cache used with the update all statistics and hashing.

histogram_tuning_factor = integer: determines update all statistics’s distribution granularity.

print_progress = int: Determines if update all statistics displays progress messages.

0 – (the default) command does not display any progress messages

1 – command displays progress messages

Examples

Example 1

Updates all statistics for the salesdetail table:

update all statistics salesdetail

Example 2

Updates all statistics for the smallsales partition on salesdetail table:

update all statistics salesdetail partition smallsales

Example 3

Updates hash-based statistics for the authors table:

update all statistics authors with hashing

Usage

update all statistics updates all the statistics information for a given table. Adaptive Server keeps statistics about the distribution of pages within a table, and uses these statistics when considering whether or not to use a parallel scan in query processing on partitioned tables, and which index (es) to use in query processing. The optimization of your queries depends on the accuracy of the stored statistics.
Histogram statistics are created on each column, either through an index scan of a leading column, a projection of the column into a work table followed by a sort, or by using hashing to concurrently generate histograms on several columns for respective scans.
Density statistics are created for all the prefix subsets of the columns of index (es) whose statistics is being updated. For example, if an index is on columns c1, c2 and c3, then the prefix subsets are (c1,c2) and (c1, c2, c3).
When you run update all statistics on a single data partition, histograms are generated for each leading column of the local indexes using an index scan. If hash-based statistics gathering is enabled, then statistics are gathered on all the minor attributes as well during the index scans. For all other columns, including leading columns of a global index, update all statistics performs either a data scan followed by a sort or performs a data scan that uses hash-based statistics gathering.

The advantage of hash-based statistics is that one data scan can collect histograms on all columns, whereas sort-based statistics uses a separate scan for each column.
update statistics commands create partition-specific statistics. Global statistics are implicitly created during partition statistics creation. The partition statistics serve as input to global statistics creation and enable per-partition DDL operations. Global statistics are used by the optimizer.
update all statistics regenerates and update the table statistics stored in systabstats for each data and index partition of the table. If the update all statistics command is run for a specific data partition, the table statistics are generated and updated only for that data partition and any local index partitions. Global indexes are skipped.

Using hash-based statistics

The partial_hashing parameter uses hashing on columns only when it produces histograms that are as accurate and of equivalent quality as those produced by sorting (low domain cases), otherwise, the partial_hashing parameter uses sorting for high-domain cases.
Hashing may produce less accurate histograms than sorting in some cases (high-domain cases).
Although hash-based statistics do not require the tempdb disk space or procedure cache memory used by sorting, it may use a significant amount of tempdb buffer cache memory.
Large sorts used for the no_hashing parameter may scavenge statements in the statement cache and stored procedures to release procedure cache memory to support the sort.
max_resource_granularity limits the amount of tempdb buffer cache memory used for hashing or partial_hashing. It does not affect the amount of memory used by the no_hashing parameter or sorting.
If you include the partial_hashing parameter, and a previous histogram on a column exists that indicates a high-domain column, then Adaptive Server assumes that sort-based statistics are required on this column. If no previous histogram exists on a column, then Adaptive Server assumes this column is low-domain until the high domain limit is reached.

Standards

ANSI SQL – Compliance level: Transact-SQL extension.

Permissions

You must be the table owner or a user with update statistics permission on the table to run update all statistics.