datachange function

The datachange function measures the amount of change in the data distribution since update statistics last ran. Specifically, it measures the number of inserts, updates, and deletes that have occurred on the given object, partition, or column, and helps you determine if running update statistics would benefit the query plan.

The syntax for datachange is:

select datachange(object_name, partition_name, colname)

Where:

These parameters are all required.

datachange is expressed as a percentage of the total number of rows in the table or partition (if the partition is specified). The percentage value can be greater than 100 percent because the number of changes to an object can be much greater than the number of rows in the table, particularly when the number of deletes and updates to a table is very high.

The following set of examples illustrate the various uses for the datachange function. The examples use the following:

Passing a valid object, partition, and column name

The value reported when you include the object, partition, and column name is determined by the datachange value for the specified column in the specified partition divided by the number of rows in the partition. The result is expressed as a percentage:

datachange = 100 * (data change value for column C/ rowcount (P))

Using null partition names

If you include a null partition name, the datachange value is determined by the sum of the datachange value for the column across all partitions divided by the number of rows in the table. The result is expressed as a percentage:

datachange = 100 * (Sum(data change value for (O, P(1-N) , C))/rowcount(O)

Where P(1-N) indicates that the value is summed over all partitions.

Using null column names

If you include null column names, the value reported by datachange is determined by the maximum value of the datachange for all columns that have histograms for the specified partition divided by the number of rows in the partition. The result is expressed as a percentage:

datachange = 100 * (Max(data change value for (O, P, Ci))/rowcount(P)

Where the value of i varies through the columns with histograms (for example, formatid 102 in sysstatistics).

Null partition and column names

If you include null partition and column names, the value of datachange is determined by the maximum value of the datachange for all columns that have histograms summed across all partitions divided by the number of rows in the table. The result is expressed as a percentage:

datachange = 100 * ( Max(data change value for (O, NULL, Ci))/rowcount(O)

Where i is 1 through the total number of columns with histograms (for example, formatid 102 in sysstatistics).

This illustrates datachange gathering statistics:

create table matrix(col1 int, col2 int)
go
insert into matrix values (234, 560)
go
update statistics matrix(col1)
go
insert into matrix values(34,56)
go
select datachange ("matrix", NULL, NULL)
go

------
50.000000

The number of rows in matrix is two. The amount of data that has changed since the last update statistics command is 1, so the datachange percentage is 100 * 1/2 = 50 percent.

datachange counters are all maintained in memory. These counters are periodically flushed to disk by the housekeeper or when you run sp_flushstats.