Improving subtotal calculation

If you have data that varies across dimensions such as date or place, you may need to determine how the data varies in each dimension. You can use the ROLLUP and CUBE operators to create multiple levels of subtotals and a grand total from a list of references to grouping columns. The subtotals “roll up” from the most detailed level to the grand total. For example, if you are analyzing sales data, you can compute an overall average and the average sales by year using the same query.

Using ROLLUP

To select total car sales by year, model and color:

SELECT Name, Size, Color, Sum(Quantity) 
FROM Products
GROUP BY ROLLUP (Name, Size, Color);

Name

Size

Color

sum(Products.Quantity)

Baseball Cap

One size fits all

Black

112

Baseball Cap

One size fits all

White

12

Baseball Cap

One size fits all

(NULL)

124

Baseball Cap

(NULL)

(NULL)

124

Shorts

Medium

Black

80

Shorts

Medium

(NULL)

80

Shorts

(NULL)

(NULL)

80

Sweatshirt

Large

Blue

32

Sweatshirt

Large

Green

39

Sweatshirt

Large

(NULL)

71

Sweatshirt

(NULL)

(NULL)

71

Tee Shirt

Medium

Orange

54

Tee Shirt

Medium

(NULL)

54

Tee Shirt

One size fits all

Black

75

Tee Shirt

One size fits all

(NULL)

75

Tee Shirt

Small

White

28

Tee Shirt

Small

(NULL)

28

Tee Shirt

(NULL)

(NULL)

157

Visor

One size fits all

Black

28

Visor

One size fits all

White

36

Visor

One size fits all

(NULL)

64

Visor

(NULL)

(NULL)

64

(NULL)

(NULL)

(NULL)

496

When processing this query, Sybase IQ first groups the data by all three specified grouping expressions (year, model, color), then for all grouping expressions except the last one (color). In the fifth row, NULL indicates the ROLLUP value for the color column, in other words, the total number of sales of that model in all colors. 343 represents the total sales of all models and colors in 1990 and 314 is the total for 1991. The last row represents total sales on all years, all models and all colors.

ROLLUP requires an ordered list of grouping expressions as arguments. When listing groups that contain other groups, list the larger group first (such as state before city.)

You can use ROLLUP with the aggregate functions: AVG, COUNT, MAX, MIN, STDDEV, SUM, and VARIANCE. ROLLUP does not support COUNT DISTINCT and SUM DISTINCT, however.

Using CUBE

The following query uses data from the Employees table, including the state (geographic location), gender, education level, and income of people. You can use the CUBE extension of the GROUP BY clause, if you want to compute the average income in the entire census of state, gender, and education and compute the average income in all possible combinations of the columns state, gender, and education, while making only a single pass through the census data in the table census. For example, use the CUBE operator if you want to compute the average income of all females in all states, or compute the average income of all people in the census according to their education and geographic location.

When CUBE calculates a group, CUBE puts a NULL value in the columns whose group is calculated. The distinction is difficult between the type of group each row represents and whether the NULL is a NULL stored in the database or a NULL resulting from CUBE. The GROUPING function solves this problem by returning 1, if the designated column has been merged to a higher level group.

The following query illustrates the use of the GROUPING function with GROUP BY CUBE.

SELECT case grouping(state) WHEN 1 THEN 'ALL' ELSE state
   END AS c_state, case grouping(sex) WHEN 1 THEN 'ALL'
   ELSE sex end AS c_gender, case grouping(DepartmentId)
   WHEN 1 THEN 'ALL' ELSE cast(DepartmentId as char(4)) end
   AS c_dept, COUNT(*), CAST(ROUND(AVG(salary),2) AS
   NUMERIC(18,2))AS AVERAGE 
FROM employees WHERE state IN ('MA' , 'CA') 
GROUP BY CUBE(state, sex, DepartmentId)
ORDER BY 1,2,3;

The results of this query are shown below. Note that the NULLs generated by CUBE to indicate a subtotal row are replaced with ALL in the subtotal rows, as specified in the query.

c_state

c_gender

c_dept

COUNT()

AVERAGE

ALL

ALL

200

3

52,200.00

ALL

ALL

ALL

3

52,200.00

ALL

F

200

2

58,650.00

ALL

F

ALL

2

58,650.00

ALL

M

200

1

39,300.00

ALL

M

ALL

1

39,300.00

CA

ALL

200

3

52,200.00

CA

ALL

ALL

3

52,200.00

CA

F

200

2

58,650.00

CA

F

ALL

2

58,650.00

CA

M

200

1

39,300.00

CA

M

ALL

1

39,300.00

Data warehouse administrators find ROLLUP and CUBE particularly useful for operations like:

ROLLUP and CUBE allow you to use one query to compute data using multiple levels of grouping, instead of a separate query for each level.

For more information on the ROLLUP and CUBE operators, see the SELECT statement in “SQL Statements” in Reference: Statements and Options.