How queries with GROUP BY are executed

This section uses the ROLLUP sub-clause of the GROUP BY clause in the explanation and example. For more information on the ROLLUP clause, see Using ROLLUP and CUBE.

Consider a single-table query of the following form:

SELECT select-list
   FROM table
   WHERE where-search-condition
   GROUP BY [ group-by-expression | ROLLUP (group-by-expression) ]
   HAVING having-search-condition

This query is executed in the following manner:

  1. Apply the WHERE clause

    This generates an intermediate result that contains only some of the rows of the table.

    The intermediate result contains a subset of rows from the table.
  2. Partition the result into groups

    This action generates an intermediate result with one row for each group as dictated by the GROUP BY clause. Each generated row contains the group-by-expression for each group, and the computed aggregate functions in the select-list and having-search-condition.

    A second intermediate result is made smaller by the application of the GROUP BY clause.
  3. Apply any ROLLUP operation

    Subtotal rows computed as part of a ROLLUP operation are added to the result set.

  4. Apply the HAVING clause

    Any rows from this second intermediate result that do not meet the criteria of the HAVING clause are removed at this point.

  5. Project out the results to display

    This action takes from step 3 only those columns that need to be displayed in the result set of the query—that is, it takes only those columns corresponding to the expressions from the select-list.

    The final result set is a projection of the second intermediate result set.

This process makes requirements on queries with a GROUP BY clause:

  • The WHERE clause is evaluated first. Therefore, any aggregate functions are evaluated only over those rows that satisfy the WHERE clause.
  • The final result set is built from the second intermediate result, which holds the partitioned rows. The second intermediate result holds rows corresponding to the group-by-expression. Therefore, if an expression that is not an aggregate function appears in the select-list, then it must also appear in the group-by-expression. No function evaluation can be performed during the projection step.
  • An expression can be included in the group-by-expression but not in the select-list. It is projected out in the result.