group by and having clauses

Description

Used in select statements to divide a table into groups and to return only groups that match conditions in the having clause. group by is typically used in conjunction with aggregates to specify how to group the unaggregated columns of a select query. having clauses are applied to these groups.

Syntax

Start of select statement
[group by [all] aggregate_free_expression
	[, aggregate_free_expression]...]
[having search_conditions]
End of select statement

Parameters

group by

specifies the groups into which the table is divided, and if aggregate functions are included in the select list, finds a summary value for each group. These summary values appear as columns in the results, one for each group. You can refer to these summary columns in the having clause.

You can use the avg, count, count_big, max, min, and sum aggregate functions in the select list before group by (the expression is usually a column name). For more information, see “Aggregate functions” in Chapter 2, “Transact-SQL Functions” of Reference Manual: Building Blocks.

A table can be grouped by any combination of columns—that is, groups can be nested within each other, as in Example 2.

all

is a Transact-SQL extension that includes all groups in the results, even those excluded by a where clause. For example:

select type, avg (price) 
from titles 
where advance > 7000 
group by all type

type 
-----------------  ---------- 
UNDECIDED                NULL
business                 2.99
mod_cook                 2.99
popular_comp            20.00
psychology               NULL
trad_cook               14.99

 (6 rows affected)

“NULL” in the aggregate column indicates groups that would be excluded by the where clause. A having clause negates the meaning of all.

aggregate_free_expression

is an expression that includes no aggregates. A Transact-SQL extension allows grouping by an aggregate-free expression as well as by a column name.

You cannot group by column heading or alias. This example is correct:

select Price=avg (price), Pay=avg (advance), 
Total=price * $1.15 
from titles 
group by price * $1.15
having

sets conditions for the group by clause, similar to the way in which where sets conditions for the select clause.

having search conditions can include aggregate expressions; otherwise, having search conditions are identical to where search conditions. This is an example of a having clause with aggregates:

select pub_id, total = sum (total_sales) 
from titles 
where total_sales is not null 
group by pub_id 
having count (*)>5

When Adaptive Server optimizes queries, it evaluates the search conditions in where and having clauses, and determines which conditions are search arguments (SARGs) that can be used to choose the best indexes and query plan. All of the search conditions are used to qualify the rows. For more information on search arguments, see the Performance and Tuning Guide: Optimizer and Abstract Plans.

Examples

Example 1

Calculates the average advance and the sum of the sales for each type of book:

select type, avg (advance), sum (total_sales) 
from titles 
group by type

Example 2

Groups the results by type, then by pub_id within each type:

select type, pub_id, avg (advance), sum (total_sales) 
from titles 
group by type, pub_id

Example 3

Calculates results for all groups, but displays only groups whose type begins with “p”:

select type, avg (price) 
from titles 
group by type 
having type like 'p%'

Example 4

Calculates results for all groups, but displays results for groups matching the multiple conditions in the having clause:

select pub_id, sum (advance), avg (price) 
from titles 
group by pub_id 
having sum (advance) > $15000 
and avg (price) < $10 
and pub_id > "0700"

Example 5

Calculates the total sales for each group (publisher) after joining the titles and publishers tables:

select p.pub_id, sum (t.total_sales)
from publishers p, titles t
where p.pub_id = t.pub_id
group by p.pub_id

Example 6

Displays the titles that have an advance of more than $1000 and a price that is more than the average price of all titles:

select title_id, advance, price
from titles
where advance > 1000
having price > avg (price)

Usage


How group by works with the optimizer

In Adaptive Server version 15.0, there are two possible algorithms (implemened as operators) for doing group by: GroupHashing and GroupSorted. The optimizer chooses which operator to use based on factors such as the requirements these operators place on the input data streams.

The GroupSorted operator requires that the input rows to be aggregated are already sorted on the group by columns. Since the input rows must be sorted, the optimizer uses either of the following:

The optimizer uses the GroupHashing operator if ordering on the group by columns is not available or the row size limitations of the GroupSorted operator are exceeded. The GroupHashing operator applies a hashing function to the values of the group by columns to be able to put rows with the same group by column values into the same hash bucket. Once the input rows have all been hashed into buckets, the rows in the buckets are aggregated to generate the group by results. The only limitation of the GroupHashing operator is that the total row size of group by columns and aggregate results cannot be larger than 64K. There is no limitation on the number of group by columns or the number of aggregation operations, just the total row width.


How group by and having queries with aggregates work


Standard group by and having queries


Transact-SQL extensions to group by and having


group by and having and sort orders

Standards

ANSI SQL – Compliance level: Entry-level compliant.

The use of columns within the select list that are not in the group by list and have no aggregate functions is a Transact-SQL extension.

The use of the all keyword is a Transact-SQL extension.

See also

Commands compute clause, declare, select, where clause

Functions Aggregate functions