Transact-SQL Extensions to group by and having

Transact-SQL extensions to standard SQL let you display data more flexibly, by allowing references to columns and expressions that are not used for creating groups or summary calculations:

A select list that includes aggregates can include extended columns that are not arguments of aggregate functions and are not included in the group by clause. An extended column affects the display of final results, since additional rows are displayed.
The group by clause can include columns or expressions that are not in the select list.
The group by all clause displays all groups, even those excluded from calculations by a where clause. See the example for the keyword all in the “Parameters” section.
The having clause can include columns or expressions that are not in the select list and not in the group by clause.

When the Transact-SQL extensions add rows and columns to a display, or if group by is omitted, query results may be hard to interpret. The examples that follow can help you understand how Transact-SQL extensions can affect query results.

The following examples illustrate the differences between queries that use standard group by and having clauses and queries that use the Transact-SQL extensions:

An example of a standard grouping query:

select type, avg (price) 
from titles 
group by type

type                
----------------------  ---------- 
UNDECIDED                 NULL 
business                 13.73 
mod_cook                 11.49 
popular_comp             21.48 
psychology               13.50 
trad_cook                15.96 

 (6 rows affected)

The Transact-SQL extended column, price (in the select list, but not an aggregate and not in the group by clause), causes all qualified rows to display in each qualified group, even though a standard group by clause produces a single row per group. The group by still affects the vector aggregate, which computes the average price per group displayed on each row of each group (they are the same values that were computed for example a):

select type, price, avg (price) 
from titles 
group by type

type           price
------------ ---------------- --------------
business       19.99            13.73
business       11.95            13.73
business        2.99            13.73
business       19.99            13.73
mod_cook       19.99            11.49
mod_cook        2.99            11.49
UNDECIDED       NULL             NULL
popular_comp   22.95            21.48
popular_comp   20.00            21.48
popular_comp    NULL            21.48
psychology     21.59            13.50
psychology     10.95            13.50
psychology      7.00            13.50
psychology     19.99            13.50
psychology      7.99            13.50
trad_cook      20.95            15.96
trad_cook      11.95            15.96
trad_cook      14.99            15.96

 (18 rows affected)

The way Transact-SQL extended columns are handled can make it look as if a query is ignoring a where clause. This query computes the average prices using only those rows that satisfy the where clause, but it also displays rows that do not match the where clause.

The SAP ASE server first builds a worktable containing only the type and aggregate values using the where clause. This worktable is joined back to the titles table in the grouping column type to include the price column in the results, but the where clause is not used in the join.

The only row in titles that is not in the results is the lone row with type = “UNDECIDED” and a NULL price, that is, a row for which there were no results in the worktable. If you also want to eliminate the rows from the displayed results that have prices of less than $10.00, you must add a having clause that repeats the where clause, as shown in Example 4:

select type, price, avg (price) 
from titles 
where price > 10.00 
group by type

type           price
------------ ---------------- --------------
business        19.99            17.31
business        11.95            17.31
business         2.99            17.31
business        19.99            17.31
mod_cook        19.99            19.99
mod_cook         2.99            19.99
popular_comp    22.95            21.48
popular_comp    20.00            21.48
popular_comp     NULL            21.48
psychology      21.59            17.51
psychology      10.95            17.51
psychology       7.00            17.51
psychology      19.99            17.51
psychology       7.99            17.51
trad_cook       20.95            15.96
trad_cook       11.95            15.96
trad_cook       14.99            15.96

 (17 rows affected)

If you are specifying additional conditions, such as aggregates, in the having clause, also include all conditions specified in the where clause. The SAP ASE server appears to ignore any where clause conditions that are missing from the having clause:

select type, price, avg (price)
from titles
where price > 10.00
group by type
having price > 10.00

type           price
----------- ----------------  --------------
business       19.99           17.31
business       11.95           17.31
business       19.99           17.31
mod_cook       19.99           19.99
popular_comp   22.95           21.48
popular_comp   20.00           21.48
psychology     21.59           17.51
psychology     10.95           17.51
psychology     19.99           17.51
trad_cook      20.95           15.96
trad_cook      11.95           15.96
trad_cook      14.99           15.96 

 (12 rows affected)

This is an example of a standard grouping query using a join between two tables. It groups by pub_id, then by type within each publisher ID, to calculate the vector aggregate for each row:
```
select p.pub_id, t.type, sum (t.total_sales)
from publishers p, titles t
where p.pub_id = t.pub_id
group by p.pub_id, t.type
```
```
pub_id  type
------  ------------  --------
0736    business       18722
0736    psychology      9564
0877    UNDECIDED       NULL
0877    mod_cook       24278
0877    psychology       375
0877    trad_cook      19566
1389    business       12066
1389    popular_comp   12875

 (8 rows affected)
```
It may seem that it is only necessary to specify group by for the pub_id and type columns to produce the results, and add extended columns as follows:
```
select p.pub_id, p.pub_name, t.type,
    sum (t.total_sales)
from publishers p, titles t
where p.pub_id = t.pub_id
group by p.pub_id, t.type
```
However, the results for the above query are much different from the results for the first query in this example. After joining the two tables to determine the vector aggregate in a worktable, the SAP ASE server joins the worktable to the table (publishers) of the extended column for the final results. Each extended column from a different table invokes an additional join.

As you can see, using the extended column extension in queries that join tables can easily produce results that are difficult to comprehend. In most cases, use the standard group by to join tables in your queries.
This example uses the Transact-SQL extension to group by to include columns that are not in the select list. Both the pub_id and type columns are used to group the results for the vector aggregate. However, the final results do not include the type within each publisher. In this case, you may only want to know how many distinct title types are sold for each publisher:
```
select p.pub_id, sum (t.total_sales)
from publishers p, titles t
where p.pub_id = t.pub_id
group by p.pub_id, t.type
```
```
pub_id
------  --------
0736       18722
0736        9564
0877        NULL
0877        24278
0877          375
0877        19566
1389        12066
1389        12875

 (8 rows affected)
```
This example combines two Transact-SQL extension effects. First, it omits the group byclause while including an aggregate in the select list. Second, it includes an extended column. By omitting the group by clause:
- The table becomes a single group. The scalar aggregate counts three qualified rows.
- pub_id becomes a Transact-SQL extended column because it does not appear in a group by clause. No having clause is present, so all rows in the group are qualified to be displayed.
```
select pub_id, count (pub_id) 
from publishers
```
```
pub_id 
---------- --------- 
0736               3 
0877               3 
1389               3 

 (3 rows affected)
```

The where clause excludes publishers with a pub_id of 1000 or more from the single group, so the scalar aggregate counts two qualified rows. The extended column pub_id displays all qualified rows from the publishers table:

select pub_id, count (pub_id) 
from publishers 
where pub_id < "1000"

pub_id 
-------------- ----------- 
0736                    2 
0877                    2 
1389                    2 

 (3 rows affected)

This example illustrates an effect of a having clause used without a group by clause.
- The table is considered a single group. No where clause excludes rows, so all the rows in the group (table) are qualified to be counted.
- The rows in this single-group table are tested by the having clause.
- These combined effects display the two qualified rows.
```
select pub_id, count (pub_id) 
from publishers 
having pub_id < "1000"
```
```
pub_id 
-------------- --------- 
0736                   3 
0877                   3 
 (2 rows affected)
```
This example uses the extension to having that allows columns or expressions not in the select list and not in the group by clause. It determines the average price for each title type, but it excludes those types that do not have more than $10,000 in total sales, even though the sum aggregate does not appear in the results:
```
select type, avg (price)
from titles
group by type
having sum (total_sales) > 10000
```
```
type
------------  ----------
business           13.73
mod_cook           11.49
popular_comp       21.48
trad_cook          15.96

 (4 rows affected)
```