GROUP BY and the SQL/2008 standard

The SQL/2008 standard is considerably more restrictive in its syntax than what is supported by SQL Anywhere. In the SQL/2008 standard, GROUP BY requires the following:

  • Each group-by-term specified in a GROUP BY clause must be a column reference: that is, a reference to a column from a table referenced in the query FROM clause. These expressions are termed grouping columns.

  • An expression in a SELECT list, HAVING clause, or ORDER BY clause that is not an aggregate function must be a grouping column, or only reference grouping columns. However, if optional SQL/2008 language feature T301, "Functional dependencies" is supported, then such a reference can refer to columns from the query FROM clause that are functionally determined by grouping columns.

In a GROUP BY clause in SQL Anywhere, group-by-term can be an arbitrary expression involving column references, literal constants, variables or host variables, and scalar and user-defined functions. For example, this query partitions the Employee table into three groups based on the Salary column, producing one row per group:

SELECT COUNT() FROM Employees 
   GROUP BY (
      IF SALARY < 25000 
         THEN 'low range' 
      ELSE IF Salary < 50000 
         THEN 'mid range' 
      ELSE 'high range' 
         ENDIF 
      ENDIF);

To include the partitioning value in the query result, you must add a group-by-term to the query SELECT list. To be syntactically valid, SQL Anywhere ensures that the syntax of the SELECT list item and group-by-term are identical. However, syntactically large SQL constructions may fail this analysis; moreover, expressions involving subqueries never compare equal.

In the example below, SQL Anywhere detects that the two IF expressions are identical, and computes the result without error:

SELECT (IF SALARY < 25000 THEN 'low range' ELSE IF Salary < 50000 THEN 'mid range' ELSE 'high range' ENDIF ENDIF), COUNT() 
FROM Employees 
GROUP BY (IF SALARY < 25000 THEN 'low range' ELSE IF Salary < 50000 THEN 'mid range' ELSE 'high range' ENDIF ENDIF);

However, this query contains a subquery in the GROUP BY clause that returns an error:

SELECT (Select State from Employees e WHERE e.EmployeeID = e2.EmployeeID), 
      COUNT() 
   FROM Employees e2
   GROUP BY (Select State from Employees e WHERE EmployeeID = e2.EmployeeID)

A more concise approach is to alias the SELECT list expression, and refer to the alias in the GROUP BY clause. Using an alias permits the SELECT list and the GROUP BY clause to contain correlated subqueries. SELECT list aliases used in this fashion are a vendor extension:



SELECT (
   IF SALARY < 25000 
      THEN 'low range' 
   ELSE IF Salary < 50000 
      THEN 'mid range' 
   ELSE 'high range' 
      ENDIF
   ENDIF) AS Salary_Range, 
   COUNT() FROM Employees GROUP BY Salary_Range;

While SQL Anywhere does not support all facets of SQL/2008 language feature T301 (Functional dependencies), SQL Anywhere does offer some support for derived values based on GROUP BY terms. SQL Anywhere supports SELECT list expressions that refer to GROUP BY terms, literal constants, and (host) variables, with or without scalar functions that may modify those values. As an example, the following query lists the number of employees by city/state combination:

SELECT City || ' ' || State, SUBSTRING(City,1,3), COUNT() 
FROM Employees 
GROUP BY City, State
 See also