Discovery of exploitable conditions through predicate inference

An efficient access strategy for virtually any query relies on the presence of sargable conditions in the WHERE, ON, and HAVING clauses. Indexed retrieval is possible only by exploiting sargable conditions as matching predicates. In addition, hash, merge, and block-nested loops join can only be used when an equijoin condition is present. For these reasons, SQL Anywhere does detailed analysis of the search conditions in the original query text to discover simplified or implied conditions that can be exploited by the optimizer.

As a preprocessing step, several simplifications are made to predicates in the original statement once view expansion and merging have taken place. For example:

  • X = X is rewritten as X IS NOT NULL if X is nullable; otherwise, the predicate is eliminated.
  • ISNULL(X,X) is rewritten as simply X.
  • X+0 is rewritten as X if X is a numeric column.
  • AND 1=1 is eliminated.
  • OR 1=0 is eliminated.
  • IN-list predicates that consist of a single element are converted to simple equality conditions.

After this preprocessing step, SQL Anywhere attempts to normalize the original search condition into conjunctive normal form (CNF). For an expression to be in CNF, each term in the expression must be AND'ed together. Each term is either made up of a single atomic condition, or a set of conditions OR'ed together.

Converting an arbitrary condition into CNF may yield an expression of similar complexity, but with a much larger set of conditions. SQL Anywhere recognizes this situation, and refrains from naively converting the condition into CNF. Instead, SQL Anywhere analyzes the original expression for exploitable predicates that are implied by the original search condition, and ANDs these inferred conditions to the query. Complete normalization is also avoided if this requires duplication of an expensive predicate (for example, a quantified subquery predicate). However, the algorithm merges IN-list predicates together whenever feasible.

Once the search condition has either been completely normalized or the exploitable conditions have been found, the optimizer performs transitivity analysis to discover transitive equality conditions, primarily transitive join conditions and conditions with a constant. In doing so, the optimizer increases its degrees of freedom when performing join enumeration during its cost-based optimization phase, since these transitive conditions may permit additional alternative join orders.

Example

Suppose the original query is as follows:

SELECT e.Surname, s.ID, s.OrderDate
FROM SalesOrders s, Employees e
WHERE 
  ( e.EmployeeID = s.SalesRepresentative AND
    ( s.SalesRepresentative = 142 OR 
      s.SalesRepresentative = 1596 )
  ) OR (
     e.EmployeeID = s.SalesRepresentative AND 
     s.CustomerID = 667 );

This query has no conjunctive equijoin condition; hence, without detailed predicate analysis the optimizer would fail to discover an efficient access plan. Fortunately, SQL Anywhere is able to convert the entire expression to CNF, yielding the equivalent query:

SELECT e.Surname, s.ID, s.OrderDate
FROM SalesOrders s, Employees e
WHERE 
   e.EmployeeID = s.SalesRepresentative AND
   ( s.SalesRepresentative = 142 OR 
     s.SalesRepresentative = 1596 OR 
     s.CustomerID = 667 );

This query can now be efficiently optimized as an inner join query.