Parallelism of attribute-sensitive operation

Attribute-sensitive operations include joins, vector aggregations, and unions.


join

If two tables are joined in parallel, Adaptive Server tries to use semantics-based partitioning to make the join more efficient, depending on the amount of data being joined and the type of partitioning that each of the operands have. If the amount of data to be joined is small, but the number of pages to scan for each of the tables is quite significant, Adaptive Server serializes the parallel streams from each side and the join is done in serial mode. In this case, the query optimizer determines that it is suboptimal to run a join operation in parallel. In general, one or both of the operands used for the join operators may be any intermediate operator, like another join or a grouping operator, but the examples used show only scans as operands.

Tables with same useful partitioning

The partitioning of each operand of a join is useful only with respect to the join predicate. If two tables have the same partitioning, and the partitioning columns are a subset of the join predicate, the tables are said to be equipartitioned. For example, if you create another table, RB2, which is partitioned similarly to that of RA2, using the following command:

create table RB2(b1 int, b2 int, b3 int) 
partition by range(b1,b2)
(p1 values <= (500,100), p2 values <= (1000, 2000)) 

Then join RB2 with RA2; the scans and the join can be done in parallel without additional repartitioning. Adaptive Server can join the first partition of RA2 with the first partition of RB2, then join the second partition of RA2 with the second partition of RB2. This is called an equipartitioned join and is possible only if the two tables join on columns a1, b1 and a2, b2 as shown below:

select * from RA2, RB2 
where a1 = b1 and a2 = b2 and a3 < 0 
QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 2 worker processes.

7 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

	 |EXCHANGE Operator (Merged)
	 |Executed in parallel by 2 Producer 
         and 1 Consumer processes.

	 |
	 |   |EXCHANGE:EMIT Operator
	 |   |
	 |   |   |NESTED LOOP JOIN Operator 
                 (Join Type: Inner Join)
	 |   |   |
	 |   |   |   |RESTRICT Operator
	 |   |   |   |
	 |   |   |   |   |SCAN Operator
	 |   |   |   |   |  FROM TABLE
	 |   |   |   |   |  RB2
	 |   |   |   |   |  Table Scan.
	 |   |   |   |   |  Forward Scan.
	 |   |   |   |   |  Positioning at start of table.
	 |   |   |   |   |  Executed in parallel with a 
                          2-way partition scan.
	 |   |   |
	 |   |   |   |RESTRICT Operator
	 |   |   |   |
	 |   |   |   |   |SCAN Operator
	 |   |   |   |   |  FROM TABLE
	 |   |   |   |   |  RA2
	 |   |   |   |   |  Table Scan.
	 |   |   |   |   |  Forward Scan.
	 |   |   |   |   |  Positioning at start of table.
	 |   |   |   |   |  Executed in parallel with a 
                          2-way partition scan.

The exchange operator is shown above the nested-loop join. This implies that exchange spawns two producer threads: the first scans the first partition of RA2 and RB2 and performs the nested-loop join; the second scans the second partition of RA2 and RB2 to do the nested-loop join. The two threads merge the results using a many-to-one (in this case, two-to-one) exchange operator.

One of the tables with useful partitioning

In this example, the table RB2 is repartitioned to a three-way hash partitioning on column b1 using the alter table command.

alter table RB2 partition by hash(b1) (p1, p2, p3) 

Now, take a slightly modified join query as shown below:

select * from RA2, RB2 where a1 = b1 

The partitioning on table RA2 is not useful because the partitioned columns are not a subset of the joining columns (that is, given a value for the joining column a1, you cannot specify the partition to which it belongs). However, the partitioning on RB2 is helpful because it matches the joining column b1 of RB2. In this case, the query optimizer repartitions table RA2 to match the partitioning of RB2 by using hash partitioning on column a1 of RA2 (the joining column, which is followed by a three-way merge join). The many-to-many (two-to-three) exchange operator above the scan of RA2 does this dynamic repartitioning. The exchange operator above the merge join operator merges the result using a many-to-one (three-to-one, in this case) exchange operator. The showplan output for this query is shown in the following example:

select * from RA2, RB2 where a1 = b1
QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 5 worker processes.

10 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator
   |EXCHANGE Operator (Merged)
   |Executed in parallel by 3 Producer and 1 Consumer        processes.
   
   |
   |   |EXCHANGE:EMIT Operator
   |   |
   |   |   |MERGE JOIN Operator (Join Type: Inner 
              Join)
   |   |   | Using Worktable3 for internal storage.
   |   |   |  Key Count: 1
   |   |   |  Key Ordering: ASC
   |   |   |
   |   |   |   |SORT Operator
   |   |   |   |Using Worktable1 for internal storage.
   |   |   |   |
   |   |   |   |   |EXCHANGE Operator (Repartitioned)
   |   |   |   |   |Executed in parallel by 2 Producer
                    and 3 Consumer processes.

   |   |   |   |   |
   |   |   |   |   |   |EXCHANGE:EMIT Operator
   |   |   |   |   |   |
   |   |   |   |   |   |   |RESTRICT Operator
   |   |   |   |   |   |   |
   |   |   |   |   |   |   |   |SCAN Operator
   |   |   |   |   |   |   |   |  FROM TABLE
   |   |   |   |   |   |   |   |  RA2
   |   |   |   |   |   |   |   |  Table Scan.
   |   |   |   |   |   |   |   |  Forward Scan.
   |   |   |   |   |   |   |   |  Positioning at start
                                  of table.
   |   |   |   |   |   |   |   |  Executed in parallel
                                  with a 2-way
                                  partition scan.
   |   |   |
   |   |   |   |SORT Operator
   |   |   |   |Using Worktable2 for internal storage.
   |   |   |   |
   |   |   |   |   |SCAN Operator
   |   |   |   |   |  FROM TABLE
   |   |   |   |   |  RB2
   |   |   |   |   |  Table Scan.
   |   |   |   |   |  Forward Scan.
   |   |   |   |   |  Positioning at start of table.
   |   |   |   |   |  Executed in parallel with a
                         3-way partition scan.

Both tables with useless partitioning

The next example uses a join where the native partitioning of the tables on both sides is useless. The partitioning on table RA2 is on columns (a1,a2) and that of RB2 is on (b1). The join predicate is on different sets of columns, and the partitioning for both tables does not help at all. One option is to dynamically repartition both sides of the join. By repartitioning table RA2 using a M-to-N (two-to-three) exchange operator, Adaptive Server chooses column a3 of table RA2 for repartitioning, as it is involved in the join with table RB2. For identical reasons, table RB2 is also repartitioned three ways on column b3. The repartitioned operands of the join are equipartitioned with respect to the join predicate, which means that the corresponding partitions from each side will join. In general, when repartitioning needs to be done on both sides of the join operator, Adaptive Server employs a hash-based partitioning scheme.

select * from RA2, RB2 where a3 = b3
QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 8 worker processes.

12 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

    |EXCHANGE Operator (Merged)
    |Executed in parallel by 3 Producer and 1 Consumer
         processes.
    
    |
    |   |EXCHANGE:EMIT Operator
    |   |
    |   |   |MERGE JOIN Operator 
                (Join Type: Inner Join)
    |   |   | Using Worktable3 for internal storage.
    |   |   | Key Count: 1
    |   |   | Key Ordering: ASC
    |   |   |
    |   |   |   |SORT Operator
    |   |   |   |Using Worktable1 for internal 
                    storage.
    |   |   |   |
    |   |   |   |   |EXCHANGE Operator (Repartitioned)
    |   |   |   |   |Executed in parallel by 2 
                       Producer and 3 Consumer
                       processes.

    |   |   |   |   |
    |   |   |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |   |   |
    |   |   |   |   |   |   |RESTRICT Operator
    |   |   |   |   |   |   |
    |   |   |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |   |   |  RA2
    |   |   |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |   |   |  Forward Scan.
    |   |   |   |   |   |   |   |  Positioning at 
                                     start of table.
    |   |   |   |   |   |   |   |  Executed in 
                                       parallel with
                                       a 2-way
                                       partition scan.
    |   |   |
    |   |   |   |SORT Operator
    |   |   |   |Using Worktable2 for internal 
                     storage.
    |   |   |   |
    |   |   |   |   |EXCHANGE Operator (Repartitioned)
    |   |   |   |   |Executed in parallel by 3 
                       Producer and 3 Consumer
                       processes.

    |   |   |   |   |
    |   |   |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |   |   |
    |   |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |   |  RB2
    |   |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |   |  Forward Scan.
    |   |   |   |   |   |   |  Positioning at start 
                                 of table.
    |   |   |   |   |   |   |  Executed in parallel 
                                  with a 3-way
                                   partition scan.

In general, all joins, including nested-loop, merge, and hash joins, behave in a similar way. nested-loop joins display one exception, which is that the inner side of a nested-loop join cannot be repartitioned. This limitation occurs because, in the case of a nested-loop join, a column value for the joining predicate is pushed from the outer side to the inner side.

Replicated join

A replicated join is useful when an index nested-loop join needs to be used. Consider the case where a large table has a useful index on the joining column, but useless partitioning, and joins to a small table that is either partitioned or not partitioned. The small table can be replicated N ways to that of the inner table, where N is the number of partitions of the large table. Each partition of the large table is joined with the small table and, because no exchange operator is needed on the inner side of the join, an index nested-loop join is allowed.

create table big_table(b1 int, b2 int, b3 int)
partition by hash(b3) (p1, p2)

create index big_table_nc1 on big_table(b1)

create table small_table(s1 int, a2 int, s3 int)

select * from small_table, big_table
where small_table.s1 = big_table.b1
QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 3 worker processes.

7 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

    |EXCHANGE Operator (Merged)
    |Executed in parallel by 2 Producer and 1 
         Consumer processes.

    |
    |   |EXCHANGE:EMIT Operator
    |   |
    |   |   |NESTED LOOP JOIN Operator (Join Type:
                Inner Join)
    |   |   |
    |   |   |   |EXCHANGE Operator (Replicated)
    |   |   |   |Executed in parallel by 1 Producer 
                    and 2 Consumer processes.

    |   |   |   |
    |   |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |   |
    |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |  small_table
    |   |   |   |   |   |  Table Scan.
    |   |   |
    |   |   |   |SCAN Operator
    |   |   |   |  FROM TABLE
    |   |   |   |  big_table
    |   |   |   |  Index : big_table_nc1
    |   |   |   |  Forward Scan.
    |   |   |   |  Positioning by key.
    |   |   |   |  Keys are:
    |   |   |   |    b1 ASC
    |   |   |   |  Executed in parallel with a 
                      2-way hash scan.

Parallel reformatting

Parallel reformatting is especially useful when you are working with a nested-loop join. Usually, reformatting refers to materializing the inner side of a nested join into a worktable, then creating an index on the joining predicate. With parallel queries and nested-loop join, reformatting is also helpful when there is no useful index on the joining column or nested-loop join is the only viable option for a query because of the server/session/query level settings. This is an important option for Adaptive Server. The outer side may have useful partitioning and, if not, it can be repartitioned to create that useful partitioning. But for the inner side of a nested-loop join, any repartitioning means that the table must be reformatted into a worktable that uses the new partitioning strategy. The inner scan of a nested-loop join must then access the worktable.

In this next example, partitioning for tables RA2 and RB2 is on columns (a1, a2) and (b1, b2) respectively. The query is run with merge and hash join turned off for the session.

select * from RA2, RB2 where a1 = b1 and a2 = b3

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 12 worker processes.

17 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

    |SEQUENCER Operator has 2 children.
    |
    |   |EXCHANGE Operator (Merged)
    |   |Executed in parallel by 4 Producer 
             and 1 Consumer processes.

    |   |
    |   |   |EXCHANGE:EMIT Operator
    |   |   |
    |   |   |   |STORE Operator
    |   |   |   |  Worktable1 created, in allpages 
                      locking mode, for REFORMATTING.
    |   |   |   |  Creating clustered index.
    |   |   |   |
    |   |   |   |   |INSERT Operator
    |   |   |   |   |  The update mode is direct.
    |   |   |   |   |
    |   |   |   |   |   |EXCHANGE Operator
                            (Repartitioned)
    |   |   |   |   |   |Executed in parallel by 
                             2 Producer and 4 
                             Consumer processes.
    |   |   |   |   |   |
    |   |   |   |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |   |   |   |
    |   |   |   |   |   |   |   |RESTRICT Operator
    |   |   |   |   |   |   |   |
    |   |   |   |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |   |   |   |  RB2
    |   |   |   |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |   |   |   |  Executed in
                                           parallel
                                           with a 
                                           2-way
                                           partition
                                           scan.
    |   |   |   |   |
    |   |   |   |   |  TO TABLE
    |   |   |   |   |  Worktable1.
    |
    |   |EXCHANGE Operator (Merged)
    |   |Executed in parallel by 4 Producer 
             and 1 Consumer processes.

    |   |
    |   |   |EXCHANGE:EMIT Operator
    |   |   |
    |   |   |   |NESTED LOOP JOIN Operator 
                    (Join Type: Inner Join)
    |   |   |   |
    |   |   |   |   |EXCHANGE Operator (Repartitioned)
    |   |   |   |   |Executed in parallel by 2 
                        Producer and 4 Consumer
                        processes.
    |   |   |   |   |
    |   |   |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |   |   |
    |   |   |   |   |   |   |RESTRICT Operator
    |   |   |   |   |   |   |
    |   |   |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |   |   |  RA2
    |   |   |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |   |   |  Executed in 
                                       parallel with 
                                       a 2-way
                                       partition scan.
    |   |   |   |   |SCAN Operator
    |   |   |   |   |  FROM TABLE
    |   |   |   |   |  Worktable1.
    |   |   |   |   |  Using Clustered Index.
    |   |   |   |   |  Forward Scan.
    |   |   |   |   |  Positioning by key.
	

The sequence operator executes all of its child operators but the last, before executing the last child operator. In this case, the sequence operator executes the first child operator, which reformats table RB2 into a worktable using a four-way hash partitioning on columns b1 and b3. The table RA2 is also repartitioned four ways to match the stored partitioning of the worktable.

Serial join

Sometimes, it may not make sense to run a join in parallel because of the amount of data that needs to be joined. If you run a query similar to that of the earlier join queries, but now have predicates on each of the tables (RA2 and RB2) such that the amount of data to be joined is not enough, the join may be done in serial mode. In such a case, it does not matter how these tables are partitioned. The query still benefits from scanning the tables in parallel.

select * from RA2, RB2 where a1=b1 and a2 = b2 
and a3 =  0 and  b2 = 20

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 4 worker processes.

11 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

    |MERGE JOIN Operator (Join Type: Inner Join)
    | Using Worktable3 for internal storage.
    |  Key Count: 1
    |  Key Ordering: ASC
    |
    |   |SORT Operator
    |   | Using Worktable1 for internal storage.
    |   |
    |   |   |EXCHANGE Operator (Merged)
    |   |   |Executed in parallel by 2 Producer and
                 1 Consumer processes.

    |   |   |
    |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |
    |   |   |   |   |RESTRICT Operator
    |   |   |   |   |
    |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |  RA2
    |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |  Executed in parallel with
                             a 2-way partition scan.
    |   |SORT Operator
    |   |Using Worktable2 for internal storage.
    |   |
    |   |   |EXCHANGE Operator (Merged)
    |   |   |Executed in parallel by 2 Producer and
                 1 Consumer processes.

    |   |   |
    |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |
    |   |   |   |   |RESTRICT Operator
    |   |   |   |   |
    |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |  RB2
    |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |  Executed in parallel with 
                               a 2-way partition scan.

Semijoins

Semijoins, which result from flattening of in/exist subqueries, behave the same way as regular inner joins. However, replicated joins are not used for semijoins, because an outer row can match more than one time in such a situation.

Outer joins

In terms of parallel processing for outer joins, replicated joins are not considered. Everything else behaves in a similar way as regular inner joins. One other point of difference is that no partition elimination is done for any table in an outer join that belongs to the outer group.


Vector aggregation

Vector aggregation refers to queries with group-bys. There are different ways Adaptive Server can perform vector aggregation. The actual algorithms are not described here; only the technique for parallel evaluation is shown in the following sections.

In-partitioned vector aggregation

If any base or intermediate relation requires a grouping and is partitioned on a subset, or the same columns as that of the columns in the group by clause, the grouping operation can be done in parallel on each of the partitions and the resultant grouped streams merged using a simple N-to-1 exchange. This is because a given group cannot appear in more than one stream. The same restriction applies to grouping over any SQL query as long as you use semantics-based partitioning on the grouping columns or a subset of them. This method of parallel vector aggregation is called in-partitioned aggregation.

The following query uses a parallel in-partitioned vector aggregation since range partitioning is defined on the columns a1 and a2, which also happens to be the column on which the aggregation is needed.

select count(*), a1, a2 from RA2 group by a1,a2 
QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 2 worker processes.

4 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

    |EXCHANGE Operator (Merged)
    |Executed in parallel by 2 Producer and 
        1 Consumer processes.

    |
    |   |EXCHANGE:EMIT Operator
    |   |
    |   |   |HASH VECTOR AGGREGATE Operator
    |   |   |  GROUP BY
    |   |   |  Evaluate Grouped COUNT AGGREGATE.
    |   |   | Using Worktable1 for internal storage.
    |   |   |
    |   |   |   |SCAN Operator
    |   |   |   |  FROM TABLE
    |   |   |   |  RA2
    |   |   |   |  Table Scan.
    |   |   |   |  Forward Scan.
    |   |   |   |  Positioning at start of table.
    |   |   |   |  Executed in parallel with a 2-way
                       partition scan.
    |   |   |   |  Using I/O Size 2 Kbytes for data 
                       pages.
    |   |   |   |  With LRU Buffer Replacement 
                        Strategy for data pages.

Repartitioned vector aggregation

Sometimes, the partitioning of the table or the intermediate results may not be useful for the grouping operation. It may still be worthwhile to do the grouping operation in parallel by repartitioning the source data to match the grouping columns, then applying the parallel vector aggregation. Such a scenario is shown below, where the partitioning is on columns (a1, a2), but the query requires a vector aggregation on column a1.

select count(*), a1 from RA2 group by a1

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 4 worker processes.

6 operator(s) under root


The type of query is SELECT.

ROOT:EMIT Operator

    |EXCHANGE Operator (Merged)
    |Executed in parallel by 2 Producer and 1 Consumer processes.

    |
    |   |EXCHANGE:EMIT Operator
    |   |
    |   |   |HASH VECTOR AGGREGATE Operator
    |   |   |  GROUP BY
    |   |   |  Evaluate Grouped COUNT AGGREGATE.
    |   |   | Using Worktable1 for internal storage.
    |   |   |
    |   |   |   |EXCHANGE Operator (Repartitioned)
    |   |   |   |Executed in parallel by 2 Producer 
                     and 2 Consumer processes.

    |   |   |   |
    |   |   |   |   |EXCHANGE:EMIT Operator
    |   |   |   |   |
    |   |   |   |   |   |SCAN Operator
    |   |   |   |   |   |  FROM TABLE
    |   |   |   |   |   |  RA2
    |   |   |   |   |   |  Table Scan.
    |   |   |   |   |   |  Forward Scan.
    |   |   |   |   |   |  Positioning at start of 
                              table.
    |   |   |   |   |   |  Executed in parallel with
                               a 2-way partition scan.

Two-phased vector aggregation

For the query in the previous example, repartitioning may be expensive. Another possibility is to do a first level of grouping, merge the data using a N-to-1 exchange operator, then do another level of grouping. This is called a two-phased vector aggregation. Depending on the number of duplicates for the grouping column, Adaptive Server can reduce the cardinality of the data streaming through the N-to-1 exchange, which reduces the cost of the second level of grouping.

select count(*), a1 from RA2 group by a1

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 2 worker processes.

5 operator(s) under root


The type of query is SELECT.

ROOT:EMIT Operator

    |HASH VECTOR AGGREGATE Operator
    |  GROUP BY
    |  Evaluate Grouped SUM OR AVERAGE AGGREGATE.
    | Using Worktable2 for internal storage.
    |
    |   |EXCHANGE Operator (Merged)
    |   |Executed in parallel by 2 Producer and
            1 Consumer processes.
    |   |
    |   |   |EXCHANGE:EMIT Operator
    |   |   |
    |   |   |   |HASH VECTOR AGGREGATE Operator
    |   |   |   |  GROUP BY
    |   |   |   |  Evaluate Grouped COUNT AGGREGATE.
    |   |   |   | Using Worktable1 for internal 
                      storage.
    |   |   |   |
    |   |   |   |   |SCAN Operator
    |   |   |   |   |  FROM TABLE
    |   |   |   |   |  RA2
    |   |   |   |   |  Table Scan.
    |   |   |   |   |  Executed in parallel with 
                          a 2-way partition scan.

Serial vector aggregation

As with some of the earlier examples, if the amount of data flowing into the grouping operator is restricted by using a predicate, executing that query in parallel may not make much sense. In such a case, the partitions are scanned in parallel and an N-to-1 exchange operator is used to serialize the stream followed by a serial vector aggregation:

select count(*), a1, a2 from RA2
where a1 between 100 and 200
group by a1, a2

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and
     2 worker processes.

4 operator(s) under root


The type of query is SELECT.

ROOT:EMIT Operator

   |HASH VECTOR AGGREGATE Operator
   |  GROUP BY
   |  Evaluate Grouped COUNT AGGREGATE.
   | Using Worktable1 for internal storage.
   |
   |   |EXCHANGE Operator (Merged)
   |   |Executed in parallel by 2 Producer and 1
            Consumer processes.
   |   |
   |   |   |EXCHANGE:EMIT Operator
   |   |   |
   |   |   |   |SCAN Operator
   |   |   |   |  FROM TABLE
   |   |   |   |  RA2
   |   |   |   |  Positioning at start of table.
   |   |   |   |  Executed in parallel with a 2-way
                       partition scan.

You cannot always group on the partitioning columns, or take advantage of a table that is already partitioned on the grouping columns. The query optimizer determines if it is better to repartition and perform the grouping in parallel, or merge the data stream in a partitioned table and do the grouping in serial or a two-phased aggregation.

distinct

Queries with distinct operations are the same as grouped vector aggregation without the aggregation part. For example:

select distinct a1, a2 from RA2

is same as:

select a1, a2 from RA2 group by a1, a2

All of the methodologies that are applicable to vector aggregates are applicable here as well.

Queries with an in list

Adaptive Server uses an optimized technique to handle an in list. This is a common SQL construct. So, a construct like:

col in (value1, value2,..valuek) 

is same as:

col = value1 OR col = value2 OR .... col = valuek

The values in the in list are put into a special in-memory table and sorted for removal of duplicates.The table is then joined back with the base table using an index nested-loop join. The following example illustrates this with two values in the in list that correspond to two values in the or list:

SCAN Operator
FROM OR List
OR List has up to 2 rows of OR/IN values.

select * from RA2 where a3 in (1425, 2940)

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 2
      worker processes.

6 operator(s) under root


The type of query is SELECT.

ROOT:EMIT Operator

  |EXCHANGE Operator (Merged)
  |Executed in parallel by 2 Producer and 1 
       Consumer processes.

  |
  |   |EXCHANGE:EMIT Operator
  |   |
  |   |   |NESTED LOOP JOIN Operator (Join Type: 
               Inner Join)
  |   |   |
  |   |   |   |SCAN Operator
  |   |   |   |  FROM OR List
  |   |   |   |  OR List has up to 2 rows of OR/IN 
                     values.
  |   |   |
  |   |   |   |RESTRICT Operator
  |   |   |   |
  |   |   |   |   |SCAN Operator
  |   |   |   |   |  FROM TABLE
  |   |   |   |   |  RA2
  |   |   |   |   |  Index : RA2_NC1
  |   |   |   |   |  Forward Scan.
  |   |   |   |   |  Positioning by key.
  |   |   |   |   |  Keys are:
  |   |   |   |   |    a3 ASC
  |   |   |   |   |  Executed in parallel with a 
                        2-way hash scan.

Queries with or clauses

Adaptive Server takes a disjunctive predicate like an or clause and applies each side of the disjunction separately to qualify a set of row IDs (RIDs). The set of conjunctive predicates on each side of the disjunction must be indexable. Also, the conjunctive predicates on each side of the disjunction cannot have further disjunction within them; that is, it makes little sense to use an arbitrarily deep nesting of disjunctive and conjunctive clauses. In the next example, a disjunctive predicate is taken on the same column (you can have predicates on different columns as long as you have indexes that can do inexpensive scans), but the predicates may qualify an overlapping set of data rows. Adaptive Server uses the predicates on each side of the disjunction separately and qualifies a set of row IDs. These row IDs are then subjected to duplicate elimination.

select a3 from RA2 where a3 = 2955 or a3 > 2990

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 2 worker processes.

8 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

  |EXCHANGE Operator (Merged)
  |Executed in parallel by 2 Producer and 1 
       Consumer processes.

  |
  |   |EXCHANGE:EMIT Operator
  |   |
  |   |   |RID JOIN Operator
  |   |   | Using Worktable2 for internal storage.
  |   |   |
  |   |   |   |HASH UNION Operator has 2 children.
  |   |   |   | Using Worktable1 for internal storage.
  |   |   |   |
  |   |   |   |   |SCAN Operator
  |   |   |   |   |  FROM TABLE
  |   |   |   |   |  RA2
  |   |   |   |   |  Index : RA2_NC1
  |   |   |   |   |  Forward Scan.
  |   |   |   |   |  Positioning by key.
  |   |   |   |   |  Index contains all needed
                         columns.Base table will not
                         be read.
  |   |   |   |   |  Keys are:
  |   |   |   |   |    a3 ASC
  |   |   |   |   |  Executed in parallel with a 
                       2-way hash scan.
  |   |   |   |
  |   |   |   |   |SCAN Operator
  |   |   |   |   |  FROM TABLE
  |   |   |   |   |  RA2
  |   |   |   |   |  Index : RA2_NC1
  |   |   |   |   |  Forward Scan.
  |   |   |   |   |  Positioning by key.
  |   |   |   |   |  Index contains all needed 
                         columns. Base table will 
                         not be read.
  |   |   |   |   |  Keys are:
  |   |   |   |   |    a3 ASC
  |   |   |   |   |  Executed in parallel with a 
                       2-way hash scan.
  |   |   |   |RESTRICT Operator
  |   |   |   |
  |   |   |   |   |SCAN Operator
  |   |   |   |   |  FROM TABLE
  |   |   |   |   |  RA2
  |   |   |   |   |  Using Dynamic Index.
  |   |   |   |   |  Forward Scan.
  |   |   |   |   |  Positioning by Row IDentifier 
                         (RID.)
  |   |   |   |   |  Using I/O Size 2 Kbytes for 
                         data pages.
  |   |   |   |   |  With LRU Buffer Replacement 
                          Strategy for data pages.

Two separate index scans are employed using the index RA2_NC1, which is defined on the column a3. The qualified set of row IDs are then checked for duplicate row IDs, and finally, joined back to the base table. Note the line Positioning by Row Identifier (RID). You can use different indexes for each side of the disjunction, depending on what the predicates are, as long as they are indexable. One way to easily identify this is to run the query separately with each side of the disjunction to make sure that the predicates are indexable. Adaptive Server may not choose an index intersection if it seems more expensive than a single scan of the table.

Queries with an order by clause

If a query requires sorted output because of the presence of an order by clause, Adaptive Server can apply the sort in parallel. First, Adaptive Server tries to avoid the sort if there is some inherent ordering available. If Adaptive Server is forced to do the sort, it sees if the sort can be done in parallel. To do that, Adaptive Server may repartition an existing data stream or it may use the existing partitioning scheme, then apply the sort to each of the constituent streams. The resultant data is merged using an N-to-1 order, preserving the exchange operator.

select * from RA2 order by a1, a2

QUERY PLAN FOR STATEMENT 1 (at line 1).
Executed in parallel by coordinating process and 2 worker processes.

4 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

   |EXCHANGE Operator (Merged)
   |Executed in parallel by 2 Producer and 
         1 Consumer processes.

   |
   |   |EXCHANGE:EMIT Operator
   |   |
   |   |   |SORT Operator
   |   |   | Using Worktable1 for internal storage.
   |   |   |
   |   |   |   |SCAN Operator
   |   |   |   |  FROM TABLE
   |   |   |   |  RA2
   |   |   |   |  Index : RA2_NC2L
   |   |   |   |  Forward Scan.
   |   |   |   |  Positioning at index start.
   |   |   |   |  Executed in parallel with a 
                       2-way partition scan.

Depending upon the volume of data to be sorted, and the available resources, Adaptive Server may repartition the data stream to a higher degree than the current degree of the stream, so that the sort operation is faster. The degrees of sorting depends on whether the benefit obtained from doing the sort in parallel far outweighs the overheads of repartitioning.