Results from parallel queries are returned through one of three merge strategies, or as the final step in a sort. Parallel queries that do not have a final sort step use one of these merge types:
Queries that contain a vector (grouped) aggregate use worktables to store temporary results; the coordinating process merges the results into one worktable and returns results to the client.
Queries that contain a scalar (ungrouped) aggregate use internal variables, and the coordinating process performs the final computations to return the results to the client.
Queries that do not contain aggregates and that do not use clauses that do not require a final sort can return results to the client as the tables are being scanned. Each worker process stores results in a result buffer and uses address locks to coordinate transferring the results to the network buffers for the task.
More than one merge type can be used when queries require several steps or multiple worktables.
See “showplan messages for parallel queries” on page 114 in the Performance and Tuning: Monitoring and Analyzing for Performance for more information on merge messages.
For parallel queries that include an order by clause, distinct, or union, results are stored in a worktable in tempdb, then sorted. If the sort can benefit from parallel sorting, a parallel sort is used, and results are returned to the client during the final merge step performed by the sort.
For more information on how parallel sorts are performed, see Chapter 9, “Parallel Sorting.”
Since parallel queries use multiple processes to scan data pages, queries that do not use aggregates and do not include a final sort step may return results in different order than serial queries and may return different results for queries with set rowcount in effect and for queries that select into a local variable.
For details and solutions, see “When parallel query results can differ”.