Distributed query processing (DQP) divides a query into multiple independent pieces of work, distributes that work to other nodes in the multiplex, and collects and organizes the intermediate results to generate the final result set for the query. The servers in a Sybase IQ multiplex connect to a central store, such as a shared disk array, for permanent shared data. A multiplex has a hybrid cluster architecture that uses shared storage for permanent IQ data and independent node storage for catalog metadata, private temporary data, and transaction logs.
A simple query runs fastest on a single machine. However, when large and complex queries exceed a single machine's CPU capacity, the Sybase IQ query optimizer breaks the query into parallel “fragments” that run concurrently on other servers in the multiplex.
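The divide, distribute, and collect steps can be pictured with a toy scatter-gather sketch. This is an illustration only, not how Sybase IQ implements DQP: worker threads stand in for multiplex servers, and the names `run_fragment` and `distributed_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fragment(rows):
    """Hypothetical fragment: compute a partial SUM and COUNT over one slice of rows."""
    return sum(rows), len(rows)


def distributed_sum(rows, workers):
    """Split rows into independent fragments, run them concurrently, then combine."""
    # Divide: one contiguous slice per worker (the last slice may be shorter).
    size = max(1, -(-len(rows) // workers))  # ceiling division
    fragments = [rows[i:i + size] for i in range(0, len(rows), size)]

    # Distribute: threads stand in for the other servers in the multiplex
    # (an assumption made purely for illustration).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_fragment, fragments))

    # Collect: merge the intermediate results into the final answer.
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count
```

The partial results returned by each fragment play the role of the intermediate results that a real multiplex would have to store and transmit between servers.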
Distribution incurs network and storage overhead to assign work and to store and transmit intermediate results. If a query does not fully use the CPU resources of a single machine, it performs better when run on that machine alone. For example, if the optimizer plans to parallelize a query 7 ways (thereby keeping 7 threads busy at a time) on an 8-core machine, Sybase IQ does not distribute the query.
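The rule of thumb in the example above can be written as a small predicate. This is a simplified sketch, not the optimizer's actual logic, which is not described here. The function name `should_distribute` and the strict "more threads than cores" threshold are assumptions chosen to match the 7-way, 8-core example.

```python
def should_distribute(parallel_degree, local_cores):
    """Return True when a query's planned parallelism exceeds the local core count.

    Assumption: distribution only pays off once one machine's CPUs are
    saturated; the real optimizer may weigh other costs as well.
    """
    return parallel_degree > local_cores


# The example from the text: 7-way parallelism on an 8-core machine stays local.
stays_local = not should_distribute(7, 8)
```

Under this assumed rule, a query planned for 16-way parallelism on the same 8-core machine would be a candidate for distribution.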
Before using DQP, you must:
Configure a multiplex. (See “Converting the IQ demo database to multiplex”.)
Set up temporary storage shared between multiplex servers. (See “Adding shared temporary storage” or “Adding shared temporary storage (manual method)”.)
“Running a query that gets distributed” describes how to run a query that is distributed to the two servers in the demo multiplex.