Although the query optimizer can efficiently optimize most queries, these issues may effect the optimizer’s efficiency:
If statistics have not been updated recently, the actual data distribution may not match the values used to optimize queries.
The rows referenced by a specified transaction may not fit the pattern reflected by the index statistics.
An index may access a large portion of the table.
where clauses (SARGS) are written in a form that cannot be optimized.
No appropriate index exists for a critical query.
A stored procedure was compiled before significant changes to the underlying tables were performed.
No statistics exists for the SARG or joining columns.
These situations highlight the need to follow some best practices that allow the query optimizer to perform at its full potential:
Follow these guidelines when you write search arguments for your queries:
Avoid functions, arithmetic operations, and other expressions on the column side of search clauses. When possible, move functions and other operations to the expression side of the clause.
Use as many search arguments as you can, to give the query processor as much as possible to work with.
If a query has more than 400 predicates for a table, put the most potentially useful clauses near the beginning of the query, since only the first 102 SARGs on each table are used during optimization. (All of the search conditions are used to qualify the rows.)
Queries using > (greater than) may perform better if you can rewrite them to use >= (greater than or equal to). For example, this query, with an index on int_col, uses the index to find the first value where int_col equals 3, and then scans forward to find the first value that is greater than 3. If there are many rows where int_col equals 3, the server must scan many pages to find the first row where int_col is greater than 3:
select * from table1 where int_col > 3
It is more efficient to write the query this way:
select * from table1 where int_col >= 4
This optimization is more difficult with character strings and floating-point data.
Check the showplan output to see which keys and indexes are used.
If an index is not being used when you expect it to be, use output from the set commands in Table 3-1 to see whether the query processor is considering the index.
Queries expressed as a single SQL statement exploit the query processor better than queries expressed in two or more SQL statements. SQL-derived tables enable you to express, in a single step, what might otherwise require several SQL statements and temporary tables, especially where intermediate aggregate results must be stored. For example:
select dt_1.* from (select sum(total_sales) from titles_west group by total_sales) dt_1(sales_sum), (select sum(total_sales) from titles_east group by total_sales) dt_2(sales_sum) where dt_1.sales_sum = dt_2.sales_sum
Here, aggregate results are obtained from the SQL derived tables dt_1 and dt_2, and a join is computed between the two SQL derived tables. Everything is accomplished in a single SQL statement.
For more information, see Chapter 9, “SQL Derived Tables,” in the Transact-SQL User's Guide.
To understand query and system behavior, know the sizes of your tables and indexes. At several stages of tuning work, you need size data to:
Understand statistics i/o reports for a specific query plan.
Understand the query processor's choice of query plan. The Adaptive Server cost-based query processor estimates the physical and logical I/O required for each possible access method and selects the cheapest method.
Determine object placement, based on the sizes of database objects and on the expected I/O patterns on the objects.
To improve performance, distribute database objects across physical devices, so that reads and writes to disk are evenly distributed.
Object placement is described in the Performance and Tuning Series: Physical Database Tuning.
Understand changes in performance. If objects grow, their performance characteristics can change. For example, consider a table that is heavily used and is usually 100 percent cached. If the table grows too large for its cache, queries that access the table can suffer poor performance. This is particularly true of joins that require multiple scans.
Do capacity planning. Whether you are designing a new system or planning for the growth of an existing system, you must know the space requirements to plan for physical disks and memory needs.
Understand output from sp_sysmon reports on physical I/O.
See the System Administration Guide: Volume 2 for more information on sizing.