set statistics plancost example

set statistics plancost simplifies query analysis by displaying estimated values for logical I/O, physical I/O, and row counts compared to the actual ones evaluated at each operator, and reports on CPU and sort buffer cost.

set statistics plancost on

Use plancost to compare a query plan’s estimated and actual costs. plancost may also help you diagnose query performance problems. Table 1-2 describes plancost’s actual and estimated output.

**Table 1-2: Estimated and actual costs**
For each operator, plancost shows	For this operation
`el:` – estimated `l:` – actual	Logical I/O
`ep: – estimated` `p:` – actual	Physical I/O
`er:` – estimated `r:` – actual	Row counts
`cpu:` – actual	CPU counts

Execution operators that perform sort- or hash-based operations report the number of private buffers used for that operation (“bufct:”, not shown in the example below). plancost may display a subset of costs because not all these quantities apply to all operators. Use the subset cost to verify whether the optimizer’s estimates are valid for sub-optimal query plans.

For example, plancost issues the following output when you run a join query on the authors, titleauthor, and titles tables:

select A.au_fname, A.au_lname, T.title
from authors A, titleauthor TA, titles T
where A.au_id = TA.au_id and T.title_id = TA.title_id

==================== Lava Operator Tree ====================
                                               Emit
                                               (VA = 6)
                                               r:25 er:342
                                               cpu: 0
                                       /
                                       MergeJoin
                                       Inner Join
                                       (VA = 5)
                                       r:25 er:342
                               /                          \
                             Sort                         IndexScan
                             (VA = 3)                     titles_indx (T)
                             r:25 er:25                   (VA = 4)
                             l:6 el:6                     r:18 er:18
                             p:0 ep:0                     l:2 el:3
                             cpu: 0 bufct: 24             p:0 ep:3
                      /
                     MergeJoin
                     Inner Join
                     (VA = 2)
                     r:25 er:25
             /                                  \
            IndexScan                           IndexScan
            auidind (TA)                        authors_indx (A)
            (VA = 0)                            (VA = 1)
            r:25 er:25                          r:23 er:23
            l:1 el:2                            l:1 el:2
            p:0 ep:2                            p:0 ep:2

The query optimizer produces the estimated numbers; the actual numbers are the result of the query execution.

This output shows the bottom-left IndexScan operator (on the titleauthors table) has estimated (er:) and actual (r:) row counts of 25, which means the optimizer’s estimate was correct. However, the row count estimates for the top MergeJoin (VA = 5) are incorrect: the query processor’s estimate is 342 but the actual row count is 25.

You may be able to more accurately controle the query processor’s estimates by keeping the statistics up to date, or by increasing the number of histogram steps. Use set option show_missing_stats on to verify whether join columns have histograms. You may be able to improve the query processor’s estimates by creating histograms, if none exist.

If an estimated row count is 25 but the actual row count is 30, this is not necessarily an indication that the query processor’s estimates are incorrect. When comparing estimated and actual values, look for “order-of-magnitude differences” (such as 25 versus 342 in the example above).

The query processor displays the name of the index (not the table name) for IndexScan operator nodes. To determine which table is associated with an operator node:

The index name may uniquely identify the table.
The operator node output includes a correlation name if the correlation name was included in the query. For example, in the above output, the “(TA)” in the IndexScan operator, refers to “titleauthor TA” from the SQL query.

The query tree and the showplan output include “(VA=n)”, where n = 0, 1, 2, and so on, and uniquely identifies each operator node.