Choosing an index for a range query

Assume that you need to improve the performance of the following query:

select title
from titles
where price between $20.00 and $30.00

Some basic statistics on the table are:

The table has 1,000,000 rows and uses allpages locking.
There are 10 rows per page; pages are 75 percent full, so the table has approximately 135,000 pages.
190,000 (19%) of the titles are priced between $20 and $30.

With no index, the query would scan all 135,000 pages.

With a clustered index on price, the query would find the first $20 book and begin reading sequentially until it gets to the last $30 book. With pages about 75 percent full, the average number of rows per page is 7.5. To read 190,000 matching rows, the query would read approximately 25,300 pages, plus 3 or 4 index pages.
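The page counts above follow from simple arithmetic. The following is a minimal sketch in Python that uses only the figures given in the text. The function name pages_needed is illustrative and is not part of any database tool.

```python
import math

ROWS_PER_FULL_PAGE = 10    # rows that fit on a completely full page
FILL_FRACTION = 0.75       # pages are about 75 percent full
TOTAL_ROWS = 1_000_000
MATCHING_ROWS = 190_000    # titles priced between $20 and $30


def pages_needed(rows, rows_per_full_page=ROWS_PER_FULL_PAGE, fill=FILL_FRACTION):
    """Estimate the number of data pages that hold `rows` contiguous rows."""
    avg_rows_per_page = rows_per_full_page * fill  # 7.5 rows per page
    return math.ceil(rows / avg_rows_per_page)


# Whole table: the text works with a rounded figure of about 135,000 pages.
table_pages = pages_needed(TOTAL_ROWS)

# Contiguous range read through the clustered index: about 25,300 pages.
range_pages = pages_needed(MATCHING_ROWS)

print(table_pages, range_pages)
```

The estimate for the clustered index assumes that the qualifying rows are stored contiguously in price order, which is what a clustered index on price provides.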

With a nonclustered index on price and random distribution of price values, using the index to find the rows for this query requires reading about 19 percent of the leaf level of the index, or about 1,500 pages.

If the price values are randomly distributed, each qualifying row is likely to be on a different data page, so the number of data pages that must be read is high: perhaps as many as one page read per qualifying row, or 190,000. Since a table scan requires reading only 135,000 pages, you would not want to use a nonclustered index on price alone.

Another choice is a nonclustered index on price, title. Because this index contains both columns that the query references, it covers the query, and no data pages need to be read. The query can perform a matching index scan, using the index to find the first page with a price of $20, and then scanning forward on the leaf level until it finds a price of more than $30. This index requires about 35,700 leaf pages, so scanning the matching leaf pages means reading about 19 percent of them, or about 6,800 pages.

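For this query, the covering nonclustered index on price, title is best: it requires about 6,800 page reads, compared with about 25,300 for the clustered index on price and 135,000 for a table scan.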
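The comparison can be summarized by ranking the estimated page reads for each access method. This is a rough sketch in Python that uses the figures from the text. The dictionary labels are illustrative, and the estimate for the covering index ignores the few upper-level index pages read to locate the first leaf page.

```python
LEAF_PAGES_PRICE_TITLE = 35_700  # leaf pages in the index on price, title
SELECTIVITY = 0.19               # fraction of rows priced between $20 and $30

# Estimated page reads per access method, using the figures from the text.
estimates = {
    "table scan": 135_000,
    # about 25,300 data pages plus 3 or 4 index pages
    "clustered index on price": 25_300 + 4,
    # about 1,500 leaf pages plus up to one data page per qualifying row
    "nonclustered index on price": 1_500 + 190_000,
    # leaf pages only, because the index covers the query
    "covering index on price, title": round(LEAF_PAGES_PRICE_TITLE * SELECTIVITY),
}

best = min(estimates, key=estimates.get)

# Print the methods from cheapest to most expensive.
for method, reads in sorted(estimates.items(), key=lambda item: item[1]):
    print(f"{method:32} {reads:>8,}")
```

These are estimates of page reads only. They assume randomly distributed price values for the nonclustered index on price, as the text does.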