When you load data into a table, Sybase IQ physically stores data by column rather than by row, for each column in the table. The column orientation gives IQ indexes important advantages over traditional row-based indexing. Logically, the data can still be accessed row-by-row, just as in more traditional row-based SQL databases.Column storage structures your data according to the attributes you are interested in tracking. In a data warehousing environment, you typically look at specific attributes of thousands or millions of rows of data, rather than complete, single rows of data that are traditionally the focus in transaction processing. Column storage optimizes your ability to perform selections or calculations on the attributes you care about.
The default column storage structure that Sybase IQ creates for each column is actually an index optimized for storing and projecting data. Depending on the size of your database, the disk space available to you, and the type of queries your users submit, you almost certainly want to supplement this default index with one or more of the Sybase IQ bitwise index types. You can choose from several column index types. The column indexes you define are created as part of each individual table. Create join indexes with care; they add significant load, update, and delete costs.
Besides column indexes, Sybase IQ lets you define join indexes. Join indexes are optimized for joining related tables. You may want to create a join index for any set of tables that your users commonly join to resolve queries. Column indexes underlie any join indexes involving those columns.
The first half of this chapter discusses column indexes. The second half of this chapter discusses join indexes. See “Using join indexes” for details.
A default index that optimizes projections is created by Sybase IQ for all columns.
Columns with fewer than 16 million unique values can be stored in an optimized default index that significantly reduces storage requirements. This format supports improved performance by the IQ optimizer and for the aggregate functions SUM, SUM DISTINCT, MAX, MIN, and COUNT DISTINCT. It is available for:
Any column where IQ UNIQUE() is specified
All columns created when the MINIMIZE_STORAGE database option is ON
To achieve maximum query performance, however, you should choose one or more additional index types illustrated in Table 6-1 for most columns that best represent the cardinality and usage of column data:
Index type |
Description |
---|---|
Compare or CMP |
Stores the binary comparison (<, >, or =) of any two distinct columns with identical data types, precision, and scale. |
DATE |
An index on columns of data type DATE used to process queries involving date quantities. |
Datetime or DTTM |
An index on columns of data type DATETIME or TIMESTAMP used to process queries involving datetime quantities. |
High_Group or HG |
An enhanced B-tree index used to process equality and group by operations on high-cardinality data (recommended for more than 1,000 distinct values). |
High_Non_Group or HNG |
A non value-based bitmap index ideal for most high-cardinality decision support operations involving ranges or aggregates. |
Low_Fast or LF |
A value-based bitmap index for processing queries on low-cardinality data (recommended for up to 1,000 distinct values but can support up to 10,000 distinct values.) |
TIME |
An index on columns of data type TIME used to process queries involving time quantities. |
WD |
Used to index keywords by treating the contents of a CHAR, VARCHAR, or LONG VARCHAR column as a delimited list. |
TEXT |
Used to index terms (words) and their positions. Provides ability to search for individual terms, phrases, pairs of terms within specified distances, and given order, as well as combinations of these conditions. |
Select column indexes according to the type of data in the column and your intended operations for the column data. In general, you can use any index or combination of indexes on any column. However, there are some exceptions.
When a table is created with the DATE data type, an optimized two-byte FP index is created on the DATE field, which is independent of the settings in database option MINIMIZE_STORAGE.
If you want to create a three-byte FP or flat-style FP index on the DATE field, use the following IQ UNIQUE values when creating the table:
For a three-byte FP — IQ UNIQUE should be between 65537 and 16777216.
For flat-style FP — IQ UNIQUE should be higher than 16777216.
To take advantage of the High_Non_Group index types for columns with nonintegral numeric data, use the NUMERIC or DECIMAL data types, which support up to 254 digits of precision.
Some index types have duplicate functionality; creating unnecessary indexes wastes disk space. Read the sections that follow for details on how to select an index.
When a column is designated as FOREIGN KEY, PRIMARY KEY, or UNIQUE, Sybase IQ creates a High_Group index for it automatically. For each foreign key, Sybase IQ creates a non-unique High_Group index.
You can also create a High_Group index on a set of columns explicitly. For details, see CREATE INDEX statement in Reference: Statements and Options.
You may also want to define additional indexes on your columns for best performance. Sybase IQ uses the fastest index available for the current query or join predicate. If you do not create the correct types of indexes for a column, Sybase IQ can still resolve queries involving the column, but response may be slower than it would be with the correct index types.
If multiple indexes are defined on a particular column, Sybase IQ builds all the indexes for that column from the same input data.
If you set the INDEX_ADVISOR option on your database, Sybase IQ issues messages in the message log or query plan to suggest additional indexes that might improve performance. Messages focus on the following areas:
Local predicate columns
Single-column join key columns
Correlated subquery columns
Grouping columns
For details, see “INDEX_ADVISOR option” in Chapter 2, “Database Options,” of Reference: Statements and Options.
If you decide to follow the recommendations, you create the indexes yourself.
If you discover later that you need additional indexes, it is simple to add them; however, it is usually faster to create all necessary indexes before you insert any data.
You can only rename or alter an index in a base table or global temporary table with the owner type USER. See the “ALTER INDEX statement,” Chapter 1, “SQL Statements,” in Reference: Statements and Options for more information on renaming indexes and changing foreign key role names.
You can drop any optional index if you decide that you do not need it. See the DROP INDEX command in “DROP statement,” Chapter 1, “SQL Statements,” Reference: Statements and Options for more information on dropping indexes.
You may want to remove a foreign key constraint, but retain the underlying HG index. A non-unique HG index can provide query performance improvement, but may be expensive to build.
Note that ALTER TABLE DROP FOREIGN KEY CONSTRAINT does not remove the automatically-created non-unique HG index. You cannot drop a primary key if associated foreign keys remain. To remove such an index, drop it explicitly after issuing the ALTER TABLE DROP FOREIGN KEY command.