CONTAINS search condition

Syntax
CONTAINS ( column-name [,...], query-string )
query-string :
simple-expression
| binary-expression
simple-expression :
term
| " phrase "
| ( simple-expression )
| ( binary-expression )
| FUZZY " fuzzy-expression "
binary-expression :
simple-expression { AND | & } query-string
| simple-expression query-string
| simple-expression AND NOT query-string
| simple-expression { OR | | } query-string
| simple-expression { NEAR | ~ } query-string
| simple-expression NEAR [distance] query-string
term :
simple-term
| simple-term*
phrase :
simple-term
| simple-term*
| simple-term phrase
| simple-term* phrase
fuzzy-expression :
simple-term
| simple-term fuzzy-expression
simple-term : A string separated by whitespace and special characters that
       represents a single indexed term (word) to search for.
distance : a positive integer
Note

The use of the word binary in the syntax does not imply binary data. Instead, it means two expressions separated by an operator or proximity keyword.

Remarks

The CONTAINS search condition takes a column list and a query-string as arguments. It can be used anywhere a predicate can be used, and returns TRUE or FALSE. query-string must be a constant string or a variable whose value is known at query time.

If multiple columns are specified, then they must all refer to a single base table and must be present in a single text index. The base table can be referenced directly in the FROM clause, or it can occur in a view or derived table provided that the view or derived table does not use DISTINCT, GROUP BY, ORDER BY, UNION, INTERSECT, EXCEPT, or a row limitation.

In full text search, term is typically synonymous with word. For example, when you use the GENERIC term breaker algorithm (the default), a term is defined as a string of alphanumeric characters with a non-alphanumeric character on each side. However, a term does not need to be a word in any language, and the same value can be parsed into different terms depending on the term breaker algorithm in use. For more information about the term breaker algorithms, see Altering text configuration objects.
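
As a rough illustration of the GENERIC definition above, the following Python sketch extracts maximal runs of alphanumeric characters from a value. It is not the database server's implementation: the function name generic_terms is hypothetical, and treating only ASCII letters and digits as alphanumeric is a simplifying assumption.

```python
import re

# Assumption: ASCII letters and digits stand in for "alphanumeric".
# The real GENERIC term breaker may classify characters differently.
_ALNUM_RUN = re.compile(r"[A-Za-z0-9]+")


def generic_terms(text: str) -> list[str]:
    """Return each maximal run of alphanumeric characters, in order."""
    return _ALNUM_RUN.findall(text)


print(generic_terms("hasn't"))               # ['hasn', 't']
print(generic_terms("100% cotton T-shirt"))  # ['100', 'cotton', 'T', 'shirt']
```

The first call shows that a term does not have to be a language word: the apostrophe separates hasn't into hasn and t.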

Fuzzy matching is only supported when the term breaker is NGRAM.

The query-string of a CONTAINS search condition is parsed in several steps:

  1. The database server breaks query-string into query terms. Query terms are separated by whitespace, by special characters, and by keywords that have meaning to the CONTAINS grammar. For example, the query-string 'hasn't AND will' is separated into the query term hasn't, followed by the keyword AND, followed by the query term will. The apostrophe is not one of the special characters in the CONTAINS grammar, so hasn't remains a single query term at this stage.

    The complete list of special characters and keywords used by the CONTAINS grammar is as follows:

    • &
    • |
    • "
    • (
    • )
    • ~
    • [
    • ]
    • -
    • AND
    • OR
    • NOT
    • NEAR
    • FUZZY

  2. Each query term from Step 1 is broken into one or more index terms using the term breaking configuration of the index being searched. If a single query term breaks into multiple index terms, then it is treated as a phrase. For example, the GENERIC term breaker breaks the query term hasn't into two index terms: hasn and t. This is because the apostrophe is not an alphanumeric character, and the GENERIC term breaker separates terms at non-alphanumeric characters. These two index terms are then used exactly as if they had originally appeared in the query string as the two-term phrase "hasn t".

    When query terms are broken with the term breaker, the configured stoplist and the minimum and maximum term lengths are applied as usual, but dropping any term is considered an error. For example, if MINIMUM TERM LENGTH is set to 2, then the index term t from the example above is dropped. As a result, an attempt to use the query term hasn't returns an error.

    Consider another example that uses the NGRAM term breaker to break query terms into index terms. The NGRAM term breaker breaks strings into n-grams of length n, where n is the value of the MAXIMUM TERM LENGTH setting. With a 3-gram term breaker, the query term apple is broken into the following index terms: app, ppl, ple. These are treated the same as the phrase "app ppl ple".

  3. The CONTAINS grammar is used to parse the list of keywords, query terms, and special characters from the first step, with query terms replaced by the index terms and phrases generated by Step 2.
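
The first two parsing steps can be sketched in Python as follows. This is an illustrative model of the behavior described above, not the server's code. All function names are hypothetical, keyword matching is assumed to be case-insensitive, the default length limits are placeholder values, and the GENERIC-like breaker treats only ASCII letters and digits as alphanumeric. The sketch does not model quoted phrases, the prefix asterisk, or NGRAM terms shorter than n.

```python
import re

SPECIAL_CHARS = set('&|"()~[]-')
KEYWORDS = {"AND", "OR", "NOT", "NEAR", "FUZZY"}

# Either one special character, or a run of characters that are neither
# whitespace nor special. The apostrophe is not special, so it stays in a term.
_TOKEN = re.compile(r'[&|"()~\[\]-]|[^\s&|"()~\[\]-]+')


def lex_query(query: str) -> list[tuple[str, str]]:
    """Step 1: split query-string into special characters, keywords, and query terms."""
    tokens = []
    for piece in _TOKEN.findall(query):
        if piece in SPECIAL_CHARS:
            tokens.append(("SPECIAL", piece))
        elif piece.upper() in KEYWORDS:  # assumption: case-insensitive keywords
            tokens.append(("KEYWORD", piece.upper()))
        else:
            tokens.append(("TERM", piece))
    return tokens


def break_generic(query_term: str, min_len: int = 1, max_len: int = 20,
                  stoplist: frozenset = frozenset()) -> list[str]:
    """Step 2 with a GENERIC-like breaker; any dropped index term is an error."""
    index_terms = re.findall(r"[A-Za-z0-9]+", query_term)
    for term in index_terms:
        too_short_or_long = not (min_len <= len(term) <= max_len)
        if too_short_or_long or term.lower() in stoplist:
            raise ValueError(
                f"query term {query_term!r}: index term {term!r} would be dropped"
            )
    return index_terms


def break_ngram(query_term: str, n: int = 3) -> list[str]:
    """Step 2 with an NGRAM-like breaker: every window of n consecutive characters."""
    return [query_term[i:i + n] for i in range(len(query_term) - n + 1)]


def as_query_fragment(index_terms: list[str]) -> str:
    """A single index term stays a term; several index terms become a phrase."""
    if len(index_terms) == 1:
        return index_terms[0]
    return '"' + " ".join(index_terms) + '"'


print(lex_query("hasn't AND will"))
# [('TERM', "hasn't"), ('KEYWORD', 'AND'), ('TERM', 'will')]
print(as_query_fragment(break_generic("hasn't")))  # "hasn t"
print(as_query_fragment(break_ngram("apple")))     # "app ppl ple"
```

Calling break_generic("hasn't", min_len=2) raises ValueError in this sketch, which mirrors the rule that a query term is rejected when the term breaker would drop one of its index terms.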
See also