CONTAINS conditions

Perform a full text query using the CONTAINS clause in the FROM clause of a SELECT statement, or by using the CONTAINS search condition (predicate) in a WHERE clause. Both methods return the same rows; however, the CONTAINS clause also returns scores for the matching rows.

Syntax

CONTAINS ( column-name [,...], contains-query-string )
contains-query-string:
	simple-expression
	| or-expression
simple-expression:
	primary-expression
	| and-expression
or-expression:
	simple-expression { OR | | } contains-query-string
primary-expression:
	basic-expression
	| FUZZY " fuzzy-expression "
	| and-not-expression
and-expression:
	primary-expression [ AND | & ] simple-expression
and-not-expression:
	primary-expression [ AND | & ] { NOT | - } basic-expression
basic-expression:
	term
	| phrase
	| ( contains-query-string )
	| proximity-expression
fuzzy-expression:
	term
	| fuzzy-expression term
term:
	simple-term
	| prefix-term
prefix-term: 
	simple-term*
phrase:
	" phrase-string "
proximity-expression:
	term ( BEFORE | NEAR) [minimum distance, | maximum distance ]
 term
	| term { BEFORE | NEAR | ~ } term
phrase-string:
	term
	| phrase-string term

Parameters

Remarks

The CONTAINS search condition takes a column list and contains-query-string as arguments. It can be used anywhere a search condition (also referred to as predicate) can be specified, and returns TRUE or FALSE. contains-query-string must be a constant string, or a variable, with a value that is known at query time.

If multiple columns are specified, then they must all refer to a single base table; a TEXT index cannot span multiple base tables. You can reference the base directly in the FROM clause, or use it in a view or derived table, provided that the view or derived table does not use DISTINCT, GROUP BY, ORDER BY, UNION, INTERSECT, EXCEPT, or a row limitation.

Queries using ANSI join syntax are supported (FULL OUTER JOIN, RIGHT OUTER JOIN, LEFT OUTER JOIN), but may have suboptimal performance. Use outer joins for CONTAINS in the FROM clause only if the score column from each of the CONTAINS clauses is required. Otherwise, move CONTAINS to an ON condition or WHERE clause.

These types of queries are unsupported:

If you use a SQL variable less than 32KB in length as a search term and the type of variable is LONG VARCHAR, use CAST to convert the variable to VARCHAR data type. For example:

SELECT * FROM tab1 WHERE CONTAINS(c1, cast(v1 AS VARCHAR(64))

The following warnings apply to the use of non-alphanumeric characters in query strings:

Within phrases, the asterisk is the only special character that continues to be interpreted as a special character. All other special characters within phrases are treated as white space and serve as term breakers.

Interpretation of contains-query-string takes place in two main steps:

Operator precedence in a CONTAINS search condition

During query evaluation, expressions are evaluated using the following order of precedence:

  1. FUZZY, NEAR

  2. AND NOT

  3. AND

  4. OR

Allowed syntax for asterisk (*)

The asterisk is used for prefix searching. An asterisk can occur at the end of the query string, or be followed by a space, ampersand, vertical bar, closing bracket, or closing quotation mark. Any other usage of asterisk returns an error.

Table 4-1 shows allowable asterisk usage:

Table 4-1: Asterisk interpretations

Query string

Equivalent to

Interpreted as

'th*&best'

'th* AND best' and 'th* best'

Find any term beginning with th, and the term best.

'th*|best'

'th* OR best'

Find either any term beginning with th, or the term best.

'very&(best|th*)'

'very AND (best OR th*)'

Find the term very, and the term best or any term beginning with th.

'"fast auto*"'

Find the term fast, immediately followed by a term beginning with auto.

'"auto* price"'

Find a term beginning with auto, immediately followed by the term price.

NoteInterpretation of query strings containing asterisks varies depending on the text configuration object settings.

Allowed syntax for hyphen (-)

The hyphen can be used for term or expression negation, and is equivalent to NOT. Whether a hyphen is interpreted as a negation depends on its location in the query string. For example, when a hyphen immediately precedes a term or expression, it is interpreted as a negation. If the hyphen is embedded within a term, it is interpreted as a hyphen.

A hyphen used for negation must be preceded by a white space, and followed immediately by an expression.

When used in a phrase of a fuzzy expression, the hyphen is treated as white space and used as a term breaker.

Table 4-2 shows the allowed syntax for hyphen:

Table 4-2: Hyphen interpretations

Query string

Equivalent to

Interpreted as

'the -best'

'the AND NOT best', 'the AND -best', 'the & -best', 'the NOT best'

Find the term the, and not the term best.

'the -(very best)'

'the AND NOT (very AND best)'

Find the term the, and not the terms very and best.

'the -"very best"'

'the AND NOT "very best"'

Find the term the, and not the phrase very best.

'alpha-numerics'

'"alpha numerics"'

Find the term alpha, immediately followed by the term numerics.

'wild - west'

'wild west', and 'wild AND west'

Find the term wild, and the term west.

Allowed syntax for special characters

Table 4-3 shows the allowed syntax for all special characters except asterisk and hyphen. These characters are not considered special characters, if they are found in a phrase, and are dropped.

NoteThe restrictions on specifying string literals also apply to the query string. For example, apostrophes must be within an escape sequence.

Table 4-3: Special character interpretations

Character or syntax

Usage examples and remarks

ampersand (&)

The ampersand is equivalent to AND, and can be specified as follows:

  • 'a & b'

  • 'a &b'

  • 'a& b'

  • 'a&b'

vertical bar (|)

The vertical bar is equivalent to OR, and can be specified as follows:

  • 'a| b'

  • 'a |b'

  • 'a | b

  • 'a|b'

double quotes (")

Double quotes are used to contain a sequence of terms where order and relative distance are important. For example, in the query string 'learn "full text search"', "full text search" is a phrase. In this example, learn can come before or after the phrase, or exist in another column (if the TEXT index is built on more than one column), but the exact phrase must be found in a single column.

parentheses ()

Parentheses are used to specify the order of evaluation of expressions, if different from the default order. For example, 'a AND (b|c)' is interpreted as a, and b or c.

tilde (~)

The tilde is equivalent to NEAR, and has no special syntax rules. The query string 'full~text' is equivalent to 'full NEAR text', and is interpreted as: the term full within ten terms of the term text.

square brackets [ ]

Square brackets are used in conjunction with the keyword NEAR to contain distance. Other uses of square brackets return an error.


Effect of dropped terms

TEXT indexes are built according to the settings defined for the text configuration object used to create the TEXT index. ATEXT index excludes terms that meet any of the following conditions:

The same rules apply to query strings. The dropped term can match zero or more terms at the end or beginning of the phrase. For example, suppose the term 'the' is in the stop list:

NoteIf all of the terms for which you are searching are dropped, Sybase IQ returns the error CONTAINS has NULL search term. SQL Anywhere reports no error and returns zero rows.


CONTAINS table expressions

When you include a CONTAINS clause in the FROM clause of a query, each match has a score associated with it. The score indicates how close the match is, and you can use score information to sort the data. Two main criteria determine score:

Depending on the type of full text search, other criteria affect scoring. For example, in proximity searches, the proximity of search terms impacts scoring. By default, the result set of a CONTAINS clause has the correlation name contains that has a single column in it called score. You can refer to "contains".score in the SELECT list, ORDER BY clause, or other parts of the query. However, because contains is a SQL reserved word, you must remember to put it in double quotes. Alternatively, you can specify another correlation name, for example, CONTAINS ( expression ) AS ct. The examples for full text search refer to the score column as ct.score.

This statement searches MarketingInformation.Description for terms starting with ‘stretch” or terms starting with “comfort”:

SELECT ID, ct.score, Description
FROM MarketingInformation
CONTAINS ( MarketingInformation.Description,
           'stretch* | comfort*' )
AS ct ORDER BY ct.score DESC;

NoteThe SQL Anywhere documentation provides full text search examples. Not all of these examples apply to Sybase IQ 15.2. For example, Sybase IQ does not support text searches that are part of the IF search condition and does not allow fuzzy or NGRAM searches.

See also

”Types of full text searches”

in SQL Anywhere Server - SQL Usage.