Example text configuration objects

For in-depth descriptions of text configuration object settings and how they impact the contents of a text index and the results returned when querying a text index, see Text configuration object settings.

For a list of all text configuration objects in the database and the settings they contain, query the SYSTEXTCONFIG system view (for example, SELECT * FROM SYSTEXTCONFIG). See SYSTEXTCONFIG system view.

Default text configuration objects

SQL Anywhere provides two default text configuration objects, default_nchar and default_char, for use with NCHAR and non-NCHAR data, respectively. These objects are created the first time you attempt to create a text configuration object or text index.

The settings for default_char and default_nchar at the time of installation are shown in the table below. These settings were chosen because they are best suited to most character-based languages. It is strongly recommended that you do not change the settings in the default text configuration objects.

Setting               Installed value
TERM BREAKER          0 (GENERIC)
MINIMUM TERM LENGTH   1
MAXIMUM TERM LENGTH   20
STOPLIST              (empty)

If you delete a default text configuration object, it is automatically recreated the next time you create a text index or text configuration object. See DROP TEXT CONFIGURATION statement.
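
Because changing the default objects is discouraged, one approach is to copy a default object and adjust the copy instead. The following is a minimal sketch; the name my_text_config is a hypothetical example, and the settings chosen are for illustration only.

```sql
-- Copy the installed defaults into a new object.
-- my_text_config is a hypothetical name used only for this example.
CREATE TEXT CONFIGURATION my_text_config FROM default_char;

-- Adjust the copy; default_char itself is left untouched.
ALTER TEXT CONFIGURATION my_text_config MINIMUM TERM LENGTH 2;
ALTER TEXT CONFIGURATION my_text_config STOPLIST 'not and';

-- Review the settings of all text configuration objects, including the copy.
SELECT * FROM SYSTEXTCONFIG;
```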

Example text configuration objects

The following table shows the settings for different text configuration objects and how the settings impact what is indexed and how a full text query string is interpreted. All the examples use the string 'I'm not sure I understand'.

Configuration settings: TERM BREAKER GENERIC, MINIMUM TERM LENGTH 1, MAXIMUM TERM LENGTH 20, STOPLIST ''
Terms that are indexed: I m not sure I understand
Query interpretation: '"I m" AND not AND sure AND I AND understand'

Configuration settings: TERM BREAKER GENERIC, MINIMUM TERM LENGTH 2, MAXIMUM TERM LENGTH 20, STOPLIST 'not and'
Terms that are indexed: sure understand
Query interpretation: 'sure AND understand'

Configuration settings: TERM BREAKER NGRAM, MAXIMUM TERM LENGTH 3, STOPLIST 'not and'
Terms that are indexed: sur ure und nde der ers rst sta tan
Query interpretation: 'sur AND ure AND und AND nde AND der AND ers AND rst AND sta AND tan'
In the case of a fuzzy search: 'sur OR ure OR und OR nde OR der OR ers OR rst OR sta OR tan'

Configuration settings: TERM BREAKER GENERIC, MINIMUM TERM LENGTH 1, MAXIMUM TERM LENGTH 20, STOPLIST 'not and'
Terms that are indexed: I m sure I understand
Query interpretation: '"I m" AND sure AND I AND understand'

Configuration settings: TERM BREAKER NGRAM, MAXIMUM TERM LENGTH 20, STOPLIST 'not and'
Terms that are indexed: Nothing is indexed because no term is equal to or longer than 20 characters. This illustrates how differently MAXIMUM TERM LENGTH affects GENERIC and NGRAM text indexes: on NGRAM text indexes, MAXIMUM TERM LENGTH sets the length of the n-grams inserted into the text index.
Query interpretation: The search returns an empty result set because no n-grams of 20 characters can be formed from the query string.
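
To experiment with settings like these, you can build a text configuration object and inspect how it breaks a string into terms. The following is a minimal sketch of the NGRAM configuration with MAXIMUM TERM LENGTH 3; the name ngram_config is a hypothetical example. It assumes the sa_char_terms system procedure, which returns the terms and positions that a given text configuration object produces for a CHAR string. The rows it returns are not guaranteed to match the indexed terms listed here exactly, for example in how stoplist terms are treated.

```sql
-- ngram_config is a hypothetical name used only for this illustration.
CREATE TEXT CONFIGURATION ngram_config FROM default_char;

-- Switch to n-gram term breaking; on NGRAM configurations,
-- MAXIMUM TERM LENGTH sets the length of each n-gram.
ALTER TEXT CONFIGURATION ngram_config TERM BREAKER NGRAM;
ALTER TEXT CONFIGURATION ngram_config MAXIMUM TERM LENGTH 3;
ALTER TEXT CONFIGURATION ngram_config STOPLIST 'not and';

-- Break the example string into terms under this configuration.
-- The apostrophe is doubled to escape it inside the SQL string literal.
SELECT * FROM sa_char_terms( 'I''m not sure I understand', 'ngram_config' );
```

Setting MAXIMUM TERM LENGTH to 20 in the same sketch corresponds to the case where nothing is indexed, because no word in the example string is long enough to form a 20-character n-gram.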

Example string interpretations

The following table provides examples of how strings are interpreted under different text configuration object settings.

The parenthetical numbers in the Interpreted string column reflect the position information stored for each term. The numbers are for illustration purposes in the documentation. The actual stored terms do not include the parenthetical numbers.

Configuration settings: TERM BREAKER GENERIC, MINIMUM TERM LENGTH 3, MAXIMUM TERM LENGTH 20

String                                            Interpreted string
'w*'                                              '"w*(1)"'
'we*'                                             '"we*(1)"'
'wea*'                                            '"wea*(1)"'
'we* -the'                                        '"we*(1)" -"the(1)"'
'we* the'                                         '"we*(1)" & "the(1)"'
'for* | wonderl*'                                 '"for*(1)" | "wonderl*(1)"'
'wonderlandwonderlandwonderland*'                 ''
'"tr* weather"'                                   '"weather(1)"'
'"tr* the weather"'                               '"the(1) weather(2)"'
'"wonderlandwonderlandwonderland* wonderland"'    '"wonderland(1)"'
'"wonderlandwonderlandwonderland* weather"'       '"weather(1)"'
'"the_wonderlandwonderlandwonderland* weather"'   '"the(1) weather(3)"'
'the_wonderlandwonderlandwonderland* weather'     '"the(1)" & "weather(1)"'
'"light_a* the end" & tunnel'                     '"light(1) the(3) end(4)" & "tunnel(1)"'
'"light_b* the end" & tunnel'                     '"light(1) the(3) end(4)" & "tunnel(1)"'
'"light_at_b* end"'                               '"light(1) end(4)"'
'and-te*'                                         '"and(1) te*(2)"'
'a_long_and_t* & journey'                         '"long(2) and(3) t*(4)" & "journey(1)"'

Configuration settings: TERM BREAKER NGRAM, MAXIMUM TERM LENGTH 3

String                                  Interpreted string
'w*'                                    '"w*(1)"'
'we*'                                   '"we*(1)"'
'wea*'                                  '"wea(1)"'
'we* -the'                              '"we*(1)" -"the(1)"'
'we* the'                               '"we*(1)" & "the(1)"'
'for | la*'                             '"for(1)" | "la*(1)"'
'weath*'                                '"wea(1) eat(2) ath(3)"'
'"ful weat*"'                           '"ful(1) wea(2) eat(3)"'
'"wo* la*"'                             '"wo*(1)" & "la*(2)"'
'"la* won* "'                           '"la*(1)" & "won(2)"'
'"won* weat*"'                          '"won(1)" & "wea(2) eat(3)"'
'"won* weat"'                           '"won(1)" & "wea(2) eat(3)"'
'"wo* weat*"'                           '"wo*(1)" & "wea(2) eat(3)"'
'"weat* wo* "'                          '"wea(1) eat(2)" & "wo*(3)"'
'"wo* weat"'                            '"wo*(1)" & "wea(2) eat(3)"'
'"weat wo* "'                           '"wea(1) eat(2) wo*(3)"'
'w* NEAR[1] f*'                         '"w*(1)" & "f*(1)"'
'weat* NEAR[1] f*'                      '"wea(1) eat(2)" & "f*(1)"'
'f* NEAR[1] weat*'                      '"f*(1)" & "wea(1) eat(2)"'
'weat NEAR[1] f*'                       '"wea(1) eat(2)" & "f*(1)"'
'f* NEAR[1] weat'                       '"f*(1)" & "wea(1) eat(2)"'
'for NEAR[1] weat*'                     '"for(1)" & "wea(1) eat(2)"'
'weat* NEAR[1] for'                     '"wea(1) eat(2)" & "for(1)"'
'and_tedi*'                             '"and(1) ted(2) edi(3)"'
'and-t*'                                '"and(1) t*(2)"'
'"and_tedi*"'                           '"and(1) ted(2) edi(3)"'
'"and-t*"'                              '"and(1) t*(2)"'
'"ligh* at_the_end of_the tun* nel"'    '"lig(1) igh(2)" & ("the(4) end(5) the(7) tun(8)" & "nel(9)")'
'"ligh* at_the_end_of_the_tun* nel"'    '"lig(1) igh(2)" & ("the(4) end(5) the(7) tun(8)" & "nel(9)")'
'"at_the_end of_the tun* ligh* nel"'    '"the(2) end(3) the(5) tun(6)" & ("lig(7) igh(8)" & "nel(9)")'
'l* NEAR[1] and_t*'                     '"l*(1)" & "and(1) t*(2)"'
'long NEAR[1] and_t*'                   '"lon(1) ong(2)" & "and(1) t*(2)"'
'end NEAR[3] tunne*'                    '"end(1)" & "tun(1) unn(2) nne(3)"'

Configuration settings: TERM BREAKER NGRAM, MAXIMUM TERM LENGTH 3, SKIPPED TOKENS IN TABLE AND IN QUERIES

String              Interpreted string
'"cat in a hat"'    '"cat(1) hat(4)"'
'"cat in_a hat"'    '"cat(1) hat(4)"'
'"cat_in_a_hat"'    '"cat(1) hat(4)"'
'"cat_in a_hat"'    '"cat(1) hat(4)"'
'cat in a hat'      '"cat(1)" & "hat(1)"'
'cat in_a hat'      '"cat(1)" & "hat(1)"'
'"ice hat"'         '"ice(1) hat(2)"'
'ice NEAR[1] hat'   '"ice(1)" NEAR[1] "hat(1)"'
'ear NEAR[2] hat'   '"ear(1)" NEAR[2] "hat(1)"'
'"ear a hat"'       '"ear(1) hat(3)"'
'"cat hat"'         '"cat(1) hat(2)"'
'cat NEAR[1] hat'   '"cat(1)" NEAR[1] "hat(1)"'
'ear NEAR[1] hat'   '"ear(1)" NEAR[1] "hat(1)"'
'"ear hat"'         '"ear(1) hat(2)"'
'"wear a a hat"'    '"wea(1) ear(2) hat(5)"'
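
Query strings such as these are passed to a CONTAINS search condition against a column that has a text index. The following is a minimal sketch of that setup for the GENERIC configuration with MINIMUM TERM LENGTH 3. The table notes, the index notes_body_idx, and the configuration generic_min3 are hypothetical names used only for illustration, and the refresh type is an arbitrary choice.

```sql
-- Hypothetical table, configuration, and index names; adjust them to your schema.
CREATE TABLE notes (
    id   INTEGER PRIMARY KEY,
    body LONG VARCHAR
);

-- GENERIC term breaker (inherited from default_char) with MINIMUM TERM LENGTH 3.
CREATE TEXT CONFIGURATION generic_min3 FROM default_char;
ALTER TEXT CONFIGURATION generic_min3 MINIMUM TERM LENGTH 3;

-- Build the text index on the column using that configuration.
CREATE TEXT INDEX notes_body_idx ON notes ( body )
    CONFIGURATION generic_min3
    IMMEDIATE REFRESH;

-- A prefix term combined with an ordinary term.
SELECT id FROM notes WHERE CONTAINS ( body, 'we* the' );

-- A phrase containing a prefix term, combined with a second term.
SELECT id FROM notes WHERE CONTAINS ( body, '"light_a* the end" & tunnel' );
```

The same query string can be interpreted differently under another configuration, so the text configuration object used by the index determines which of the interpretations applies.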

See also