Alters a text configuration object.
ALTER TEXT CONFIGURATION [ owner.]config-name
{ STOPLIST stoplist-string
| DROP STOPLIST
| { MINIMUM | MAXIMUM } TERM LENGTH integer
| TERM BREAKER { GENERIC [ EXTERNAL NAME external-call ] | NGRAM }
| PREFILTER EXTERNAL NAME external-call
| DROP PREFILTER
| SAVE OPTION VALUES [ FROM CONNECTION ] }
external-call : '[ operating-system: ]library-function-name@library-name[;...]'
operating-system : UNIX
STOPLIST clause
Use this clause to create or replace the list of terms to ignore when building a text index. Terms in this list are also ignored in queries that use this text configuration object. Separate stoplist terms with spaces; for example, STOPLIST 'because about therefore only'. Stoplist terms cannot contain whitespace.
Samples of stoplists for different languages are located in the samples-dir\SQLAnywhere\SQL subdirectory. For the location of samples-dir, see Samples directory.
Stoplist terms should not contain non-alphanumeric characters. The stoplist length must be less than 8000 bytes.
Carefully consider whether you want to put terms in your stoplist. For more information, see Text configuration object settings.
DROP STOPLIST clause Use this clause to drop the stoplist for a text configuration object.
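As a brief illustration of the two stoplist clauses, the following sketch sets a stoplist and then removes it again. It assumes the maxTerm16 text configuration object created in the examples on this page already exists and that no text indexes currently depend on it.

```sql
-- Create or replace the stoplist (terms are separated by spaces).
ALTER TEXT CONFIGURATION maxTerm16 STOPLIST 'because about therefore only';

-- Remove the stoplist so that no terms are ignored on its account.
ALTER TEXT CONFIGURATION maxTerm16 DROP STOPLIST;
```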
MINIMUM TERM LENGTH clause Use this clause to set the minimum length, in characters, of a term to include in the text index. Terms that are shorter than this setting are ignored when building or refreshing the text index. The value of this option must be greater than 0. If you set this option to be higher than MAXIMUM TERM LENGTH, the value of MAXIMUM TERM LENGTH is automatically adjusted to be the same as the new MINIMUM TERM LENGTH value.
The value specified in the MINIMUM TERM LENGTH clause is ignored when using NGRAM text indexes.
MAXIMUM TERM LENGTH clause With NGRAM text indexes, use the MAXIMUM TERM LENGTH clause to set the size of the n-grams into which strings are broken.
With GENERIC text indexes, use the MAXIMUM TERM LENGTH clause to set the maximum length, in characters, of a term to include in the text index. Terms that are longer than this setting are ignored when building or refreshing the text index. The value of MAXIMUM TERM LENGTH must be less than or equal to 60. If you set this option to be lower than MINIMUM TERM LENGTH, the value of MINIMUM TERM LENGTH is automatically adjusted to be the same as the new MAXIMUM TERM LENGTH value.
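The interaction between the two term length settings can be sketched as follows. The object name termLenDemo is hypothetical, and the comments restate the adjustment rule described above rather than output captured from a server.

```sql
-- termLenDemo is a hypothetical name; default_char is the configuration
-- that the examples on this page also copy from.
CREATE TEXT CONFIGURATION termLenDemo FROM default_char;

-- Index only terms that are 3 to 16 characters long.
ALTER TEXT CONFIGURATION termLenDemo MINIMUM TERM LENGTH 3;
ALTER TEXT CONFIGURATION termLenDemo MAXIMUM TERM LENGTH 16;

-- Raising the minimum above the current maximum (16) should also raise
-- MAXIMUM TERM LENGTH to 20, according to the rule described above.
ALTER TEXT CONFIGURATION termLenDemo MINIMUM TERM LENGTH 20;
```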
TERM BREAKER clause The name of the algorithm to use for separating column values into terms. The choices are GENERIC (the default) or NGRAM.
GENERIC For GENERIC, you can use the built-in GENERIC term breaker algorithm by specifying TERM BREAKER GENERIC, or you can specify an external algorithm using the TERM BREAKER GENERIC EXTERNAL NAME clause.
The built-in GENERIC algorithm treats any string of one or more alphanumerics, separated by non-alphanumerics, as a term.
Use the TERM BREAKER GENERIC EXTERNAL NAME clause to specify an entry point to a term breaker function in an external library. This is useful if you have custom requirements for how terms are broken up before they are indexed or queried (for example, if you want an apostrophe to be treated as part of a term and not as a term breaker).
external-call can specify more than one function and/or library, and can include the file extension of the library, which is typically .dll on Windows, and .so on Unix. In the absence of the file extension, the database server defaults to the platform-specific file extension for libraries.
For example, EXTERNAL NAME 'TermBreakFunct1@myTBlib;Unix:TermBreakFunct2@myTBlib' calls the TermBreakFunct1 function from myTBlib.dll on Windows, and the TermBreakFunct2 function from myTBlib.so on Unix.
NGRAM The built-in NGRAM algorithm breaks strings into n-grams. An n-gram is an n-character substring of a larger string. The NGRAM term breaker is required for fuzzy (approximate) matching, or for documents that do not use whitespace or non-alphanumeric characters to separate terms, if no external term breaker is specified. For more information about these algorithms and how to choose between them, see Text configuration object settings.
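A minimal sketch of switching a configuration to n-grams follows. The object name ngramDemo is hypothetical. The final query assumes that the sa_char_terms system procedure is available in your version and accepts a string followed by a text configuration name; check the system procedure reference before relying on it.

```sql
-- ngramDemo is a hypothetical name.
CREATE TEXT CONFIGURATION ngramDemo FROM default_char;

-- Switch from the default GENERIC term breaker to NGRAM.
ALTER TEXT CONFIGURATION ngramDemo TERM BREAKER NGRAM;

-- With NGRAM, MAXIMUM TERM LENGTH sets the n-gram size (here, 3 characters).
ALTER TEXT CONFIGURATION ngramDemo MAXIMUM TERM LENGTH 3;

-- Assumption: sa_char_terms lists the terms that the configuration
-- produces for a given string, which helps when checking the settings.
SELECT * FROM sa_char_terms( 'database server', 'ngramDemo' );
```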
PREFILTER EXTERNAL NAME clause Use the PREFILTER EXTERNAL NAME clause to specify an entry point to a prefilter function in an external library. This is useful if text data needs to be extracted from binary data (for example, PDF). It is also useful if the text you want to index contains formatting information and/or images that you want to strip out before indexing the data (for example, HTML).
external-call can specify more than one function and/or library, and can include the file extension of the library, which is typically .dll on Windows, and .so on Unix. In the absence of the file extension, the database server defaults to the platform-specific file extension for libraries.
For example, PREFILTER EXTERNAL NAME 'PrefilterFunct1@myPreFilterlib;Unix:PrefilterFunct2@myPreFilterlib' calls the PrefilterFunct1 function from myPreFilterlib.dll on Windows, and the PrefilterFunct2 function from myPreFilterlib.so on Unix.
DROP PREFILTER clause Use the DROP PREFILTER clause to drop use of the specified prefiltering library for the text configuration object. This means that prefiltering is no longer performed when the database server builds indexes that use this text configuration object.
SAVE OPTION VALUES clause When a text configuration object is created, the current date_format, time_format, timestamp_format, and timestamp_with_time_zone_format database options reflect how DATE, TIME, and TIMESTAMP columns are saved with the text configuration object. Use the SAVE OPTION VALUES clause to update the option values saved for the text configuration object to reflect the options currently in effect for the connection. See How to alter a text configuration object.
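For instance, to have a text configuration object pick up a date format changed on the current connection, a sequence along these lines could be used. The format string is only an illustration, and myTextConfig is the object name used in the examples on this page.

```sql
-- Change the date format for the current connection only.
SET TEMPORARY OPTION date_format = 'YYYY-MM-DD';

-- Save the option values now in effect for this connection
-- with the text configuration object.
ALTER TEXT CONFIGURATION myTextConfig SAVE OPTION VALUES FROM CONNECTION;
```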
Before changing the term length settings, read about the impact of various settings on what gets indexed and how query terms are interpreted. See Text configuration object settings, and Example text configuration objects.
Text indexes are dependent on a text configuration object. Before using this statement, you must truncate any dependent AUTO REFRESH or MANUAL REFRESH text indexes, and drop any dependent IMMEDIATE REFRESH text indexes.
To determine the text indexes that refer to a text configuration object, see How to view text index info in the database.
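The overall sequence might look like the following sketch. All index, table, and column names (txt_manual, txt_immediate, Articles, Comments, body) are hypothetical, and the sketch assumes that both indexes were created with the myTextConfig configuration.

```sql
-- 1. Empty the MANUAL (or AUTO) REFRESH text index and drop the
--    IMMEDIATE REFRESH text index that depend on myTextConfig.
TRUNCATE TEXT INDEX txt_manual ON Articles;
DROP TEXT INDEX txt_immediate ON Comments;

-- 2. Alter the text configuration object.
ALTER TEXT CONFIGURATION myTextConfig MAXIMUM TERM LENGTH 16;

-- 3. Repopulate the truncated index and recreate the dropped one.
REFRESH TEXT INDEX txt_manual ON Articles;
CREATE TEXT INDEX txt_immediate ON Comments ( body )
    CONFIGURATION myTextConfig IMMEDIATE REFRESH;
```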
To view the settings for text configuration objects, query the SYSTEXTCONFIG system view. See SYSTEXTCONFIG system view.
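A query along the following lines can be used to confirm the result of an ALTER TEXT CONFIGURATION statement. The column name text_config_name is an assumption about the layout of the view; if it differs in your version, select all rows without the WHERE clause and inspect the result.

```sql
-- Show the stored settings for one text configuration object.
-- text_config_name is an assumed column name; verify it against
-- the SYSTEXTCONFIG system view documentation.
SELECT *
FROM SYSTEXTCONFIG
WHERE text_config_name = 'maxTerm16';
```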
Changing or dropping the external prefilter or term breaker requires DBA authority; being the owner of the text configuration object is not enough.
For all other changes, you must either be the owner of the text configuration object or have DBA authority.
Side effects: Automatic commit.
Standards: SQL/2008 vendor extension.
The following statements create a text configuration object, maxTerm16, and then change the maximum term length to 16:
CREATE TEXT CONFIGURATION maxTerm16 FROM default_char;
ALTER TEXT CONFIGURATION maxTerm16 MAXIMUM TERM LENGTH 16;
The following statement adds a stoplist to the maxTerm16 configuration object:
ALTER TEXT CONFIGURATION maxTerm16 STOPLIST 'because about therefore only';
The following statement configures an external term breaker for the myTextConfig text configuration object. Both the Windows and Unix interfaces are specified.
ALTER TEXT CONFIGURATION myTextConfig TERM BREAKER GENERIC EXTERNAL NAME 'my_termbreaker@termbreaker.dll;Unix:my_termbreaker@libtermbreaker_r.so';
The following example configures an external prefilter for the myTextConfig text configuration object. Both the Windows and Unix interfaces are specified.
ALTER TEXT CONFIGURATION myTextConfig PREFILTER EXTERNAL NAME 'html_xml_filter@html_xml_filter.dll;UNIX:html_xml_filter@libhtml_xml_filter_r.so';
The following example drops the external prefilter for the myTextConfig text configuration object.
ALTER TEXT CONFIGURATION myTextConfig DROP PREFILTER;
Copyright © 2010, iAnywhere Solutions, Inc. - SQL Anywhere 12.0.0