Text Configuration Object Properties window: Settings tab

This tab has the following components:

  • Term breaker   Select one of the following algorithms to use for separating column values into terms:

    • Generic   The Generic algorithm treats as a term any string of one or more alphanumerics, separated by non-alphanumerics.

    • N-gram   The N-gram algorithm breaks the strings into n-grams. An n-gram is an n-character substring of a larger string. N-grams are useful for approximate matching or for documents that do not use whitespace to separate terms.

    For more information about these algorithms and how to choose between them, see What to specify when creating or altering text configuration objects.

  • Minimum term length   Specifies the minimum length, in characters, of terms allowed in the text index. Terms that are shorter than this setting are ignored when building or refreshing the text index. See MINIMUM TERM LENGTH clause, ALTER TEXT CONFIGURATION statement.

  • Maximum term length   Specifies the maximum length, in characters, of terms allowed in the text index. Terms that are longer than this setting are ignored when building or refreshing the text index. See MAXIMUM TERM LENGTH clause, ALTER TEXT CONFIGURATION statement.

  • Use external term breaker   Select to specify an external library function to break the text into terms. This option appears when the Term Breaker is set to Generic. See TERM BREAKER clause, ALTER TEXT CONFIGURATION statement.

    • Function & library   Specifies the external term breaker function and library.

      The function and library must be specified in the form: function-name@library-file-name. If you want to use different function and library names on Windows and Unix platforms, you can specify them in the form: function-name@library-file-name.dll;UNIX:function-name@library-file-name.so.

      For example, TermBreakFunct1@myTBlib.dll;Unix:TermBreakFunct2@myTBlib calls TermBreakFunct1 on Windows and TermBreakFunct2 on Unix.

  • Use external prefilter   Select to specify an external library function to perform document filtering before term breaker processing. An external prefilter is useful if the text you want to index contains formatting information and/or images. A prefilter allows you to convert the document to plain text by removing formatting information and images. See PREFILTER EXTERNAL NAME clause, ALTER TEXT CONFIGURATION statement.

    • Function & library   Specifies the external prefilter function and library.

      The function and library must be specified in the form: function-name@library-file-name. If you want to use different function and library names on Windows and Unix platforms, you can specify them in the form: function-name@library-file-name.dll;UNIX:function-name@library-file-name.so.

      For example, PrefilterFunct1@myTBlib.dll;Unix:PrefilterFunct2@myPreFilterlib calls PrefilterFunct1 on Windows and PrefilterFunct2 on Unix.
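
The Term breaker and term length settings described above can be pictured with a small sketch. This is an illustration only, not the database server's implementation: the helper names generic_terms, ngrams, and filter_by_length are hypothetical, and the server's actual handling of punctuation, whitespace, and character sets inside n-grams is not shown here.

```python
import re


def generic_terms(text):
    """Generic-style breaking: each run of one or more alphanumerics is a term."""
    # [^\W_]+ matches letters and digits, treating "_" as a separator
    # (an assumption of this sketch).
    return re.findall(r"[^\W_]+", text)


def ngrams(text, n):
    """Return every n-character substring of text, left to right."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def filter_by_length(terms, min_len, max_len):
    """Drop terms shorter than min_len or longer than max_len."""
    return [t for t in terms if min_len <= len(t) <= max_len]


if __name__ == "__main__":
    print(generic_terms("we're out-of-date, v2"))
    print(ngrams("index", 3))
    print(filter_by_length(["a", "text", "configuration"], 2, 10))
```

In this sketch a string shorter than n yields no n-grams, and the length filter mirrors the rule that terms outside the minimum and maximum term length are ignored when the text index is built or refreshed.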

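The function-name@library-file-name form used by both Function & library fields can be read as follows. This is a minimal sketch that assumes entries are separated by semicolons and that a platform prefix such as Unix: is matched without regard to case. The names parse_external_name and resolve are hypothetical, and the server performs its own parsing.

```python
def parse_external_name(spec):
    """Map each platform to (function, library); unprefixed entries use "default"."""
    result = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        platform = "default"
        head, sep, rest = entry.partition(":")
        # Treat the text before ":" as a platform prefix only if it has no "@",
        # so a drive letter inside a library path is not mistaken for a prefix.
        if sep and "@" not in head:
            platform, entry = head.strip().lower(), rest.strip()
        function, at, library = entry.partition("@")
        if not at or not function or not library:
            raise ValueError(f"expected function-name@library-file-name, got {entry!r}")
        result[platform] = (function, library)
    return result


def resolve(spec, on_unix):
    """Pick the entry for the current platform, falling back to the default entry."""
    parsed = parse_external_name(spec)
    if on_unix and "unix" in parsed:
        return parsed["unix"]
    if "default" not in parsed:
        raise ValueError("no entry applies to this platform")
    return parsed["default"]


if __name__ == "__main__":
    spec = "TermBreakFunct1@myTBlib.dll;Unix:TermBreakFunct2@myTBlib"
    print(resolve(spec, on_unix=False))
    print(resolve(spec, on_unix=True))
```

With the term breaker example from above, the sketch selects TermBreakFunct1 in myTBlib.dll when on_unix is False and TermBreakFunct2 in myTBlib when it is True.
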
 See also