Using multibyte collations

This section describes how Sybase IQ handles multibyte character sets. The information applies to all supported multibyte collations.

Sybase IQ provides collations using several multibyte character sets.

Sybase IQ supports variable-width character sets. In these sets, some characters are represented by one byte, and some by more than one, to a maximum of four bytes. The value of the first byte in any character indicates the number of bytes used for that character, and also indicates whether the character is a space character, a digit, or an alphabetic (alpha) character.

For the UTF8 collation, UTF-8 characters are represented by one to four bytes. For other multibyte collations, one or two bytes are used. For all provided multibyte collations, characters comprising two or more bytes are considered to be “alphabetic”, such that they can be used in identifiers without requiring double quotes.
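As an illustration of how a first byte can determine character length, the following Python sketch applies the standard UTF-8 lead-byte ranges. It is not Sybase IQ's internal implementation, and it does not model the space/digit/alpha classification. The function names utf8_char_length and split_utf8 are hypothetical.

```python
def utf8_char_length(first_byte: int) -> int:
    """Return the byte length of a UTF-8 character, given its first byte."""
    if first_byte < 0x80:            # 0xxxxxxx: single-byte (ASCII)
        return 1
    if 0xC2 <= first_byte <= 0xDF:   # 110xxxxx: two-byte character
        return 2
    if 0xE0 <= first_byte <= 0xEF:   # 1110xxxx: three-byte character
        return 3
    if 0xF0 <= first_byte <= 0xF4:   # 11110xxx: four-byte character
        return 4
    # Continuation bytes (0x80-0xBF) and 0xC0, 0xC1, 0xF5-0xFF cannot start a character.
    raise ValueError(f"0x{first_byte:02X} is not a valid UTF-8 first byte")


def split_utf8(data: bytes) -> list:
    """Split well-formed UTF-8 bytes into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        n = utf8_char_length(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars
```

For example, split_utf8 applied to the UTF-8 encoding of an ASCII letter followed by a Thai character yields a one-byte item and a three-byte item.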

Sybase IQ does not support 16-bit or 32-bit character sets such as UTF-16 or UTF-32.

All client libraries other than embedded SQL are Unicode-enabled, using the UTF-16 encoding. Translation occurs between the client and the server.
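The kind of translation involved can be sketched in Python, assuming a client that holds text as UTF-16 (little-endian here) and a database that uses the UTF8 collation. The real conversion is performed by the client libraries and the server; the function names below are hypothetical.

```python
def client_to_server(utf16_bytes: bytes) -> bytes:
    """Re-encode UTF-16LE text from the client as UTF-8 for the server."""
    return utf16_bytes.decode("utf-16-le").encode("utf-8")


def server_to_client(utf8_bytes: bytes) -> bytes:
    """Re-encode UTF-8 text from the server as UTF-16LE for the client."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


# The Thai character U+0E01 occupies 2 bytes in UTF-16 and 3 bytes in UTF-8.
client_value = "\u0e01".encode("utf-16-le")
server_value = client_to_server(client_value)
```

The character itself is unchanged by the round trip; only its byte representation differs on each side.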

Japanese language support

Sybase recommends using collation 932JPN for Japanese Windows applications. Collation 932JPN supports loading 32-bit multibyte characters that cannot be loaded into SJIS or SJIS2. SJIS and SJIS2 are older collations. SJIS is available as an alternate collation. SJIS2 is no longer supported. For Unix applications, use EUC_JAPAN.

Thai language support

Sybase IQ provides the CP874toUTF8 utility to convert data files in CP874 format into UTF8, a collation supported by Sybase IQ for the Thai language. For syntax, see the Utility Guide. You can also load data in the CP874 character set directly, without first converting it to UTF8 with this utility.
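To show what a CP874-to-UTF8 conversion does to the bytes, here is a minimal Python sketch that uses Python's built-in cp874 codec. It is not the CP874toUTF8 utility and makes no claim about that utility's options or behavior; the function name cp874_to_utf8 is hypothetical.

```python
def cp874_to_utf8(data: bytes) -> bytes:
    """Re-encode CP874 (Thai) bytes as UTF-8."""
    return data.decode("cp874").encode("utf-8")


# In CP874 the Thai character U+0E01 is the single byte 0xA1;
# in UTF-8 the same character takes three bytes.
converted = cp874_to_utf8(b"\xa1")
```

ASCII bytes pass through unchanged, while each Thai character grows from one byte to three, so a converted file can be considerably larger than the original.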

The SORTKEY() function returns values in the thaidict (Thai dictionary) sort order for Thai characters in UTF8 form. The following statements generate the same result:

SELECT c1, SORTKEY(c1) FROM T1 WHERE rid=3
SELECT c1, SORTKEY(c1, 'thaidict') FROM T1 WHERE rid=3
SELECT '\340\270\201\340\271\207', SORTKEY('\340\270\201\340\271\207') FROM T1 WHERE rid=3
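A string literal such as \340\270\201\340\271\207 can be read as a sequence of UTF-8 bytes, assuming each backslash sequence is a three-digit octal byte value. The Python sketch below decodes such a literal to show which characters it represents. The function name octal_escapes_to_bytes is hypothetical, and the sketch handles only literals made up entirely of such escapes.

```python
import re


def octal_escapes_to_bytes(literal: str) -> bytes:
    """Convert a string of three-digit octal escapes (\\ooo) to raw bytes."""
    return bytes(int(digits, 8) for digits in re.findall(r"\\([0-7]{3})", literal))


raw = octal_escapes_to_bytes(r"\340\270\201\340\271\207")
text = raw.decode("utf-8")  # two Thai characters: U+0E01 followed by U+0E47
```

Each group of three escapes forms one three-byte UTF-8 character, which matches the one-to-four-byte representation used by the UTF8 collation.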

For more details, see “SORTKEY function [String]” in Chapter 4, “SQL Functions,” in Reference: Building Blocks, Tables, and Procedures.