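The UTF-16 encoding of Unicode includes “surrogate pairs,” which are pairs of 16-bit code units that together represent a single character outside the Basic Multilingual Plane (code points U+10000 and above). Such characters are used relatively infrequently.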
SAP ASE performs additional checking to ensure the integrity of surrogate pairs. You can switch this checking off by setting the enable surrogate processing configuration parameter to 0. Doing so yields slightly higher performance, but the integrity of surrogate pairs is no longer guaranteed.
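To make the idea of surrogate-pair integrity concrete, the following Python sketch encodes a supplementary character as UTF-16 and checks that every high surrogate is immediately followed by a low surrogate. It illustrates the concept only and is not SAP ASE's implementation; the helper names utf16_units and has_valid_surrogates are hypothetical.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def has_valid_surrogates(units):
    """Return True if every surrogate in units belongs to a well-formed pair.

    A well-formed pair is a high surrogate (0xD800-0xDBFF) immediately
    followed by a low surrogate (0xDC00-0xDFFF).
    """
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                return False
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate is invalid.
            return False
        else:
            i += 1
    return True


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane,
# so UTF-16 represents it with two code units.
clef_units = utf16_units("\U0001D11E")
print([hex(u) for u in clef_units])       # ['0xd834', '0xdd1e']
print(has_valid_surrogates(clef_units))   # True
print(has_valid_surrogates([0xD834]))     # False: truncated pair
```

A check of this kind has to inspect every code unit, which suggests why skipping it can save a small amount of work.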
Unicode also defines “normalization,” which is the process by which all possible representations of a single character are transformed into a single representation. Many base characters followed by combining diacritical marks are equivalent to precomposed characters, although their bit patterns are different. For example, the following two sequences are equivalent:
0x00E9 -- é (LATIN SMALL LETTER E WITH ACUTE)
0x00650301 -- e (LATIN SMALL LETTER E), ´ (COMBINING ACUTE ACCENT)
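The equivalence of these two sequences can be demonstrated with Python's standard unicodedata module. This is a minimal sketch of normalization in general; the source does not state which normalization form SAP ASE applies, so both the composed (NFC) and decomposed (NFD) forms are shown, and the helper name equal_after_normalization is hypothetical.

```python
import unicodedata

precomposed = "\u00E9"        # é as a single code point
decomposed = "\u0065\u0301"   # e followed by COMBINING ACUTE ACCENT


def equal_after_normalization(a, b, form="NFC"):
    """Compare two strings after converting both to the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Without normalization the two strings differ, code point by code point.
print(precomposed == decomposed)                             # False

# After normalization to either form, they compare equal.
print(equal_after_normalization(precomposed, decomposed))            # True
print(equal_after_normalization(precomposed, decomposed, "NFD"))     # True

# NFC composes to one code point; NFD decomposes to two.
print(len(unicodedata.normalize("NFC", decomposed)))         # 1
print(len(unicodedata.normalize("NFD", precomposed)))        # 2
```

Once data is stored in a single normalized form, a plain code-point comparison is enough to decide whether two strings are equal.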
The enable unicode normalization configuration parameter controls whether SAP ASE normalizes incoming Unicode data.
Significant performance increases are possible when the default Unicode sort order is set to “binary” and the enable unicode normalization configuration parameter is set to 1. This combination allows SAP ASE to make several assumptions about the nature of the Unicode data, and the server includes code paths that take advantage of these assumptions.