Example 7

Most characters are represented by code points in the range [0x20..0xFFFF], and can be represented with a single 16-bit value. A surrogate pair is a pair of 16-bit values that represents a character in the range [0x010000..0x10FFFF]. The first half of the pair (the high surrogate) is in the range [0xD800..0xDBFF], and the second half of the pair (the low surrogate) is in the range [0xDC00..0xDFFF]. Such a pair (H, L) represents the character computed as follows (hex arithmetic):

0x10000 + (H - 0xD800) * 0x400 + (L - 0xDC00)

For example, the character U+1D6D1 (written as the NCR "&#x1D6D1;") is a lower-case bold mathematical symbol, represented by the surrogate pair D835, DED1:

select convert(unitext, u&'\+1d6d1')
--------------------- 
0xd835ded1 
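
The surrogate arithmetic can be checked outside the server. The following is a minimal Python sketch, assuming the standard UTF-16 rule in which the decoded value is offset by 0x10000. The function names decode_surrogate_pair and encode_surrogate_pair are illustrative only and are not part of any server API.

```python
def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair (H, L) into a single code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"high surrogate out of range: {high:#x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"low surrogate out of range: {low:#x}")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)


def encode_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point in [0x10000..0x10FFFF] into a surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a supplementary code point: {code_point:#x}")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits
    return high, low


if __name__ == "__main__":
    print(hex(decode_surrogate_pair(0xD835, 0xDED1)))
    print([hex(unit) for unit in encode_surrogate_pair(0x1D6D1)])
```

For the pair D835, DED1 the terms are (0xD835 - 0xD800) * 0x400 = 0xD400 and 0xDED1 - 0xDC00 = 0x2D1, which sum to 0xD6D1. Adding 0x10000 gives 0x1D6D1.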

When you specify ncr=non_ascii or ncr=non_server to generate a SQLX XML document containing non-ASCII data with surrogate pair characters, the surrogate pairs appear as single NCR characters, not as pairs:

select convert(unitext, u&'\+1d6d1')
for xml option 'ncr=non_ascii'
-------------------------------
<resultset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <row>
        <C1>&#x1d6d1;</C1>
    </row>
</resultset>
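
The difference between one NCR per character and one NCR per 16-bit value can be illustrated with a short Python sketch. This is a conceptual model only, not the server's implementation. The function names ncr_per_code_point and ncr_per_utf16_unit are hypothetical, and the sketch assumes that "non-ASCII" means any code point above 0x7F.

```python
def ncr_per_code_point(text: str) -> str:
    """Replace each non-ASCII character with a single hexadecimal NCR.

    Python strings iterate by code point, so a supplementary character
    yields one NCR, matching the behavior described above.
    """
    return "".join(
        ch if ord(ch) <= 0x7F else f"&#x{ord(ch):x};" for ch in text
    )


def ncr_per_utf16_unit(text: str) -> str:
    """Contrast case: emit one NCR per 16-bit unit (a pair of NCRs)."""
    data = text.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(
        chr(u) if u <= 0x7F else f"&#x{u:x};" for u in units
    )


if __name__ == "__main__":
    sample = chr(0x1D6D1)
    print(ncr_per_code_point(sample))   # single NCR for the character
    print(ncr_per_utf16_unit(sample))   # two NCRs, one per surrogate
```

The second form is not valid XML, because surrogate code points are not legal XML characters. That is why the pair has to be collapsed into a single NCR.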