Table 2-14 describes how Japanese characters are represented in supported character sets, and how their lengths are affected.
Character set | SBCS or DBCS | Datatype | Length considerations | Example |
---|---|---|---|---|
EUC-JIS | DBCS (hankaku katakana) | character | Each 1-byte hankaku katakana character is preceded by a 1-byte SS2 indicator. As a result, each EUC-JIS hankaku katakana character has a length of 2: the SS2 indicator and the hankaku katakana itself. | A string of 4 hankaku katakana occupies 8 bytes and has a length of 8. |
EUC-JIS | DBCS (kanji) | character | Each kanji character is 2 bytes long and has a length of 2. Kanji and single-byte alphabetic characters can be mixed. When converting mixed strings from IBM Kanji to workstation kanji, double the length to be safe. | A string of 4 kanji occupies 8 bytes and has a length of 8. |
Shift-JIS | SBCS (hankaku katakana) | character | Each hankaku katakana character is 1 byte long and has a length of 1. Shift-JIS hankaku katakana does not use SS2 indicators. | A string of 4 hankaku katakana occupies 4 bytes and has a length of 4. |
Shift-JIS | DBCS (kanji) | character | Each kanji character is 2 bytes long and has a length of 2. Kanji and single-byte alphabetic characters can be mixed. When converting mixed strings from IBM Kanji to workstation kanji, double the length to be safe. | A string of 4 kanji occupies 8 bytes and has a length of 8. |
IBM Kanji | DBCS (kanji) | character | Each kanji character is 2 bytes long and has a length of 2. Each kanji string is preceded by a Shift Out (SO) indicator and followed by a Shift In (SI) indicator, which adds 2 to the length of each kanji string. Kanji and single-byte alphabetic characters can be mixed. When converting mixed strings from IBM Kanji to workstation kanji, double the length to be safe. | A string of 4 kanji occupies 10 bytes and has a length of 10 (8 bytes for the data and 2 bytes for the SO/SI codes). |
IBM Kanji | DBCS (kanji) | graphic | Each kanji character is a double-byte character and has a length of 1. There are no SO/SI indicators with graphic data. | A string of 4 kanji occupies 8 bytes and has a length of 4. |
IBM Kanji | SBCS (hankaku katakana) | character | Each hankaku katakana character is 1 byte long and has a length of 1. IBM Kanji hankaku katakana does not use SS2 indicators. | A string of 4 hankaku katakana occupies 4 bytes and has a length of 4. |
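
The byte counts in the table can be spot-checked with a short Python sketch. It assumes that Python's standard `euc_jp` and `shift_jis` codecs behave like the EUC-JIS and Shift-JIS character sets described here, which may not hold for every vendor extension. No IBM Kanji codec is used; the helper `ibm_kanji_character_length` is a hypothetical name that only restates the table's arithmetic for a single contiguous kanji string.

```python
# Sample data (chosen for illustration, not taken from the table):
# four half-width (hankaku) katakana and four kanji, written as escapes.
HANKAKU = "\uff71\uff72\uff73\uff74"
KANJI = "\u65e5\u672c\u8a9e\u5b66"


def byte_length(text: str, encoding: str) -> int:
    """Return the number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


def ibm_kanji_character_length(num_kanji: int) -> int:
    """Hypothetical helper: length of one contiguous kanji string stored as
    IBM Kanji character data, per the table: 2 bytes per kanji plus 1 byte
    each for the Shift Out and Shift In indicators."""
    return 2 * num_kanji + 2


if __name__ == "__main__":
    for label, text in (("hankaku katakana", HANKAKU), ("kanji", KANJI)):
        for encoding in ("euc_jp", "shift_jis"):
            print(f"{encoding:10} {label:17} {byte_length(text, encoding)} bytes")
    print(f"IBM Kanji  kanji (character) {ibm_kanji_character_length(4)} bytes")
```

In EUC-JP output, each hankaku katakana byte is preceded by the SS2 byte (0x8E), which accounts for the doubled length. A mixed IBM Kanji string with several separate kanji runs would add one SO/SI pair per run, which the helper above does not model.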
Copyright © 2005. Sybase Inc. All rights reserved.