Select Correct Encoding to Display Unicode Characters

When you load character data with the “Text Data Provider” or “Text Data Sink” component, characters from Unicode files that begin with a byte order mark (BOM) display incorrectly unless you select a character set encoding with the correct “endianness”.

To display Unicode characters correctly, select the character set encoding with the correct endianness in the Character Encoding field of the component configuration window. For example, select:

  • UTF-16LE – to process text files encoded in UTF-16LE, which begin with the BOM bytes FF FE. LE means “little-endian”: the least significant byte of each 16-bit code unit is stored first.

  • UTF-16BE – to process text files encoded in UTF-16BE, which begin with the BOM bytes FE FF. BE means “big-endian”: the most significant byte of each 16-bit code unit is stored first.
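
If you are unsure which option applies to a file, the first two bytes usually settle it. The following is a minimal Python sketch, not part of the product: the function name utf16_encoding_from_bom and the sample data are hypothetical, and it assumes the file is UTF-16 and starts with a BOM.

```python
import codecs
from typing import Optional


def utf16_encoding_from_bom(data: bytes) -> Optional[str]:
    """Return the UTF-16 variant indicated by a leading BOM, or None.

    Hypothetical helper for illustration only. It assumes UTF-16 input;
    a UTF-32LE file also starts with FF FE and would be misreported here.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "UTF-16LE"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "UTF-16BE"
    return None  # no UTF-16 BOM found


# Sample data built in memory: the BOM always comes first,
# regardless of endianness.
sample_le = codecs.BOM_UTF16_LE + "h\u00e9llo".encode("utf-16-le")
sample_be = codecs.BOM_UTF16_BE + "h\u00e9llo".encode("utf-16-be")

print(utf16_encoding_from_bom(sample_le))
print(utf16_encoding_from_bom(sample_be))
```

For a real file, you could pass the first few bytes read in binary mode, for example open(path, "rb").read(4), and then pick the matching entry in the Character Encoding field.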