Encoding responses and double-byte characters

When you compile a Java servlet, javac interprets the characters in the source file according to the default encoding for your machine's locale, unless you specify an encoding in the javac compile command. When a client sends a request from a browser, the parameters are always ISO 8859-1 encoded.

To provide a client’s browser with the encoding information it needs to translate the content of a response correctly, declare the encoding in the response header. If you specify the content type without the encoding information, for instance:

response.setContentType("text/html");

the client’s browser assumes that the content is ISO 8859-1 encoded. If the content has been encoded using some other character set, the client’s browser does not translate the data correctly. This example specifies big5, the name of a double-byte character set for traditional Chinese characters:

response.setContentType("text/html;charset=big5");
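The effect of a missing or mismatched charset can be reproduced outside a servlet container. The following is a minimal sketch using only the standard library, not the servlet API: the class name ResponseCharsetDemo and the helpers encodeBody and decodeBody are hypothetical, the OutputStreamWriter stands in for the response writer, and decodeBody stands in for the browser. It assumes the JVM supports the Big5 charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

public class ResponseCharsetDemo {
    // Hypothetical stand-in for a response writer: encodes text with the given charset.
    static byte[] encodeBody(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Hypothetical stand-in for the browser: decodes the bytes with the charset it assumes.
    static String decodeBody(byte[] body, String charsetName) {
        return new String(body, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of the compiler's encoding.
        String text = "\u4e2d\u6587";
        byte[] body = encodeBody(text, "Big5");

        System.out.println("bytes sent: " + body.length);
        System.out.println("readable when decoded as Big5: "
                + text.equals(decodeBody(body, "Big5")));
        System.out.println("readable when decoded as ISO 8859-1: "
                + text.equals(decodeBody(body, "ISO-8859-1")));
    }
}
```

The bytes are identical in both cases; only the charset used to decode them differs, which is why the declaration in the response header matters.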

To encode the response content, compile the servlet with this encoding option:

javac -encoding iso-8859-1 <java source file>

or convert static strings within the servlet code, for instance:

// Encode with the platform default charset, then map each byte to one ISO 8859-1 character.
String origMsg = "<double-byte character string>";
String newMsg = new String(origMsg.getBytes(), "iso-8859-1");
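The conversion above relies on the platform default charset, so its result varies from machine to machine. As a hedged illustration of what it does, the sketch below names the source charset explicitly. The class StaticStringDemo and the helper toByteString are hypothetical, and Big5 is assumed both as the intended encoding and as a charset the JVM supports.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StaticStringDemo {
    // Returns a string in which each char carries one byte of the text's
    // encoding in sourceCharset (ISO 8859-1 maps every byte value to one char).
    static String toByteString(String text, String sourceCharset) {
        byte[] encoded = text.getBytes(Charset.forName(sourceCharset));
        return new String(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of the compiler's encoding.
        String origMsg = "\u4e2d\u6587";
        String newMsg = toByteString(origMsg, "Big5");

        // Writing newMsg through an ISO 8859-1 writer emits the Big5 bytes unchanged.
        byte[] sent = newMsg.getBytes(StandardCharsets.ISO_8859_1);
        byte[] expected = origMsg.getBytes(Charset.forName("Big5"));
        System.out.println("same bytes as Big5 encoding: " + Arrays.equals(sent, expected));
    }
}
```

The converted string is not readable text inside the JVM. It is only a carrier for the encoded bytes, so it is useful only when the output stream writes ISO 8859-1 and the response header declares the real charset.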