 
Table E-1 lists the suggested charset(s) for a number of languages. Servlets that generate multilingual output use a charset to determine which character encoding the servlet's PrintWriter uses. By default, the PrintWriter uses the ISO-8859-1 (Latin-1) charset, which is appropriate for most Western European languages. To specify an alternate charset, pass the charset value to the setContentType() method before the servlet retrieves its PrintWriter. For example:
res.setContentType("text/html; charset=Shift_JIS");  // A Japanese charset
PrintWriter out = res.getWriter();  // Writes Shift_JIS Japanese
Note that not all web browsers support all charsets or have the fonts available to display all characters, although at minimum all clients support ISO-8859-1. The UTF-8 charset can represent every Unicode character, so it may be considered a viable alternative for any language.
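
The trade-off between a language-specific charset and UTF-8 can be checked programmatically. The following is a minimal sketch, not part of the servlet API: the class `CharsetCoverage` and its methods are hypothetical names introduced here, and falling back to UTF-8 is an assumption based on the note above. It uses only `java.nio.charset.Charset`, so it requires a JDK newer than the 1.1 releases discussed in this appendix.

```java
import java.nio.charset.Charset;

// Hypothetical helper: decides whether a charset can represent some text.
final class CharsetCoverage {

    // Returns true if every character of text can be encoded in the named charset.
    static boolean canRepresent(String charsetName, String text) {
        Charset cs = Charset.forName(charsetName);
        return cs.canEncode() && cs.newEncoder().canEncode(text);
    }

    // Returns the preferred charset if it covers the text; otherwise "UTF-8"
    // (an assumed fallback, since UTF-8 can represent all Unicode characters).
    static String chooseCharset(String preferred, String text) {
        return canRepresent(preferred, text) ? preferred : "UTF-8";
    }
}
```

A servlet could pass the result to setContentType(), for example `res.setContentType("text/html; charset=" + CharsetCoverage.chooseCharset("ISO-8859-1", body))`, before calling getWriter().
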
| Language | Language Code | Suggested Charsets | 
|---|---|---|
| Albanian | sq | ISO-8859-2 | 
| Arabic | ar | ISO-8859-6 | 
| Bulgarian | bg | ISO-8859-5 | 
| Byelorussian | be | ISO-8859-5 | 
| Catalan (Spanish) | ca | ISO-8859-1 | 
| Chinese (Simplified/Mainland) | zh | GB2312 | 
| Chinese (Traditional/Taiwan) | zh (country TW) | Big5 | 
| Croatian | hr | ISO-8859-2 | 
| Czech | cs | ISO-8859-2 | 
| Danish | da | ISO-8859-1 | 
| Dutch | nl | ISO-8859-1 | 
| English | en | ISO-8859-1 | 
| Estonian | et | ISO-8859-1 | 
| Finnish | fi | ISO-8859-1 | 
| French | fr | ISO-8859-1 | 
| German | de | ISO-8859-1 | 
| Greek | el | ISO-8859-7 | 
| Hebrew | he (formerly iw) | ISO-8859-8 | 
| Hungarian | hu | ISO-8859-2 | 
| Icelandic | is | ISO-8859-1 | 
| Italian | it | ISO-8859-1 | 
| Japanese | ja | Shift_JIS, ISO-2022-JP, EUC-JP[1] | 
| Korean | ko | EUC-KR[2] | 
| Latvian, Lettish | lv | ISO-8859-2 | 
| Lithuanian | lt | ISO-8859-2 | 
| Macedonian | mk | ISO-8859-5 | 
| Norwegian | no | ISO-8859-1 | 
| Polish | pl | ISO-8859-2 | 
| Portuguese | pt | ISO-8859-1 | 
| Romanian | ro | ISO-8859-2 | 
| Russian | ru | ISO-8859-5, KOI8-R | 
| Serbian | sr | ISO-8859-5, KOI8-R | 
| Serbo-Croatian | sh | ISO-8859-5, ISO-8859-2, KOI8-R | 
| Slovak | sk | ISO-8859-2 | 
| Slovenian | sl | ISO-8859-2 | 
| Spanish | es | ISO-8859-1 | 
| Swedish | sv | ISO-8859-1 | 
| Turkish | tr | ISO-8859-9 | 
| Ukrainian | uk | | 
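
A servlet that serves several languages might keep part of this table in code. The sketch below is illustrative only: the class `SuggestedCharsets`, its method names, and the UTF-8 fallback for unlisted languages are assumptions, and only a handful of rows are copied in, using the first suggested charset where the table lists several.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup built from a few rows of the table.
final class SuggestedCharsets {

    private static final Map<String, String> BY_LANGUAGE = new HashMap<>();
    static {
        BY_LANGUAGE.put("en", "ISO-8859-1");
        BY_LANGUAGE.put("fr", "ISO-8859-1");
        BY_LANGUAGE.put("cs", "ISO-8859-2");
        BY_LANGUAGE.put("ru", "ISO-8859-5");
        BY_LANGUAGE.put("el", "ISO-8859-7");
        BY_LANGUAGE.put("tr", "ISO-8859-9");
        BY_LANGUAGE.put("ja", "Shift_JIS");
        BY_LANGUAGE.put("ko", "EUC-KR");
        BY_LANGUAGE.put("zh", "GB2312");
    }

    // Returns the suggested charset for a locale, or "UTF-8" (assumed
    // fallback) when the language is not in the map.
    static String charsetFor(Locale locale) {
        // Traditional Chinese is distinguished by country, as in the table.
        if ("zh".equals(locale.getLanguage()) && "TW".equals(locale.getCountry())) {
            return "Big5";
        }
        String charset = BY_LANGUAGE.get(locale.getLanguage());
        return charset != null ? charset : "UTF-8";
    }

    // Builds the value to pass to setContentType() for an HTML response.
    static String contentType(Locale locale) {
        return "text/html; charset=" + charsetFor(locale);
    }
}
```

The returned string has the same form as the setContentType() argument shown earlier, so it must still be set before the servlet retrieves its PrintWriter.
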
[1] First supported in JDK 1.1.6. Earlier versions of the JDK know the EUC-JP character set by the name EUCJIS, so for portability you can set the character set to EUC-JP and manually construct an EUCJIS PrintWriter.
[2] First supported in JDK 1.1.6. Earlier versions of the JDK know the EUC-KR character set by the name KSC_5601, so for portability you can set the character set to EUC-KR and manually construct a KSC_5601 PrintWriter.
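
The manual construction mentioned in these footnotes can be sketched as follows. This is one possible approach under stated assumptions, not a prescribed implementation: the class `LegacyCharsetWriter` and the method `writerFor` are hypothetical names, and the helper simply tries the standard encoding name first and falls back to the older JDK name if the first is rejected.

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;

// Hypothetical helper: builds a PrintWriter for an encoding that may be
// known to the running JDK only under an older name.
final class LegacyCharsetWriter {

    static PrintWriter writerFor(OutputStream out, String standardName, String legacyName) {
        try {
            // Try the standard name first (for example "EUC-JP").
            return new PrintWriter(new OutputStreamWriter(out, standardName), true);
        } catch (UnsupportedEncodingException standardUnknown) {
            try {
                // Fall back to the name older JDKs use (for example "EUCJIS").
                return new PrintWriter(new OutputStreamWriter(out, legacyName), true);
            } catch (UnsupportedEncodingException legacyUnknown) {
                throw new IllegalArgumentException(
                    "Neither " + standardName + " nor " + legacyName + " is supported");
            }
        }
    }
}
```

In a servlet, this would be used after `res.setContentType("text/html; charset=EUC-JP")` with `LegacyCharsetWriter.writerFor(res.getOutputStream(), "EUC-JP", "EUCJIS")` in place of `res.getWriter()`, so the browser sees the standard charset name while the JDK uses whichever name it recognizes.
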

Copyright © 2001 O'Reilly & Associates. All rights reserved.