Advice for Encoding Multilingual Web Documents
Despite its name, the World Wide Web was not conceived as a system that would encompass the whole world. Rather, HTML was designed primarily with English in mind.
Initially the Web used only the very limited ISO 8859-1 character set, which supports only Western European languages.
In principle, you can encode your documents in any character set appropriate for the language. However, for a document in any other character set to display correctly, the user's software must be able to determine which set you used.
In general, it is best to stick with standardized and well-known character sets, such as:
- The ISO 8859 series
- Important national standards (ISCII, ASMO, VISCII, etc.)
- ISO 10646 (Unicode)
The correct way to encode a multilingual document is to:
- Choose and identify a character set that is appropriate for the document's language and as standard as possible.
- Type the characters of that set directly wherever possible, rather than writing them as entity or numeric character references.
- Most important, clearly declare the character set chosen for your document by adding a line like the following to your document header, immediately after the <HEAD> tag, substituting the name of your chosen set:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
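As a rough illustration of the steps above, the skeleton below sketches how a page in a language other than English might declare its character set. The choice of Greek, the ISO-8859-7 set, and the sample text are all assumptions made for this example. The declaration is only accurate if the file itself is saved in that same encoding.

```html
<!-- Sketch only: assumes a Greek page saved as ISO-8859-7 (a hypothetical choice). -->
<HTML>
<HEAD>
<!-- The declaration comes first, so the title and body after it are read in the declared set. -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-7">
<TITLE>Καλημέρα</TITLE>
</HEAD>
<BODY>
<!-- Characters are typed directly rather than written as entity or numeric references. -->
<P>Καλημέρα κόσμε</P>
</BODY>
</HTML>
```

Placing the META line ahead of the TITLE matters here because the title already contains non-ASCII characters; software that has not yet seen the declaration may guess a different set for anything that precedes it.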