This page is still under development. Please take it with a grain of salt!
Each of the linked pages is displayed in a different charset, showing how the same byte values are rendered in that charset.
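As a rough illustration of what those pages show, the sketch below decodes one and the same byte under several of the charsets listed on this page. The helper name `decode_in` and the choice of byte 0xE6 are assumptions made for this example only; the codec names are the ones Python's standard library accepts.

```python
def decode_in(raw: bytes, charsets):
    """Return {charset: text} for the same raw bytes decoded under each charset."""
    return {charset: raw.decode(charset) for charset in charsets}


# Hypothetical sample: a single byte from the upper half of the 8-bit range.
SAMPLE = bytes([0xE6])
CHARSETS = ("iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7", "windows-1251")

if __name__ == "__main__":
    for charset, text in decode_in(SAMPLE, CHARSETS).items():
        # Print the charset, the byte value and the character it maps to.
        print(f"{charset:14} 0x{SAMPLE[0]:02X} -> {text} (U+{ord(text):04X})")
```

The byte never changes; only the table used to interpret it does, which is why a page saved in one charset looks garbled when viewed in another.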
iso-8859-1 - Latin-1 (Western European)
Western European and Scandinavian: Afrikaans, Basque, Catalan, Danish, Dutch, English, Faeroese, Finnish, French, Galician, German, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish and Swedish.
Note: The Dutch IJ and ij (&#306; & &#307;), the German versions of double quotes „ & ” (&#8222; & &#8221;), and the French œ & Œ (&#339; & &#338;) are in the Supplementary Character Set.
Additional Characters for Vietnamese
iso-8859-2 - Latin-2 (Eastern European)
Latin-written Slavic and Central European: Czech, German, Hungarian, Polish, Romanian, Croatian, Slovak, Slovene.
Note: Š & š (&#352; & &#353;), Č & č (&#268; & &#269;), and Ž & ž (&#381; & &#382;) are in the Supplementary Character Set.
iso-8859-3 - Latin-3 (Southern European)
Esperanto, Galician, Maltese, and Turkish.
iso-8859-5 - Cyrillic
Bulgarian, Byelorussian, Macedonian, Russian, Serbian and Ukrainian.
Cyrillic Characters in Supplementary Set
iso-8859-6 - Non-accented Arabic
Arabic Characters in Supplementary Set
iso-8859-7 - Greek
Greek Characters in Supplementary Set
iso-8859-8 - Non-accented Hebrew
iso-8859-9 - Latin-5 (Turkish)
Same as iso-8859-1, but with the Icelandic letters replaced by Turkish ones.
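The relationship between iso-8859-1 and iso-8859-9 can be checked directly by decoding every byte value under both charsets and listing the positions that disagree. This is a minimal sketch using Python's built-in codecs; the function name `differing_bytes` is an assumption made for this example.

```python
def differing_bytes(charset_a: str, charset_b: str):
    """Map each byte value that decodes differently to its (a, b) character pair."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(charset_a)
        char_b = raw.decode(charset_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


if __name__ == "__main__":
    for value, (latin1, latin5) in differing_bytes("iso-8859-1", "iso-8859-9").items():
        # Left column: Latin-1 character; right column: Latin-5 character.
        print(f"0x{value:02X}: {latin1} -> {latin5}")
```

With Python's codec tables only six byte positions differ, so text that avoids those letters reads identically under both charsets.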
iso-8859-10 - Latin-6 (Nordic)
Lappish/Nordic/Eskimo languages: adds the last Inuit (Greenlandic) and Sami (Lappish) letters that were missing from Latin-4, to cover the entire Nordic area.
iso-8859-11 - Thai
Boishakhi font - Bengali
Supplementary Character Set 256 to 8993
Favourites from Supplementary Character Set
non-SGML Characters (129 to 159: not to be used!)
See Examples of Mathematical Formulae in HTML.
windows-1250 - Central European
windows-1251 - Russian
windows-1252 - Western Europe
windows-1253 - Greek
windows-1254 - Turkish
windows-1255 - Hebrew
windows-1256 - Arabic
windows-1257 - Baltic
windows-874 - Thai
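The warning above about characters 129 to 159 matters most when iso-8859-1 and windows-1252 are confused: in iso-8859-1 those byte values are control codes, while windows-1252 assigns printable characters to most of them. The sketch below is illustrative only; the helper name `classify_byte` is an assumption, and the undefined windows-1252 positions are shown as U+FFFD because Python's codec leaves them unmapped.

```python
import unicodedata


def classify_byte(value: int):
    """Return (Unicode category under iso-8859-1, character under windows-1252)."""
    raw = bytes([value])
    as_latin1 = raw.decode("iso-8859-1")
    # errors="replace" yields U+FFFD for positions the codec does not define.
    as_cp1252 = raw.decode("windows-1252", errors="replace")
    return unicodedata.category(as_latin1), as_cp1252


if __name__ == "__main__":
    for value in range(129, 160):
        category, char = classify_byte(value)
        print(f"{value:3d} (0x{value:02X}): iso-8859-1 category {category}, windows-1252 {char!r}")
```

Category `Cc` means "control character", which is why numeric references in this range are unreliable in HTML even though some browsers display them using the windows-1252 glyphs.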
Charset name | Code page | Description |
unicode | 1200 | Universal Alphabet |
unicodeFEFF | 1201 | Universal Alphabet (Big-Endian) |
utf-7 | 65000 | Universal Alphabet (UTF-7) |
utf-8 | 65001 | Universal Alphabet (UTF-8) |
iso-2022-jp | 50220 | Japanese (JIS) |
iso-2022-jp | 50222 | Japanese (JIS-Allow 1 byte Kana - SO/SI) |
iso-2022-kr | 50225 | Korean (ISO) |
DIN_66003 | 20106 | IA5 (German) |
NS_4551-1 | 20108 | IA5 (Norwegian) |
SEN_850200_B | 20107 | IA5 (Swedish) |
_autodetect | 50932 | Japanese (Auto Select) |
_autodetect_kr | 50949 | Korean (Auto Select) |
big5 | 950 | Chinese Traditional (Big5) |
csISO2022JP | 50221 | Japanese (JIS-Allow 1 byte Kana) |
euc-kr | 51949 | Korean (EUC) |
gb2312 | 936 | Chinese Simplified (GB2312) |
hz-gb-2312 | 52936 | Chinese Simplified (HZ) |
ibm852 | 852 | Central European (DOS) |
ibm866 | 866 | Cyrillic Alphabet (DOS) |
irv | 20105 | IA5 (IRV) |
koi8-r | 20866 | Cyrillic Alphabet (KOI8-R) |
ks_c_5601 | 949 | Korean |
shift-jis | 932 | Japanese (Shift-JIS) |
windows-874 | 874 | Thai (Windows) |
x-euc | 51932 | Japanese (EUC) |
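To make the four "Universal Alphabet" rows in the table above more concrete, the following sketch encodes one character under each of them. It assumes that code page 1200 corresponds to little-endian UTF-16 and 1201 to big-endian UTF-16, and it uses Python's codec names rather than the names in the table; `encode_unicode_forms` is a hypothetical helper for this example.

```python
# Assumed mapping from the table's code page numbers to Python codec names.
UNICODE_CODE_PAGES = {
    1200: "utf-16-le",   # "unicode"
    1201: "utf-16-be",   # "unicodeFEFF" (Big-Endian)
    65000: "utf-7",
    65001: "utf-8",
}


def encode_unicode_forms(text: str):
    """Return {code page: encoded bytes} for each Unicode encoding in the table."""
    return {page: text.encode(codec) for page, codec in UNICODE_CODE_PAGES.items()}


if __name__ == "__main__":
    for page, data in encode_unicode_forms("\u00e9").items():
        # Show the raw bytes as hex so the byte-order difference is visible.
        print(f"{page:5d} {UNICODE_CODE_PAGES[page]:10} {data.hex(' ')}")
```

All four encodings represent the same character; they differ only in how many bytes they use and in what order those bytes appear.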