
Supported Encodings

The java.io.InputStreamReader, java.io.OutputStreamWriter, and java.lang.String classes, together with the classes in the java.nio.charset package, can convert between Unicode and a number of other character encodings. The supported encodings vary between implementations of Java SE 8. The class description for java.nio.charset.Charset lists the encodings that every implementation of Java SE 8 is required to support.
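As an illustration of the conversions described above, the following minimal sketch encodes and decodes a string with OutputStreamWriter, InputStreamReader, and String, using two of the encodings that every implementation must support. The class name EncodingRoundTrip and its helper methods are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class EncodingRoundTrip {

    // Converts Unicode text to bytes in the given encoding via OutputStreamWriter.
    static byte[] encode(String text, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, cs)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Converts bytes in the given encoding back to Unicode text via InputStreamReader.
    static String decode(byte[] bytes, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            char[] buf = new char[256];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe"; // five characters, two of them outside US-ASCII

        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // java.lang.String route

        System.out.println("ISO-8859-1 bytes: " + latin1.length); // 5
        System.out.println("UTF-8 bytes:      " + utf8.length);   // 7
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

The same text occupies a different number of bytes in each encoding, which is why the encoding used to read data has to match the one used to write it.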

JDK 8 for all platforms (Solaris, Linux, and Microsoft Windows) and JRE 8 for Solaris and Linux support all encodings shown on this page. JRE 8 for Microsoft Windows may be installed as a complete international version or as a European languages version. By default, the JRE 8 installer installs a European languages version if it recognizes that the host operating system only supports European languages. If the installer recognizes that any other language is needed, or if the user requests support for non-European languages in a customized installation, a complete international version is installed. The European languages version only supports the encodings shown in the following Basic Encoding Set table. The international version (which includes the lib/charsets.jar file) supports all encodings shown on this page.
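Because the installed encoding set can differ between installations, code that depends on an encoding outside the basic set may want to check for it at run time. The sketch below shows one possible approach using Charset.isSupported; the class CharsetAvailability and the method forNameOrDefault are hypothetical names chosen for this example, and the fallback policy is an assumption rather than a recommendation from this page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CharsetAvailability {

    // Returns the named charset if this runtime supports it, otherwise the fallback.
    static Charset forNameOrDefault(String name, Charset fallback) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalArgumentException e) {
            // Thrown for a null or syntactically illegal charset name.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // windows-1252 is in the basic set; Big5 and Shift_JIS are in the extended set.
        for (String name : new String[] {"windows-1252", "Big5", "Shift_JIS"}) {
            System.out.println(name + " supported: " + Charset.isSupported(name));
        }
        Charset cs = forNameOrDefault("Big5", StandardCharsets.UTF_8);
        System.out.println("Using: " + cs.name());
    }
}
```

What the program prints for Big5 and Shift_JIS depends on whether the extended encoding set is present in the runtime.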

The following tables show the encoding sets supported by Java SE 8. The canonical names used by the new java.nio APIs are in many cases not the same as those used in the java.io and java.lang APIs.
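The difference between the two naming schemes can be observed directly. In the sketch below, Charset.name() reports the java.nio canonical name, while InputStreamReader.getEncoding() reports the name used by the java.io and java.lang APIs. The class CanonicalNames and its two methods are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class CanonicalNames {

    // Canonical name as reported by the java.nio API.
    static String nioName(String nameOrAlias) {
        return Charset.forName(nameOrAlias).name();
    }

    // Canonical name as reported by the java.io API.
    static String ioName(String nameOrAlias) {
        try (InputStreamReader r =
                 new InputStreamReader(new ByteArrayInputStream(new byte[0]), nameOrAlias)) {
            return r.getEncoding();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for unknown names.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String alias : new String[] {"latin1", "UTF8", "ascii7"}) {
            System.out.println(alias + " -> java.nio: " + nioName(alias)
                    + ", java.io: " + ioName(alias));
        }
    }
}
```

For the alias latin1, for example, the java.nio name is ISO-8859-1 and the java.io name is ISO8859_1, matching the corresponding row of the Basic Encoding Set table.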

Basic Encoding Set (contained in lib/rt.jar)

Canonical Name for java.nio API | Canonical Name for java.io API and java.lang API | Alias or Aliases | Description
CESU-8 | CESU8 | CESU8 csCESU-8 | Unicode CESU-8
GB18030 | GB18030 | gb18030-2022 or gb18030-2000 if the system property and value jdk.charset.GB18030=2000 are specified | Simplified Chinese, PRC standard
IBM00858 | Cp858 | cp858 858 PC-Multilingual-850+euro cp00858 ccsid00858 | Variant of Cp850 with Euro character
IBM437 | Cp437 | ibm437 437 ibm-437 cspc8codepage437 cp437 windows-437 | MS-DOS United States, Australia, New Zealand, South Africa
IBM775 | Cp775 | ibm-775 ibm775 775 cp775 | PC Baltic
IBM850 | Cp850 | cp850 cspc850multilingual ibm850 850 ibm-850 | MS-DOS Latin-1
IBM852 | Cp852 | csPCp852 ibm-852 ibm852 852 cp852 | MS-DOS Latin-2
IBM855 | Cp855 | ibm855 855 ibm-855 cp855 cspcp855 | IBM Cyrillic
IBM857 | Cp857 | ibm857 857 cp857 csIBM857 ibm-857 | IBM Turkish
IBM862 | Cp862 | csIBM862 cp862 ibm862 862 cspc862latinhebrew ibm-862 | PC Hebrew
IBM866 | Cp866 | ibm866 866 ibm-866 csIBM866 cp866 | MS-DOS Russian
ISO-8859-1 | ISO8859_1 | 819 ISO8859-1 l1 ISO_8859-1:1987 ISO_8859-1 8859_1 iso-ir-100 latin1 cp819 ISO8859_1 IBM819 ISO_8859_1 IBM-819 csISOLatin1 | ISO-8859-1, Latin Alphabet No. 1
ISO-8859-2 | ISO8859_2 | ISO8859-2 ibm912 l2 ISO_8859-2 8859_2 cp912 ISO_8859-2:1987 iso8859_2 iso-ir-101 latin2 912 csISOLatin2 ibm-912 | Latin Alphabet No. 2
ISO-8859-4 | ISO8859_4 | 8859_4 latin4 l4 cp914 ISO_8859-4:1988 ibm914 ISO_8859-4 iso-ir-110 iso8859_4 csISOLatin4 iso8859-4 914 ibm-914 | Latin Alphabet No. 4
ISO-8859-5 | ISO8859_5 | ISO_8859-5:1988 csISOLatinCyrillic iso-ir-144 iso8859_5 cp915 8859_5 ibm-915 ISO_8859-5 ibm915 915 cyrillic ISO8859-5 | Latin/Cyrillic Alphabet
ISO-8859-7 | ISO8859_7 | greek 8859_7 greek8 ibm813 ISO_8859-7 iso8859_7 ELOT_928 cp813 ISO_8859-7:1987 sun_eu_greek csISOLatinGreek iso-ir-126 813 iso8859-7 ECMA-118 ibm-813 | Latin/Greek Alphabet (ISO-8859-7:2003)
ISO-8859-9 | ISO8859_9 | ibm-920 ISO_8859-9 8859_9 ISO_8859-9:1989 ibm920 latin5 l5 iso8859_9 cp920 920 iso-ir-148 ISO8859-9 csISOLatin5 | Latin Alphabet No. 5
ISO-8859-13 | ISO8859_13 | iso_8859-13 ISO8859-13 iso8859_13 8859_13 | Latin Alphabet No. 7
ISO-8859-15 | ISO8859_15 | ISO8859-15 LATIN0 ISO8859_15_FDIS ISO8859_15 cp923 8859_15 L9 ISO-8859-15 IBM923 csISOlatin9 ISO_8859-15 IBM-923 csISOlatin0 923 LATIN9 | Latin Alphabet No. 9
KOI8-R | KOI8_R | koi8_r koi8 cskoi8r | KOI8-R, Russian
KOI8-U | KOI8_U | koi8_u | KOI8-U, Ukrainian
US-ASCII | ASCII | ANSI_X3.4-1968 cp367 csASCII iso-ir-6 ASCII iso_646.irv:1983 ANSI_X3.4-1986 ascii7 default ISO_646.irv:1991 ISO646-US IBM367 646 us | American Standard Code for Information Interchange
UTF-8 | UTF8 | unicode-1-1-utf-8 UTF8 | Eight-bit Unicode (or UCS) Transformation Format
UTF-16 | UTF-16 | UTF_16 unicode utf16 UnicodeBig | Sixteen-bit Unicode (or UCS) Transformation Format, byte order identified by an optional byte-order mark
UTF-16BE | UnicodeBigUnmarked | X-UTF-16BE UTF_16BE ISO-10646-UCS-2 UnicodeBigUnmarked | Sixteen-bit Unicode (or UCS) Transformation Format, big-endian byte order
UTF-16LE | UnicodeLittleUnmarked | UnicodeLittleUnmarked UTF_16LE X-UTF-16LE | Sixteen-bit Unicode (or UCS) Transformation Format, little-endian byte order
UTF-32 | UTF_32 | UTF_32 UTF32 | 32-bit Unicode (or UCS) Transformation Format, byte order identified by an optional byte-order mark
UTF-32BE | UTF_32BE | X-UTF-32BE UTF_32BE | 32-bit Unicode (or UCS) Transformation Format, big-endian byte order
UTF-32LE | UTF_32LE | X-UTF-32LE UTF_32LE | 32-bit Unicode (or UCS) Transformation Format, little-endian byte order
x-UTF-32BE-BOM | UTF_32BE_BOM | UTF_32BE_BOM UTF-32BE-BOM | 32-bit Unicode (or UCS) Transformation Format, big-endian byte order, with byte-order mark
x-UTF-32LE-BOM | UTF_32LE_BOM | UTF_32LE_BOM UTF-32LE-BOM | 32-bit Unicode (or UCS) Transformation Format, little-endian byte order, with byte-order mark
windows-1250 | Cp1250 | cp1250 cp5346 | Windows Eastern European
windows-1251 | Cp1251 | cp5347 ansi-1251 cp1251 | Windows Cyrillic
windows-1252 | Cp1252 | cp5348 cp1252 | Windows Latin-1
windows-1253 | Cp1253 | cp1253 cp5349 | Windows Greek
windows-1254 | Cp1254 | cp1254 cp5350 | Windows Turkish
windows-1257 | Cp1257 | cp1257 cp5353 | Windows Baltic
Not available | UnicodeBig | Not available | Sixteen-bit Unicode (or UCS) Transformation Format, big-endian byte order, with byte-order mark
x-IBM737 | Cp737 | cp737 ibm737 737 ibm-737 | PC Greek
x-IBM874 | Cp874 | ibm-874 ibm874 874 cp874 | IBM Thai
x-UTF-16LE-BOM | UnicodeLittle | UnicodeLittle | Sixteen-bit Unicode (or UCS) Transformation Format, little-endian byte order, with byte-order mark
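
The byte-order descriptions of the UTF-16 family in the table above can be made concrete by encoding a single character and printing the resulting bytes. The following is a minimal sketch; the class Utf16ByteOrder and its hex helper are hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf16ByteOrder {

    // Formats a byte array as uppercase hexadecimal, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "A"; // U+0041

        // UTF-16: the encoder writes a big-endian byte-order mark (FE FF) first.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16)));   // FEFF0041
        // UTF-16BE and UTF-16LE: fixed byte order, no byte-order mark.
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16BE))); // 0041
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16LE))); // 4100

        // When decoding as UTF-16, a leading byte-order mark selects the byte order.
        byte[] littleEndianWithBom = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(new String(littleEndianWithBom, StandardCharsets.UTF_16)); // A
    }
}
```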

Extended Encoding Set (contained in lib/charsets.jar)

Canonical Name for java.nio API | Canonical Name for java.io API and java.lang API | Alias or Aliases | Description
Big5 | Big5 | csBig5 | Big5, Traditional Chinese
Big5-HKSCS | Big5_HKSCS | big5-hkscs big5hk Big5_HKSCS big5hkscs | Big5 with Hong Kong extensions, Traditional Chinese (incorporating 2001 revision)
EUC-JP | EUC_JP | csEUCPkdFmtjapanese x-euc-jp eucjis Extended_UNIX_Code_Packed_Format_for_Japanese euc_jp eucjp x-eucjp | JISX 0201, 0208 and 0212, EUC encoding Japanese
EUC-KR | EUC_KR | ksc5601-1987 csEUCKR ksc5601_1987 ksc5601 5601 euc_kr ksc_5601 ks_c_5601-1987 euckr | KS C 5601, EUC encoding, Korean
GB2312 | EUC_CN | gb2312 euc-cn x-EUC-CN euccn EUC_CN gb2312-80 gb2312-1980 | GB2312, EUC encoding, Simplified Chinese
GBK | GBK | CP936 windows-936 | GBK, Simplified Chinese
IBM-Thai | Cp838 | ibm-838 ibm838 838 cp838 | IBM Thailand extended SBCS
IBM01140 | Cp1140 | cp1140 1140 cp01140 ebcdic-us-037+euro ccsid01140 | Variant of Cp037 with Euro character
IBM01141 | Cp1141 | 1141 cp1141 cp01141 ccsid01141 ebcdic-de-273+euro | Variant of Cp273 with Euro character
IBM01142 | Cp1142 | 1142 cp1142 cp01142 ccsid01142 ebcdic-no-277+euro ebcdic-dk-277+euro | Variant of Cp277 with Euro character
IBM01143 | Cp1143 | 1143 cp01143 ccsid01143 cp1143 ebcdic-fi-278+euro ebcdic-se-278+euro | Variant of Cp278 with Euro character
IBM01144 | Cp1144 | cp01144 ccsid01144 ebcdic-it-280+euro cp1144 1144 | Variant of Cp280 with Euro character
IBM01145 | Cp1145 | ccsid01145 ebcdic-es-284+euro 1145 cp1145 cp01145 | Variant of Cp284 with Euro character
IBM01146 | Cp1146 | ebcdic-gb-285+euro 1146 cp1146 cp01146 ccsid01146 | Variant of Cp285 with Euro character
IBM01147 | Cp1147 | cp1147 1147 cp01147 ccsid01147 ebcdic-fr-277+euro | Variant of Cp297 with Euro character
IBM01148 | Cp1148 | cp1148 ebcdic-international-500+euro 1148 cp01148 ccsid01148 | Variant of Cp500 with Euro character
IBM01149 | Cp1149 | ebcdic-s-871+euro 1149 cp1149 cp01149 ccsid01149 | Variant of Cp871 with Euro character
IBM037 | Cp037 | cp037 ibm037 ibm-037 csIBM037 ebcdic-cp-us ebcdic-cp-ca ebcdic-cp-nl ebcdic-cp-wt 037 cpibm37 cs-ebcdic-cp-wt ibm-37 cs-ebcdic-cp-us cs-ebcdic-cp-ca cs-ebcdic-cp-nl | USA, Canada (Bilingual, French), Netherlands, Portugal, Brazil, Australia
IBM1026 | Cp1026 | cp1026 ibm-1026 1026 ibm1026 | IBM Latin-5, Turkey
IBM1047 | Cp1047 | ibm-1047 1047 cp1047 | Latin-1 character set for EBCDIC hosts
IBM273 | Cp273 | ibm-273 ibm273 273 cp273 | IBM Austria, Germany
IBM277 | Cp277 | ibm277 277 cp277 ibm-277 | IBM Denmark, Norway
IBM278 | Cp278 | cp278 278 ibm-278 ebcdic-cp-se csIBM278 ibm278 ebcdic-sv | IBM Finland, Sweden
IBM280 | Cp280 | ibm280 280 cp280 ibm-280 | IBM Italy
IBM284 | Cp284 | csIBM284 ibm-284 cpibm284 ibm284 284 cp284 | IBM Catalan/Spain, Spanish Latin America
IBM285 | Cp285 | csIBM285 cp285 ebcdic-gb ibm-285 cpibm285 ibm285 285 ebcdic-cp-gb | IBM United Kingdom, Ireland
IBM290 | Cp290 | ibm290 290 cp290 EBCDIC-JP-kana csIBM290 ibm-290 | IBM Japanese Katakana Host Extended SBCS
IBM297 | Cp297 | 297 csIBM297 cp297 ibm297 ibm-297 cpibm297 ebcdic-cp-fr | IBM France
IBM420 | Cp420 | ibm420 420 cp420 csIBM420 ibm-420 ebcdic-cp-ar1 | IBM Arabic
IBM424 | Cp424 | ebcdic-cp-he csIBM424 ibm-424 ibm424 424 cp424 | IBM Hebrew
IBM500 | Cp500 | ibm-500 ibm500 500 ebcdic-cp-bh ebcdic-cp-ch csIBM500 cp500 | EBCDIC 500V1
IBM860 | Cp860 | ibm860 860 cp860 csIBM860 ibm-860 | MS-DOS Portuguese
IBM861 | Cp861 | cp861 ibm861 861 ibm-861 cp-is csIBM861 | MS-DOS Icelandic
IBM863 | Cp863 | csIBM863 ibm-863 ibm863 863 cp863 | MS-DOS Canadian French
IBM864 | Cp864 | csIBM864 ibm-864 ibm864 864 cp864 | PC Arabic
IBM865 | Cp865 | ibm-865 csIBM865 cp865 ibm865 865 | MS-DOS Nordic
IBM868 | Cp868 | ibm868 868 cp868 csIBM868 ibm-868 cp-ar | MS-DOS Pakistan
IBM869 | Cp869 | cp869 ibm869 869 ibm-869 cp-gr csIBM869 | IBM Modern Greek
IBM870 | Cp870 | 870 cp870 csIBM870 ibm-870 ibm870 ebcdic-cp-roece ebcdic-cp-yu | IBM Multilingual Latin-2
IBM871 | Cp871 | ibm871 871 cp871 ebcdic-cp-is csIBM871 ibm-871 | IBM Iceland
IBM918 | Cp918 | 918 ibm-918 ebcdic-cp-ar2 cp918 | IBM Pakistan (Urdu)
ISO-2022-CN | ISO2022CN | csISO2022CN ISO2022CN | GB2312 and CNS11643 in ISO 2022 CN form, Simplified and Traditional Chinese (conversion to Unicode only)
ISO-2022-JP | ISO2022JP | csjisencoding iso2022jp jis_encoding jis csISO2022JP | JIS X 0201, 0208, in ISO 2022 form, Japanese
ISO-2022-JP-2 | ISO2022JP2 | csISO2022JP2 iso2022jp2 | JIS X 0201, 0208, 0212 in ISO 2022 form, Japanese
ISO-2022-KR | ISO2022KR | csISO2022KR ISO2022KR | ISO 2022 KR, Korean
ISO-8859-3 | ISO8859_3 | ISO8859-3 ibm913 8859_3 l3 cp913 ISO_8859-3 iso8859_3 latin3 csISOLatin3 913 ISO_8859-3:1988 ibm-913 iso-ir-109 | Latin Alphabet No. 3
ISO-8859-6 | ISO8859_6 | ASMO-708 8859_6 iso8859_6 ISO_8859-6 csISOLatinArabic ibm1089 arabic ibm-1089 1089 ECMA-114 iso-ir-127 ISO_8859-6:1987 ISO8859-6 cp1089 | Latin/Arabic Alphabet
ISO-8859-8 | ISO8859_8 | 8859_8 ISO_8859-8 ISO_8859-8:1988 cp916 iso-ir-138 ISO8859-8 hebrew iso8859_8 ibm-916 csISOLatinHebrew 916 ibm916 | Latin/Hebrew Alphabet
JIS_X0201 | JIS_X0201 | JIS0201 csHalfWidthKatakana X0201 JIS_X0201 | JIS X 0201
JIS_X0212-1990 | JIS_X0212-1990 | JIS0212 iso-ir-159 x0212 jis_x0212-1990 csISO159JISX02121990 | JIS X 0212
Shift_JIS | SJIS | shift_jis x-sjis sjis shift-jis ms_kanji csShiftJIS | Shift-JIS, Japanese
TIS-620 | TIS620 | tis620 tis620.2533 | TIS620, Thai
windows-1255 | Cp1255 | cp1255 | Windows Hebrew
windows-1256 | Cp1256 | cp1256 | Windows Arabic
windows-1258 | Cp1258 | cp1258 | Windows Vietnamese
windows-31j | MS932 | MS932 windows-932 csWindows31J | Windows Japanese
x-Big5-Solaris | Big5_Solaris | Big5_Solaris | Big5 with seven additional Hanzi ideograph character mappings for the Solaris zh_TW.BIG5 locale
x-euc-jp-linux | EUC_JP_LINUX | euc_jp_linux euc-jp-linux | JISX 0201, 0208, EUC encoding Japanese
x-EUC-TW | EUC_TW | euctw cns11643 EUC-TW euc_tw | CNS11643 (Plane 1-7,15), EUC encoding, Traditional Chinese
x-eucJP-Open | EUC_JP_Solaris | eucJP-open EUC_JP_Solaris | JISX 0201, 0208, 0212, EUC encoding Japanese
x-IBM1006 | Cp1006 | ibm1006 ibm-1006 1006 cp1006 | IBM AIX Pakistan (Urdu)
x-IBM1025 | Cp1025 | ibm-1025 1025 cp1025 ibm1025 | IBM Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR)
x-IBM1046 | Cp1046 | ibm1046 ibm-1046 1046 cp1046 | IBM Arabic - Windows
x-IBM1097 | Cp1097 | ibm1097 ibm-1097 1097 cp1097 | IBM Iran (Farsi)/Persian
x-IBM1098 | Cp1098 | ibm-1098 1098 cp1098 ibm1098 | IBM Iran (Farsi)/Persian (PC)
x-IBM1112 | Cp1112 | ibm1112 ibm-1112 1112 cp1112 | IBM Latvia, Lithuania
x-IBM1122 | Cp1122 | cp1122 ibm1122 ibm-1122 1122 | IBM Estonia
x-IBM1123 | Cp1123 | ibm1123 ibm-1123 1123 cp1123 | IBM Ukraine
x-IBM1124 | Cp1124 | ibm-1124 1124 cp1124 ibm1124 | IBM AIX Ukraine
x-IBM1166 | Cp1166 | cp1166 ibm1166 ibm-1166 1166 | IBM Cyrillic Multilingual with euro for Kazakhstan
x-IBM1364 | Cp1364 | cp1364 ibm1364 ibm-1364 1364 | IBM EBCDIC KS X 1005-1
x-IBM1381 | Cp1381 | cp1381 ibm-1381 1381 ibm1381 | IBM OS/2, DOS People's Republic of China (PRC)
x-IBM1383 | Cp1383 | ibm1383 ibm-1383 1383 cp1383 | IBM AIX People's Republic of China (PRC)
x-IBM300 | Cp300 | cp300 ibm300 300 ibm-300 | IBM Japanese Latin Host Double-Byte
x-IBM33722 | Cp33722 | 33722 ibm-33722 cp33722 ibm33722 ibm-5050 ibm-33722_vascii_vpua | IBM-eucJP - Japanese (superset of 5050)
x-IBM833 | Cp833 | ibm833 cp833 ibm-833 | IBM Korean Host Extended SBCS
x-IBM834 | Cp834 | ibm834 834 cp834 ibm-834 | IBM EBCDIC DBCS-only Korean
x-IBM856 | Cp856 | ibm856 856 cp856 ibm-856 | IBM Hebrew
x-IBM875 | Cp875 | ibm-875 ibm875 875 cp875 | IBM Greek
x-IBM921 | Cp921 | ibm921 921 ibm-921 cp921 | IBM Latvia, Lithuania (AIX, DOS)
x-IBM922 | Cp922 | ibm922 922 cp922 ibm-922 | IBM Estonia (AIX, DOS)
x-IBM930 | Cp930 | ibm-930 ibm930 930 cp930 | Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026
x-IBM933 | Cp933 | ibm933 933 cp933 ibm-933 | Korean Mixed with 1880 UDC, superset of 5029
x-IBM935 | Cp935 | cp935 ibm935 935 ibm-935 | Simplified Chinese Host mixed with 1880 UDC, superset of 5031
x-IBM937 | Cp937 | ibm-937 ibm937 937 cp937 | Traditional Chinese Host mixed with 6204 UDC, superset of 5033
x-IBM939 | Cp939 | ibm-939 cp939 ibm939 939 | Japanese Latin Kanji mixed with 4370 UDC, superset of 5035
x-IBM942 | Cp942 | ibm-942 cp942 ibm942 942 | IBM OS/2 Japanese, superset of Cp932
x-IBM942C | Cp942C | ibm942C cp942C ibm-942C 942C | Variant of Cp942
x-IBM943 | Cp943 | ibm943 943 ibm-943 cp943 | IBM OS/2 Japanese, superset of Cp932 and Shift-JIS
x-IBM943C | Cp943C | 943C cp943C ibm943C ibm-943C | Variant of Cp943
x-IBM948 | Cp948 | ibm-948 ibm948 948 cp948 | OS/2 Chinese (Taiwan) superset of 938
x-IBM949 | Cp949 | ibm-949 ibm949 949 cp949 | PC Korean
x-IBM949C | Cp949C | ibm949C ibm-949C cp949C 949C | Variant of Cp949
x-IBM950 | Cp950 | cp950 ibm950 950 ibm-950 | PC Chinese (Hong Kong, Taiwan)
x-IBM964 | Cp964 | ibm-964 cp964 ibm964 964 | AIX Chinese (Taiwan)
x-IBM970 | Cp970 | ibm970 ibm-eucKR 970 cp970 ibm-970 | AIX Korean
x-ISCII91 | ISCII91 | ISCII91 iso-ir-153 iscii ST_SEV_358-88 csISO153GOST1976874 | ISCII91 encoding of Indic scripts
x-ISO2022-CN-CNS | ISO2022_CN_CNS | Not available | CNS11643 in ISO 2022 CN form, Traditional Chinese (conversion from Unicode only)
x-ISO2022-CN-GB | ISO2022_CN_GB | Not available | GB2312 in ISO 2022 CN form, Simplified Chinese (conversion from Unicode only)
x-iso-8859-11 | x-iso-8859-11 | iso-8859-11 iso8859_11 | Latin/Thai Alphabet
x-JIS0208 | x-JIS0208 | JIS0208 JIS_C6226-1983 iso-ir-87 x0208 JIS_X0208-1983 csISO87JISX0208 | JIS X 0208
x-JISAutoDetect | JISAutoDetect | JISAutoDetect | Detects and converts from Shift-JIS, EUC-JP, ISO 2022 JP (conversion to Unicode only)
x-Johab | x-Johab | ms1361 ksc5601_1992 johab ksc5601-1992 | Korean, Johab character set
x-MacArabic | MacArabic | MacArabic | Macintosh Arabic
x-MacCentralEurope | MacCentralEurope | MacCentralEurope | Macintosh Latin-2
x-MacCroatian | MacCroatian | MacCroatian | Macintosh Croatian
x-MacCyrillic | MacCyrillic | MacCyrillic | Macintosh Cyrillic
x-MacDingbat | MacDingbat | MacDingbat | Macintosh Dingbat
x-MacGreek | MacGreek | MacGreek | Macintosh Greek
x-MacHebrew | MacHebrew | MacHebrew | Macintosh Hebrew
x-MacIceland | MacIceland | MacIceland | Macintosh Iceland
x-MacRoman | MacRoman | MacRoman | Macintosh Roman
x-MacRomania | MacRomania | MacRomania | Macintosh Romania
x-MacSymbol | MacSymbol | MacSymbol | Macintosh Symbol
x-MacThai | MacThai | MacThai | Macintosh Thai
x-MacTurkish | MacTurkish | MacTurkish | Macintosh Turkish
x-MacUkraine | MacUkraine | MacUkraine | Macintosh Ukraine
x-MS932_0213 | x-MS932_0213 | Not available | Shift_JISX0213 Windows MS932 Variant
x-MS950-HKSCS | MS950_HKSCS | MS950_HKSCS | Windows Traditional Chinese with Hong Kong extensions
x-MS950-HKSCS-XP | x-MS950-HKSCS-XP | MS950_HKSCS_XP | HKSCS Windows XP Variant
x-mswin-936 | MS936 | ms936 ms_936 | Windows Simplified Chinese
x-PCK | PCK | pck | Solaris version of Shift_JIS
x-SJIS_0213 | x-SJIS_0213 | Not available | Shift_JISX0213
x-windows-50220 | Cp50220 | cp50220 ms50220 | Windows Codepage 50220 (7-bit implementation)
x-windows-50221 | Cp50221 | cp50221 ms50221 | Windows Codepage 50221 (7-bit implementation)
x-windows-874 | MS874 | ms-874 ms874 windows-874 | Windows Thai
x-windows-949 | MS949 | windows949 ms949 windows-949 ms_949 | Windows Korean
x-windows-950 | MS950 | ms950 windows-950 | Windows Traditional Chinese
x-windows-iso2022jp | x-windows-iso2022jp | windows-iso2022jp | Variant ISO-2022-JP (MS932 based)
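
Some rows in the table above are marked "conversion to Unicode only". One way to detect such an encoding at run time is Charset.canEncode(), which returns false for a charset that cannot convert from Unicode. The sketch below is illustrative only; the class ConversionDirection and the method describe are hypothetical names, and the result for the extended encodings depends on whether the runtime includes them.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class ConversionDirection {

    // Summarizes whether a charset is available and whether it can encode.
    static String describe(String name) {
        if (!Charset.isSupported(name)) {
            return name + ": not supported by this runtime";
        }
        Charset cs = Charset.forName(name);
        return cs.name() + (cs.canEncode()
                ? ": encodes and decodes"
                : ": decodes only (conversion to Unicode only)");
    }

    public static void main(String[] args) {
        // UTF-8 is always present; the other two belong to the extended set.
        for (String name : new String[] {"UTF-8", "ISO-2022-CN", "x-JISAutoDetect"}) {
            System.out.println(describe(name));
        }
    }
}
```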

Copyright © 1993, 2023, Oracle and/or its affiliates. All rights reserved.