iconv_ja(5)						Standards, Environments, and Macros					       iconv_ja(5)

NAME
       iconv_ja - code set conversions in ja locale

DESCRIPTION
       The following code set conversions are supported:

       +-----------------------------------------------+
       | Code Set Conversions Supported                |
       +-----------------------+-----------------------+
       | Source Code           | Target Code           |
       +-----------------------+-----------------------+
       | eucJP                 | PCK                   |
       | eucJP                 | ISO-2022-JP           |
       | eucJP                 | ISO-2022-JP.RFC1468   |
       | eucJP                 | JIS7                  |
       | eucJP                 | SJIS                  |
       | eucJP                 | UTF-8                 |
       | eucJP                 | UTF-8-Java            |
       | eucJP                 | jis                   |
       | eucJP                 | ibmj                  |
       | eucJP                 | ibmj-EBCDIK           |
       | SJIS                  | eucJP                 |
       | SJIS                  | ISO-2022-JP           |
       | SJIS                  | UTF-8                 |
       | SJIS                  | jis                   |
       | SJIS                  | ibmj                  |
       | PCK                   | eucJP                 |
       | PCK                   | UTF-8                 |
       | PCK                   | UTF-8-Java            |
       | PCK                   | ISO-2022-JP           |
       | PCK                   | ISO-2022-JP.RFC1468   |
       | PCK                   | jis                   |
       | PCK                   | ibmj                  |
       | PCK                   | ibmj-EBCDIK           |
       | ISO-2022-JP           | eucJP                 |
       | ISO-2022-JP           | PCK                   |
       | ISO-2022-JP           | SJIS                  |
       | ISO-2022-JP           | UTF-8                 |
       | UTF-8                 | eucJP                 |
       | UTF-8                 | SJIS                  |
       | UTF-8                 | PCK                   |
       | UTF-8                 | ISO-2022-JP           |
       | UTF-8                 | ISO-2022-JP.RFC1468   |
       | UTF-8-Java            | eucJP                 |
       | UTF-8-Java            | PCK                   |
       | JIS7                  | eucJP                 |
       | jis                   | eucJP                 |
       | jis                   | PCK                   |
       | jis                   | SJIS                  |
       | ibmj                  | eucJP                 |
       | ibmj                  | PCK                   |
       | ibmj                  | SJIS                  |
       | ibmj-EBCDIK           | eucJP                 |
       | ibmj-EBCDIK           | PCK                   |
       +-----------------------+-----------------------+

       +-----------------------------------------------+
       | Code Set Conversions Supported                |
       +-----------------------+-----------------------+
       | Source Code           | Target Code           |
       +-----------------------+-----------------------+
       | eucJP                 | ibm930                |
       | eucJP                 | ibm931                |
       | eucJP                 | ibm939                |
       | eucJP                 | ibm5026               |
       | eucJP                 | ibm5035               |
       | eucJP                 | FujitsuJEF-ascii-code |
       | eucJP                 | FujitsuJEF-kana-code  |
       | eucJP                 | FujitsuJEF-ascii-face |
       | eucJP                 | FujitsuJEF-kana-face  |
       | eucJP                 | HitachiKEIS83         |
       | eucJP                 | HitachiKEIS90         |
       | eucJP                 | NECJIPS               |
       | PCK                   | ibm930                |
       | PCK                   | ibm931                |
       | PCK                   | ibm939                |
       | PCK                   | ibm5026               |
       | PCK                   | ibm5035               |
       | PCK                   | FujitsuJEF-ascii-code |
       | PCK                   | FujitsuJEF-kana-code  |
       | PCK                   | FujitsuJEF-ascii-face |
       | PCK                   | FujitsuJEF-kana-face  |
       | PCK                   | HitachiKEIS83         |
       | PCK                   | HitachiKEIS90         |
       | PCK                   | NECJIPS               |
       | UTF-8                 | ibm930                |
       | UTF-8                 | ibm931                |
       | UTF-8                 | ibm939                |
       | UTF-8                 | ibm5026               |
       | UTF-8                 | ibm5035               |
       | UTF-8                 | ms932                 |
       | UTF-8                 | UTF-8-ms932           |
       | UTF-8                 | FujitsuJEF-ascii-code |
       | UTF-8                 | FujitsuJEF-kana-code  |
       | UTF-8                 | FujitsuJEF-ascii-face |
       | UTF-8                 | FujitsuJEF-kana-face  |
       | UTF-8                 | HitachiKEIS83         |
       | UTF-8                 | HitachiKEIS90         |
       | UTF-8                 | NECJIPS               |
       | UTF-8-ms932           | UTF-8                 |
       | ibm930                | eucJP                 |
       | ibm930                | PCK                   |
       | ibm930                | UTF-8                 |
       | ibm931                | eucJP                 |
       | ibm931                | PCK                   |
       | ibm931                | UTF-8                 |
       | ibm939                | eucJP                 |
       | ibm939                | PCK                   |
       | ibm939                | UTF-8                 |
       | ibm5026               | eucJP                 |
       | ibm5026               | PCK                   |
       | ibm5026               | UTF-8                 |
       | ibm5035               | eucJP                 |
       | ibm5035               | PCK                   |
       | ibm5035               | UTF-8                 |
       | FujitsuJEF-ascii-code | eucJP                 |
       | FujitsuJEF-ascii-code | PCK                   |
       | FujitsuJEF-ascii-code | UTF-8                 |
       | FujitsuJEF-kana-code  | eucJP                 |
       | FujitsuJEF-kana-code  | PCK                   |
       | FujitsuJEF-kana-code  | UTF-8                 |
       | FujitsuJEF-ascii-face | eucJP                 |
       | FujitsuJEF-ascii-face | PCK                   |
       | FujitsuJEF-ascii-face | UTF-8                 |
       | FujitsuJEF-kana-face  | eucJP                 |
       | FujitsuJEF-kana-face  | PCK                   |
       | FujitsuJEF-kana-face  | UTF-8                 |
       | HitachiKEIS83         | eucJP                 |
       | HitachiKEIS83         | PCK                   |
       | HitachiKEIS83         | UTF-8                 |
       | HitachiKEIS90         | eucJP                 |
       | HitachiKEIS90         | PCK                   |
       | HitachiKEIS90         | UTF-8                 |
       | NECJIPS               | eucJP                 |
       | NECJIPS               | PCK                   |
       | NECJIPS               | UTF-8                 |
       | ms932                 | UTF-8                 |
       +-----------------------+-----------------------+

       The code sets in the tables above are described as follows:

       +------------------------------------------------------------------+
       | Description of Supported Code Sets                               |
       +-----------------------+------------------------------------------+
       | Codeset               | Description                              |
       +-----------------------+------------------------------------------+
       | eucJP                 | Japanese EUC                             |
       | PCK                   | PC kanji                                 |
       | SJIS                  | Same as PC kanji (end of life in the     |
       |                       | future)                                  |
       | ISO-2022-JP           | Coded representation of the character    |
       |                       | sets ISO 646 IRV or JIS X 0201, JIS X    |
       |                       | 0208, and JIS X 0212 according to UI/OSF |
       |                       | Application Platform Profile for         |
       |                       | Japanese Environment Version 1.1 item    |
       |                       | 7.1 using the designation sequence to G0 |
       |                       | specified by ISO 2022                    |
       | JIS7                  | Same as ISO-2022-JP                      |
       | ISO-2022-JP.RFC1468   | Coded representation of the character    |
       |                       | sets ISO 646 IRV or JIS X 0201-1976      |
       |                       | (except for figure character set for     |
       |                       | katakana), and JIS X 0208-1983 according |
       |                       | to RFC1468 (Request for Comments: 1468   |
       |                       | Japanese Character Encoding for Internet |
       |                       | Messages) using the designation sequence |
       |                       | to G0 specified by ISO 2022              |
       | jis                   | JIS 7-bit code used in JLE, JFP 2.4 and  |
       |                       | the preceding releases                   |
       | ibmj                  | IBM Kanji code                           |
       | ibmj-EBCDIK           | Maps single-byte code set (SBCS) of IBM  |
       |                       | host code to the character set that is   |
       |                       | called the EBCDIK code set in general.   |
       |                       | The character code set includes the IBM  |
       |                       | code page 290 and three more characters  |
       |                       | '`' (0x79), '{' (0xc0), and '}' (0xd0).  |
       |                       | Japanese katakana characters are         |
       |                       | included, but lowercase alphabet letters |
       |                       | are not. In case of double-byte code set |
       |                       | (DBCS), the description is the same as   |
       |                       | the code set "ibmj".                     |
       | UTF-8                 | Unicode                                  |
       | UTF-8-Java            | Unicode implemented in Java              |
       | ibm930                | IBM CCSID 930: SBCS code page 290        |
       |                       | (extended), character set 1172, DBCS     |
       |                       | code page 300, character set 1001,       |
       |                       | 4370 user-defined characters             |
       | ibm931                | IBM CCSID 931: SBCS code page 37,        |
       |                       | character set 101, DBCS code page 300,   |
       |                       | character set 1001, 4370 user-defined    |
       |                       | characters                               |
       | ibm939                | IBM CCSID 939: SBCS code page 1027,      |
       |                       | character set 1172, DBCS code page 300,  |
       |                       | character set 1001, 4370 user-defined    |
       |                       | characters                               |
       | ibm5026               | IBM CCSID 5026: same as ibm930, except   |
       |                       | this code set supports 1880              |
       |                       | user-defined characters                  |
       | ibm5035               | IBM CCSID 5035: same as ibm939, except   |
       |                       | this code set supports 1880              |
       |                       | user-defined characters                  |
       | FujitsuJEF-ascii-code | Fujitsu JEF code which uses              |
       |                       | EBCDIC(ASCII) as single byte character   |
       |                       | set. Characters differently handled      |
       |                       | between JIS C 6226:1978 and JIS X        |
       |                       | 0208:1983 are mapped based on code       |
       |                       | value.                                   |
       | FujitsuJEF-ascii-face | Fujitsu JEF code which uses              |
       |                       | EBCDIC(ASCII) as single byte character   |
       |                       | set. Characters differently handled      |
       |                       | between JIS C 6226:1978 and JIS X        |
       |                       | 0208:1983 are mapped based on character  |
       |                       | face.                                    |
       | FujitsuJEF-kana-code  | Fujitsu JEF code which uses              |
       |                       | EBCDIC(Kana) as single byte character    |
       |                       | set. Characters differently handled      |
       |                       | between JIS C 6226:1978 and JIS X        |
       |                       | 0208:1983 are mapped based on code       |
       |                       | value.                                   |
       | FujitsuJEF-kana-face  | Fujitsu JEF code which uses              |
       |                       | EBCDIC(Kana) as single byte character    |
       |                       | set. Characters differently handled      |
       |                       | between JIS C 6226:1978 and JIS X        |
       |                       | 0208:1983 are mapped based on character  |
       |                       | face.                                    |
       | HitachiKEIS83         | Hitachi KEIS83                           |
       | HitachiKEIS90         | Hitachi KEIS90                           |
       | NECJIPS               | NEC JIPS code                            |
       | ms932                 | Shift JIS codeset which is supported by  |
       |                       | Windows NT 3.51. Conversion between this |
       |                       | codeset and UTF-8 is done in the same    |
       |                       | way Windows NT 3.51 does.                |
       | UTF-8-ms932           | UTF-8 encoded Unicode which was          |
       |                       | converted from ms932                     |
       +-----------------------+------------------------------------------+

       Conversions are performed as described below. For all conversions,
       characters in the source code set that do not exist in the target code
       set are converted to, and output as, a substitute character.

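       The substitution rule stated above can be illustrated with a minimal
       Python sketch. It assumes that Python's built-in euc_jp and shift_jis
       codecs are acceptable stand-ins for the eucJP and PCK code sets; they
       are not the Solaris converters, and their mapping tables may differ in
       detail. The helper name convert() is hypothetical.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data, writing '?' for characters the target lacks."""
    text = data.decode(source)  # malformed input raises UnicodeDecodeError
    return text.encode(target, errors="replace")


# The kanji exist in both Japanese code sets; the Hangul syllable does not,
# so it is written as the substitute character '?' (0x3f).
utf8_input = "日本한".encode("utf-8")
euc = convert(utf8_input, "utf-8", "euc_jp")
sjis = convert(utf8_input, "utf-8", "shift_jis")

print(euc.decode("euc_jp"))
print(sjis.decode("shift_jis"))
```

       Decoding the result with the target codec, as the last two lines do,
       is a quick way to see where substitution took place.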
   eucJP to PCK (SJIS) and PCK (SJIS) to eucJP
       Conversion between eucJP and PCK (SJIS) can be used to convert
       JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
       vendor-defined characters based on the TOG Japanese Vendors Council
       (TOG/JVC) Recommended Code Set Conversion Specification between
       Japanese EUC and Shift-JIS. If input data that does not belong to the
       source code set is encountered, iconv(3C) fails and sets errno to
       EILSEQ. iconv(1) stops at the last point of successful conversion.

   eucJP to ISO-2022-JP (JIS7) and ISO-2022-JP (JIS7) to eucJP
       Conversion between eucJP and ISO-2022-JP (JIS7) can be used to convert
       JIS X 0201, JIS X 0208, and JIS X 0212. If input data that does not
       belong to the source code set is encountered, iconv(3C) fails and sets
       errno to EILSEQ. iconv(1) stops at the last point of successful
       conversion.

   eucJP to ISO-2022-JP.RFC1468
       Conversion from eucJP to ISO-2022-JP.RFC1468 can be used to convert
       JIS X 0201 (except for the figure character set for katakana) and
       JIS X 0208. If a JIS X 0201 (figure character set for katakana),
       JIS X 0212, user-defined, or vendor-defined character is encountered
       in the input data, it is replaced with the substitute character '?'
       (0x3f). If input data that does not belong to these code sets is
       encountered, iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops
       at the last point of successful conversion.

   eucJP to jis and jis to eucJP
       Conversion between eucJP and jis is provided for compatibility with
       the ujtojis7() and jis7touj() library functions and the euctojis and
       jistoeuc utilities. It is extended to handle JIS X 0212. See
       jistoeuc(1).

   eucJP to UTF-8 and UTF-8 to eucJP
       Conversion between eucJP and UTF-8 can be used to convert JIS X 0201,
       JIS X 0208, JIS X 0212, user-defined, and vendor-defined characters.
       If input data that has no corresponding character in the target code
       set is encountered, it is replaced with the substitute character
       (eucJP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)).
       If input data that does not belong to these code sets is encountered,
       iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at the last
       point of successful conversion.

   eucJP to UTF-8-Java and UTF-8-Java to eucJP
       Conversion between eucJP and UTF-8-Java can be used to convert
       JIS X 0201, JIS X 0208, and JIS X 0212. If a user-defined or
       vendor-defined character is encountered in the input data, it is
       replaced with the substitute character (eucJP: '?' (0x3f), UTF-8:
       U+FFFD (0xefbfbd)). If input data that does not belong to these code
       sets is encountered, iconv(3C) fails and sets errno to EILSEQ.
       iconv(1) stops at the last point of successful conversion.

   eucJP to ibmj and ibmj to eucJP
       Conversion between eucJP and ibmj is based on the IBM Kanji codebook
       (4th edition - September 1987), JIS X 0201, and JIS X 0208-1983. If
       you convert eucJP to ibmj, JIS X 0201 and JIS X 0201 are all converted
       to the substitute character.

   eucJP to ibmj-EBCDIK and ibmj-EBCDIK to eucJP
       Conversion between eucJP and ibmj-EBCDIK is based on the IBM Kanji
       codebook (4th edition - September 1987), JIS X 0201, and
       JIS X 0208-1983. If you convert eucJP to ibmj-EBCDIK, JIS X 0201 and
       JIS X 0201 characters that have no corresponding characters in
       ibmj-EBCDIK are all converted to the substitute character.

   PCK (SJIS) to ISO-2022-JP and ISO-2022-JP to PCK (SJIS)
       Conversion between PCK (SJIS) and ISO-2022-JP can be used to convert
       JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
       vendor-defined characters based on the TOG Japanese Vendors Council
       (TOG/JVC) Recommended Code Set Conversion Specification between
       Japanese EUC and Shift-JIS. If input data that does not belong to the
       source code set is encountered, iconv(3C) fails and sets errno to
       EILSEQ. iconv(1) stops at the last point of successful conversion.

   PCK (SJIS) to ISO-2022-JP.RFC1468
       Conversion from PCK (SJIS) to ISO-2022-JP.RFC1468 can be used to
       convert JIS X 0201 (except for the figure character set for katakana)
       and JIS X 0208.
       If a JIS X 0201 (figure character set for katakana), user-defined, or
       vendor-defined character is encountered in the input data, it is
       replaced with the substitute character '?' (0x3f). If input data that
       does not belong to these code sets is encountered, iconv(3C) fails and
       sets errno to EILSEQ. iconv(1) stops at the last point of successful
       conversion.

   PCK (SJIS) to UTF-8 and UTF-8 to PCK (SJIS)
       Conversion between PCK (SJIS) and UTF-8 can be used to convert
       JIS X 0201, JIS X 0208, user-defined, and vendor-defined characters.
       If input data that has no corresponding character in the target code
       set is encountered, it is replaced with the substitute character
       (PCK: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data that does
       not belong to these code sets is encountered, iconv(3C) fails and sets
       errno to EILSEQ. iconv(1) stops at the last point of successful
       conversion.

   PCK (SJIS) to UTF-8-Java and UTF-8-Java to PCK (SJIS)
       Conversion between PCK (SJIS) and UTF-8-Java can be used to convert
       JIS X 0201 and JIS X 0208. If a user-defined or vendor-defined
       character is encountered in the input data, it is replaced with the
       substitute character (PCK: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If
       input data that does not belong to these code sets is encountered,
       iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at the last
       point of successful conversion.

   PCK (SJIS) to jis and jis to PCK (SJIS)
       Conversion between PCK (SJIS) and jis is provided for compatibility
       with the sjtojis7() and jis7tosj() library functions and the sjtojis
       and jistosj utilities. It is extended based on the TOG Japanese
       Vendors Council (TOG/JVC) Recommended Code Set Conversion
       Specification between Japanese EUC and Shift-JIS. See jistosj(1).

   PCK (SJIS) to ibmj and ibmj to PCK (SJIS)
       Conversion between PCK (SJIS) and ibmj is based on the IBM Kanji
       codebook (4th edition - September 1987), JIS X 0201, and
       JIS X 0208-1983.
       If you convert PCK (SJIS) to ibmj, kana characters (0xa1 to 0xdf) and
       all characters that the TOG Japanese Vendors Council (TOG/JVC)
       Recommended Code Set Conversion Specification between Japanese EUC and
       Shift-JIS converts to JIS X 0212 are converted to the substitute
       character.

   PCK to ibmj-EBCDIK and ibmj-EBCDIK to PCK
       Conversion between PCK and ibmj-EBCDIK is based on the IBM Kanji
       codebook (4th edition - September 1987), JIS X 0201, and
       JIS X 0208-1983. If you convert PCK to ibmj-EBCDIK, all characters
       that the TOG Japanese Vendors Council (TOG/JVC) Recommended Code Set
       Conversion Specification between Japanese EUC and Shift-JIS converts
       to JIS X 0212 are converted to the substitute character.

   ISO-2022-JP to UTF-8 and UTF-8 to ISO-2022-JP
       Conversion between ISO-2022-JP and UTF-8 can be used to convert
       JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
       vendor-defined characters. If input data that has no corresponding
       character in the target code set is encountered, it is replaced with
       the substitute character (ISO-2022-JP: '?' (0x3f), UTF-8: U+FFFD
       (0xefbfbd)). If input data that does not belong to these code sets is
       encountered, iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops
       at the last point of successful conversion.

   UTF-8 to ISO-2022-JP.RFC1468
       Conversion from UTF-8 to ISO-2022-JP.RFC1468 can be used to convert
       JIS X 0201 (except for the figure character set for katakana) and
       JIS X 0208. If a JIS X 0201 (figure character set for katakana),
       JIS X 0212, user-defined, or vendor-defined character is encountered
       in the input data, it is replaced with the substitute character '?'
       (0x3f). If input data that does not belong to these code sets is
       encountered, iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops
       at the last point of successful conversion.

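       The two failure modes described in the sections above (substitution
       for unmappable characters, and stopping at invalid input) can be
       sketched in Python. This is only an analogy: it assumes Python's
       euc_jp and iso2022_jp codecs in place of the eucJP and
       ISO-2022-JP.RFC1468 converters, treats UnicodeDecodeError as the
       counterpart of EILSEQ, and uses a hypothetical helper named
       convert_prefix().

```python
def convert_prefix(data: bytes, source: str, target: str):
    """Convert the longest valid prefix of data from source to target.

    Returns (converted_bytes, consumed). consumed is the number of input
    bytes converted before the first invalid sequence, or len(data) if the
    whole input was valid.
    """
    try:
        text = data.decode(source)
        consumed = len(data)
    except UnicodeDecodeError as err:
        # Counterpart of EILSEQ: keep only the input before the bad byte.
        consumed = err.start
        text = data[:consumed].decode(source)
    # Characters missing from the target become '?' (0x3f).
    return text.encode(target, errors="replace"), consumed


# 0xff cannot start an EUC-JP character, so conversion stops after 2 bytes.
out, consumed = convert_prefix(b"\xa4\xa2\xff", "euc_jp", "utf-8")
print(out.decode("utf-8"), consumed)

# 0x8e 0xb1 is a JIS X 0201 katakana character in EUC-JP. Python's
# iso2022_jp codec has no katakana designation, so '?' is written instead.
out2, consumed2 = convert_prefix(b"\xa4\xa2\x8e\xb1", "euc_jp", "iso2022_jp")
print(out2.decode("iso2022_jp"), consumed2)
```

       A caller that needs to resume after the invalid sequence can use the
       returned offset, much as a program calling iconv(3C) inspects the
       updated input pointer after a failed call.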
   eucJP, PCK, or UTF-8 to ibm930, ibm931, ibm939, ibm5026, or ibm5035
       Conversion from eucJP, PCK, or UTF-8 to ibm930, ibm931, ibm939,
       ibm5026, or ibm5035 can be used to convert JIS X 0201, JIS X 0208,
       JIS X 0212, IBM extension characters, and user-defined characters.
       Input data that has no corresponding character in the target code set
       is replaced with the substitute character. Because ibm931 does not
       support kana characters in its single-byte code set (SBCS), JIS X 0201
       kana characters are replaced with substitute characters in conversion
       to ibm931.

   ibm930, ibm931, ibm939, ibm5026, or ibm5035 to eucJP, PCK, or UTF-8
       Conversion from ibm930, ibm931, ibm939, ibm5026, or ibm5035 to eucJP,
       PCK, or UTF-8 can be used to convert the SBCS/DBCS characters defined
       in the input code set. Input data that has no corresponding character
       in the target code set is replaced with the substitute character.

   FujitsuJEF-ascii-code or FujitsuJEF-kana-code to eucJP, PCK, or UTF-8
   eucJP, PCK, or UTF-8 to FujitsuJEF-ascii-code or FujitsuJEF-kana-code
       Conversion from FujitsuJEF-ascii-code or FujitsuJEF-kana-code to
       eucJP, PCK, or UTF-8, and from eucJP, PCK, or UTF-8 to
       FujitsuJEF-ascii-code or FujitsuJEF-kana-code, can be used to convert
       the JIS X 0201, JIS X 0208, and JIS X 0212 characters defined in the
       input code set. Input data that has no corresponding character in the
       target code set is replaced with the substitute character. Characters
       handled differently between JIS C 6226:1978 and JIS X 0208:1983 are
       converted based on code value. In conversion to FujitsuJEF-kana-code,
       JIS X 0201 katakana characters are replaced with the substitute
       character.

   FujitsuJEF-ascii-face or FujitsuJEF-kana-face to eucJP, PCK, or UTF-8
   eucJP, PCK, or UTF-8 to FujitsuJEF-ascii-face or FujitsuJEF-kana-face
       Characters handled differently between JIS C 6226:1978 and
       JIS X 0208:1983 are converted based on character face. All other
       behavior is the same as that of FujitsuJEF-ascii-code or
       FujitsuJEF-kana-code.

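       The remark above that kana is substituted in conversion to ibm931
       follows from its single-byte code page. As a rough illustration only,
       the sketch below uses Python's cp037 codec (IBM code page 37, the SBCS
       that the code set table lists for ibm931) to check which characters
       that code page can represent. Python has no ibm931 codec, so DBCS
       handling is not shown, and the helper name representable() is
       hypothetical.

```python
def representable(char: str, codec: str) -> bool:
    """Return True if codec can encode char without substitution."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Latin letter A versus half-width katakana A (JIS X 0201 kana, U+FF71).
for ch in ("A", "\uff71"):
    print(repr(ch), representable(ch, "cp037"))
```

       A converter targeting such a code page has to emit a substitute for
       every character that fails this kind of check.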
   HitachiKEIS83 to eucJP, PCK, or UTF-8
   eucJP, PCK, or UTF-8 to HitachiKEIS83
       Conversion from HitachiKEIS83 to eucJP, PCK, or UTF-8, and from eucJP,
       PCK, or UTF-8 to HitachiKEIS83, can be used to convert JIS X 0201,
       JIS X 0208, and JIS X 0212 characters. Input data that has no
       corresponding character in the target code set is replaced with the
       substitute character.

   HitachiKEIS90 to eucJP, PCK, or UTF-8
   eucJP, PCK, or UTF-8 to HitachiKEIS90
       Conversion from HitachiKEIS90 to eucJP, PCK, or UTF-8, and from eucJP,
       PCK, or UTF-8 to HitachiKEIS90, behaves the same as the HitachiKEIS83
       conversion, except that it uses KEIS90 instead of KEIS83.

   NECJIPS to eucJP, PCK, or UTF-8
   eucJP, PCK, or UTF-8 to NECJIPS
       Conversion from NECJIPS to eucJP, PCK, or UTF-8, and from eucJP, PCK,
       or UTF-8 to NECJIPS, can be used to convert JIS X 0201, JIS X 0208,
       and JIS X 0212 characters. Input data that has no corresponding
       character in the target code set is replaced with the substitute
       character. EBCDIC Katakana (EK) code is used for the single-byte
       character set.

   ms932 to UTF-8 and UTF-8 to ms932
       Conversion between ms932 and UTF-8 uses the same character mapping
       between the two codesets as Windows NT 3.51.

   UTF-8 to UTF-8-ms932 and UTF-8-ms932 to UTF-8
       This converts between "UTF-8", which is UTF-8 encoded Unicode
       converted from PCK, and "UTF-8-ms932", which is UTF-8 encoded Unicode
       converted from ms932.

SEE ALSO
       iconv(1), jistoeuc(1), jistosj(1), iconv(3C), iconv(5),
       iconv_unicode(5)

SunOS 5.10                        10 Jan 2003                      iconv_ja(5)