Home Man
Search
Today's Posts
Register

Linux & Unix Commands - Search Man Pages

OpenSolaris 2009.06 - man page for iconv_unicode (opensolaris section 5)

iconv_unicode(5)	       Standards, Environments, and Macros		 iconv_unicode(5)

NAME
       iconv_unicode - code set conversion tables for Unicode

DESCRIPTION
       The following code set conversions are supported:

			     CODE SET CONVERSIONS SUPPORTED
			     ------------------------------
	   FROM Code Set			       TO Code Set
	       Code		 FROM	       Target Code	      TO
				 Filename			      Filename
				 Element			      Element

	 ISO 8859-1 (Latin 1)	 8859-1 	   UTF-8	       UTF-8
	 ISO 8859-2 (Latin 2)	 8859-2 	   UTF-8	       UTF-8
	 ISO 8859-3 (Latin 3)	 8859-3 	   UTF-8	       UTF-8
	 ISO 8859-4 (Latin 4)	 8859-4 	   UTF-8	       UTF-8
	 ISO 8859-5 (Cyrillic)	 8859-5 	   UTF-8	       UTF-8
	 ISO 8859-6 (Arabic)	 8859-6 	   UTF-8	       UTF-8
	 ISO 8859-7 (Greek)	 8859-7 	   UTF-8	       UTF-8
	 ISO 8859-8 (Hebrew)	 8859-8 	   UTF-8	       UTF-8
	 ISO 8859-9 (Latin 5)	 8859-9 	   UTF-8	       UTF-8
	 ISO 8859-10 (Latin 6)	 8859-10	   UTF-8	       UTF-8
	 Japanese EUC		 eucJP		   UTF-8	       UTF-8
	 Chinese/PRC EUC
	 (GB 2312-1980) 	 gb2312 	   UTF-8	       UTF-8
	 ISO-2022		 iso2022	   UTF-8	       UTF-8
	 Korean EUC		 ko_KR-euc	   Korean UTF-8        ko_KR-UTF-8
	 ISO-2022-KR		 ko_KR-iso2022-7   Korean UTF-8        ko_KR_UTF-8
	 Korean Johap
	 (KS C 5601-1987)	 ko_KR-johap	   Korean UTF-8        ko_KR-UTF-8
	 Korean Johap
	 (KS C 5601-1992)	 ko_KR-johap92	   Korean UTF-8        ko_KR-UTF-8
	 Korean UTF-8		 ko_KR-UTF-8	   Korean EUC	       ko_KR-euc
	 Korean UTF-8		 ko_KR-UTF-8	   Korean Johap        ko_KR-johap
						   (KS C 5601-1987)
	 Korean UTF-8		 ko_KR-UTF-8	   Korean Johap        ko_KR-johap92
						   (KS C 5601-1992)
	 KOI8-R (Cyrillic)	 KOI8-R 	   UCS-2	       UCS-2
	 KOI8-R (Cyrillic)	 KOI8-R 	   UTF-8	       UTF-8
	 PC Kanji (SJIS)	 PCK		   UTF-8	       UTF-8
	 PC Kanji (SJIS)	 SJIS		   UTF-8	       UTF-8
	 UCS-2			 UCS-2		   KOI8-R (Cyrillic)   KOI8-R
	 UCS-2			 UCS-2		   UCS-4	       UCS-4

			     CODE SET CONVERSIONS SUPPORTED
			     ------------------------------
	   FROM Code Set			       TO Code Set
	       Code		 FROM	       Target Code	      TO
				 Filename			      Filename
				 Element			      Element

	 UCS-2		    UCS-2	    UTF-7		    UTF-7
	 UCS-2		    UCS-2	    UTF-8		    UTF-8
	 UCS-4		    UCS-4	    UCS-2		    UCS-2
	 UCS-4		    UCS-4	    UTF-16		    UTF-16
	 UCS-4		    UCS-4	    UTF-7		    UTF-7
	 UCS-4		    UCS-4	    UTF-8		    UTF-8
	 UTF-16 	    UTF-16	    UCS-4		    UCS-4
	 UTF-16 	    UTF-16	    UTF-8		    UTF-8
	 UTF-7		    UTF-7	    UCS-2		    UCS-2
	 UTF-7		    UTF-7	    UCS-4		    UCS-4
	 UTF-7		    UTF-7	    UTF-8		    UTF-8
	 UTF-8		    UTF-8	    ISO 8859-1 (Latin 1)    8859-1
	 UTF-8		    UTF-8	    ISO 8859-2 (Latin 2)    8859-2
	 UTF-8		    UTF-8	    ISO 8859-3 (Latin 3)    8859-3
	 UTF-8		    UTF-8	    ISO 8859-4 (Latin 4)    8859-4
	 UTF-8		    UTF-8	    ISO 8859-5 (Cyrillic)   8859-5
	 UTF-8		    UTF-8	    ISO 8859-6 (Arabic)     8859-6
	 UTF-8		    UTF-8	    ISO 8859-7 (Greek)	    8859-7
	 UTF-8		    UTF-8	    ISO 8859-8 (Hebrew)     8859-8
	 UTF-8		    UTF-8	    ISO 8859-9 (Latin 5)    8859-9
	 UTF-8		    UTF-8	    ISO 8859-10 (Latin 6)   8859-10
	 UTF-8		    UTF-8	    Japanese EUC	    eucJP
	 UTF-8		    UTF-8	    Chinese/PRC EUC	    gb2312
					    (GB 2312-1980)
	 UTF-8		    UTF-8	    ISO-2022		    iso2022
	 UTF-8		    UTF-8	    KOI8-R (Cyrillic)	    KOI8-R
	 UTF-8		    UTF-8	    PC Kanji (SJIS)	    PCK
	 UTF-8		    UTF-8	    PC Kanji (SJIS)	    SJIS
	 UTF-8		    UTF-8	    UCS-2		    UCS-2
	 UTF-8		    UTF-8	    UCS-4		    UCS-4
	 UTF-8		    UTF-8	    UTF-16		    UTF-16
	 UTF-8		    UTF-8	    UTF-7		    UTF-7
	 UTF-8		    UTF-8	    Chinese/PRC EUC	    zh_CN.euc
					    (GB 2312-1980)

			     CODE SET CONVERSIONS SUPPORTED
			     ------------------------------
	   FROM Code Set			       TO Code Set
	       Code		 FROM	       Target Code	      TO
				 Filename			      Filename
				 Element			      Element

	 UTF-8		       UTF-8		 ISO 2022-CN	       zh_CN.iso2022-7
	 UTF-8		       UTF-8		 Chinese/Taiwan Big5   zh_TW-big5
	 UTF-8		       UTF-8		 Chinese/Taiwan  EUC   zh_TW-euc
						 (CNS 11643-1992)
	 UTF-8		       UTF-8		 ISO 2022-TW	       zh_TW-iso2022-7
	 Chinese/PRC EUC       zh_CN.euc	 UTF-8		       UTF-8
	 (GB 2312-1980)
	 ISO 2022-CN	       zh_CN.iso2022-7	 UTF-8		       UTF-8
	 Chinese/Taiwan Big5   zh_TW-big5	 UTF-8		       UTF-8
	 Chinese/Taiwan  EUC   zh_TW-euc	 UTF-8		       UTF-8
	 (CNS 11643-1992)
	 ISO 2022-TW	       zh_TW-iso2022-7	 UTF-8		       UTF-8

EXAMPLES
       Example 1 The library module filename

       In  the conversion library, /usr/lib/iconv (see iconv(3C)), the library module filename is
       composed of two symbolic elements separated by the percent  sign  (%).  The  first  symbol
       specifies  the  code  set  that is being converted; the second symbol specifies the target
       code, that is, the code set to which the first one is being converted.

       In the conversion table above, the first  symbol is termed the  "FROM  Filename	Element".
       The second symbol, representing the target code set, is the "TO Filename Element".

       For  example,  the  library module filename to convert from the Korean EUC code set to the
       Korean UTF-8 code set is

       ko_KR-euc%ko_KR-UTF-8

FILES
       /usr/lib/iconv/*.so    conversion modules

SEE ALSO
       iconv(1), iconv(3C), iconv(5)

       Chernov, A., Registration of a Cyrillic Character Set, RFC 1489, RELCOM Development  Team,
       July 1993.

       Chon,  K.,  H.  Je Park, and U. Choi, Korean Character Encoding for Internet Messages, RFC
       1557, Solvit Chosun Media, December 1993.

       Goldsmith, D., and M. Davis, UTF-7 - A Mail-Safe Transformation	Format	of  Unicode,  RFC
       1642, Taligent, Inc., July 1994.

       Lee,  F.,  HZ  - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and ASCII
       characters, RFC 1843, Stanford University, August 1995.

       Murai, J., M. Crispin, and E. van der Poel, Japanese Character Encoding for Internet  Mes-
       sages, RFC 1468, Keio University, Panda Programming, June 1993.

       Nussbacher,  H.,  and  Y.  Bourvine,  Hebrew Character Encoding for Internet Messages, RFC
       1555, Israeli Inter-University, Hebrew University, December 1993.

       Ohta, M., Character Sets ISO-10646 and ISO-10646-J-1, RFC 1815, Tokyo Institute	of  Tech-
       nology, July 1995.

       Ohta,  M.,  and	K. Handa, ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP, RFC 1554,
       Tokyo Institute of Technology, December 1993.

       Reynolds, J., and J. Postel, ASSIGNED NUMBERS, RFC 1700, University of  Southern  Califor-
       nia/Information Sciences Institute, October 1994.

       Simonson, K., Character Mnemonics & Character Sets, RFC 1345, Rationel Almen Planlaegning,
       June 1992.

       Spinellis, D., Greek Character Encoding for Electronic Mail Messages, RFC 1947, SENA S.A.,
       May 1996.

       The  Unicode  Consortium,  The  Unicode	Standard,  Version 2.0, Addison Wesley Developers
       Press, July 1996.

       Wei, Y., Y. Zhang, J. Li, J. Ding, and Y. Jiang, ASCII Printable Characters-Based  Chinese
       Character  Encoding  for Internet Messages, RFC 1842, AsiaInfo Services Inc., Harvard Uni-
       versity, Rice University, University of Maryland, August 1995.

       Yergeau, F., UTF-8, a transformation format of Unicode and ISO 10646, RFC 2044, Alis Tech-
       nologies, October 1996.

       Zhu,  H., D. Hu, Z. Wang, T. Kao, W. Chang, and M. Crispin, Chinese Character Encoding for
       Internet Messages, RFC 1922, Tsinghua University, China Information  Technology	Standard-
       ization	Technical  Committee (CITS), Institute for Information Industry (III), University
       of Washington, March 1996.

NOTES
       ISO 8859 character sets using Latin alphabetic characters are distinguished as follows:

       ISO 8859-1 (Latin 1)	For most West European languages, including:

				Albanian	     Finnish		   Italian
				Catalan 	     French		   Norwegian
				Danish		     German		   Portuguese
				Dutch		     Galician		   Spanish
				English 	     Irish		   Swedish
				Faeroese	     Icelandic

       ISO 8859-2 (Latin 2)	For most Latin-written Slavic and Central European languages:

				Czech		     Polish		   Slovak
				German		     Rumanian		   Slovene
				Hungarian	     Croatian

       ISO 8859-3 (Latin 3)	Popularly used for Esperanto, Galician, Maltese, and Turkish.

       ISO 8859-4 (Latin 4)	Introduces letters for Estonian, Latvian, and Lithuanian.  It  is
				an incomplete predecessor of ISO 8859-10 (Latin 6).

       ISO 8859-9 (Latin 5)	Replaces the rarely needed Icelandic letters in ISO 8859-1 (Latin
				1) with the Turkish ones.

       ISO 8859-10 (Latin 6)	Adds the last Inuit (Greenlandic) and Sami (Lappish) letters that
				were not included in ISO 8859-4 (Latin 4) to complete coverage of
				the Nordic area.

SunOS 5.11				   18 Apr 1997				 iconv_unicode(5)


All times are GMT -4. The time now is 03:17 AM.

Unix & Linux Forums Content Copyrightę1993-2018. All Rights Reserved.
UNIX.COM Login
Username:
Password:  
Show Password