
eucTW(5)							File Formats Manual							  eucTW(5)

NAME
       eucTW - A character encoding system (codeset) for Traditional Chinese

DESCRIPTION
       The Taiwanese EUC (Extended UNIX Code), or eucTW, codeset consists of the following character sets:

              ASCII

              CNS 11643 (Plane 1 to Plane 16)

       Taiwanese EUC uses a combination of single-byte data and 2-byte data to represent ASCII characters, symbols, and ideographic characters.
       Because the codeset includes many character planes, Taiwanese EUC uses different leading codes to designate different character planes.

       ASCII characters are represented as single-byte 7-bit data in Taiwanese EUC; that is, the most significant bit (MSB) of a byte that
       represents an ASCII character is always 0. For more information, refer to ascii(5).
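
       As a minimal sketch of the MSB rule (written in Python; the function name is_ascii_byte is hypothetical and is not part of the
       operating system), a byte can be classified by testing its most significant bit:

```python
def is_ascii_byte(byte: int) -> bool:
    """Return True if the byte has its MSB clear, that is, it is 7-bit ASCII data."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    return (byte & 0x80) == 0


# 'A' (0x41) has the MSB clear; 0xA4 has the MSB set, so it cannot be ASCII.
print(is_ascii_byte(0x41), is_ascii_byte(0xA4))
```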

       Although the standard Taiwanese EUC codeset includes all characters defined by the CNS 11643-1992 standard, the operating system's eucTW
       implementation currently supports the following:

              Characters defined in the first and second planes of CNS 11643

              The EDPC Recommended Character Set (refer to dechanyu(5) for more information)

              CNS 11643-1986 and DTSCS characters that have been remapped into the third and fourth character planes by the CNS 11643-1992
              standard

       Characters that were added to CNS 11643-1986 by the CNS 11643-1992 standard are not supported.

       The characters that are defined in plane 1 and plane 2 of CNS 11643-1992 and that are the same as those defined in CNS  11643-1986  are	as
       follows:

       ------------------------------------------------------------------------
       Character Plane   Character Type                    Number of Characters
       ------------------------------------------------------------------------
       1                 Special characters                651
                         Control characters                33
                         Frequently-used characters        5401
       2                 Less frequently-used characters   7650
       ------------------------------------------------------------------------

       The characters defined in plane 3 and plane 4 of CNS 11643-1992 are as follows:

       ---------------------------------------------------------------------------
       Character Plane   Character Type                           Number of
                                                                  Characters
       ---------------------------------------------------------------------------
       3                 Rarely-used characters (EDPC Part I)     6148
       4                 Used for residency system, ISO 2nd       7298
                         edition DIS 10646 Han characters, 171
                         EDPC Part II Characters
       ---------------------------------------------------------------------------

       The  characters	that  have been remapped into the third and fourth character planes of CNS 11643-1992 as specified by the EDPC are as fol-
       lows:

       ---------------------------------------------------------
       EDPC Characters	 Character Plane   Number of Characters
       ---------------------------------------------------------
       Part I		 Plane 3	   6148
       Part II		 Plane 4	   171
       ---------------------------------------------------------

   Taiwanese EUC Encoding
       Except for characters in the first plane of CNS 11643-1986, Taiwanese EUC uses a leading code to identify the character plane to which a
       character belongs. The leading code consists of the 8-bit Single-Shift 2 (SS2) control character, 0x8E, followed by an additional byte
       that selects the plane.

       The position of a character on a plane is specified by two bytes. The first byte determines the character's row number and the second byte
       determines the character's column number. The MSB of both bytes is set to 1.
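
       The row and column encoding can be sketched as follows. This is an illustration only: the function names are hypothetical, and the
       1-94 numbering of rows and columns is an assumption inferred from the A1 - FE range that each byte can take.

```python
from typing import Tuple


def position_to_bytes(row: int, column: int) -> bytes:
    """Encode a row/column position (assumed 1-94 each) as two bytes with the MSB set."""
    for name, value in (("row", row), ("column", column)):
        if not 1 <= value <= 94:
            raise ValueError(f"{name} must be in the range 1-94, got {value}")
    # Adding 0xA0 sets the MSB and maps 1..94 onto 0xA1..0xFE.
    return bytes([0xA0 + row, 0xA0 + column])


def bytes_to_position(pair: bytes) -> Tuple[int, int]:
    """Recover (row, column) from a two-byte position code."""
    if len(pair) != 2 or not all(0xA1 <= b <= 0xFE for b in pair):
        raise ValueError("expected two bytes in the range 0xA1-0xFE")
    return pair[0] - 0xA0, pair[1] - 0xA0


# Row 1, column 1 is the lowest position code; row 94, column 94 is the highest.
print(position_to_bytes(1, 1).hex().upper(), position_to_bytes(94, 94).hex().upper())
```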

       The following table shows the encoding of Taiwanese EUC characters:

       -------------------------------------------------------
       CNS 11643-1986 Code Plane   Leading Code   Code Range
       -------------------------------------------------------
       1			   [nil]	  A1A1 - FEFE
       2			   SS2 A2	  A1A1 - FEFE
       3			   SS2 A3	  A1A1 - FEFE
       4			   SS2 A4	  A1A1 - FEFE
       5			   SS2 A5	  A1A1 - FEFE
       6			   SS2 A6	  A1A1 - FEFE
       7			   SS2 A7	  A1A1 - FEFE
       8			   SS2 A8	  A1A1 - FEFE
       9			   SS2 A9	  A1A1 - FEFE
       10			   SS2 AA	  A1A1 - FEFE
       11			   SS2 AB	  A1A1 - FEFE
       12			   SS2 AC	  A1A1 - FEFE
       13			   SS2 AD	  A1A1 - FEFE
       14			   SS2 AE	  A1A1 - FEFE
       15			   SS2 AF	  A1A1 - FEFE
       16			   SS2 B0	  A1A1 - FEFE
       -------------------------------------------------------
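
       The table can be read as a simple decoding procedure. The following Python sketch is not the operating system's converter; the
       function and variable names are hypothetical, and the 1-94 row and column numbering is an assumption inferred from the A1 - FE byte
       range. It accepts only the leading codes listed in the table (SS2 A2 through SS2 B0) and treats a 2-byte code without a leading code
       as a plane 1 character.

```python
SS2 = 0x8E  # 8-bit Single-Shift 2 control character


def _position(first: int, second: int):
    """Convert two position bytes (0xA1-0xFE) to an assumed 1-94 row and column."""
    if not (0xA1 <= first <= 0xFE and 0xA1 <= second <= 0xFE):
        raise ValueError(f"invalid position bytes {first:02X} {second:02X}")
    return first - 0xA0, second - 0xA0


def decode_euctw(data: bytes):
    """Split eucTW bytes into ('ascii', char) and ('cns', plane, row, column) items."""
    items = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            # MSB clear: single-byte ASCII.
            items.append(("ascii", chr(byte)))
            i += 1
        elif byte == SS2:
            # Leading code: SS2, then a plane byte, then the 2-byte position.
            if i + 4 > len(data):
                raise ValueError("truncated character after SS2")
            plane = data[i + 1] - 0xA0
            if not 2 <= plane <= 16:
                raise ValueError(f"unsupported plane byte {data[i + 1]:02X}")
            row, column = _position(data[i + 2], data[i + 3])
            items.append(("cns", plane, row, column))
            i += 4
        else:
            # No leading code: a 2-byte character in plane 1.
            if i + 2 > len(data):
                raise ValueError("truncated 2-byte character")
            row, column = _position(byte, data[i + 1])
            items.append(("cns", 1, row, column))
            i += 2
    return items


sample = b"A" + bytes([0xA4, 0xA1]) + bytes([SS2, 0xA2, 0xA1, 0xA1])
print(decode_euctw(sample))
```

       For the sample bytes, the sketch reports one ASCII character, one plane 1 character at row 4, column 1, and one plane 2 character
       at row 1, column 1.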

   Codeset Conversion
       The following codeset converter pairs are available for converting Traditional Chinese characters between eucTW and other encoding formats.
       Refer to iconv_intro(5) for an introduction to codeset conversion. For more information about the other codeset for which eucTW is the
       input or output, see the reference page specified in the list item.

       big5_eucTW, eucTW_big5
              Converting from and to the Big-5 codeset: big5(5).

              Note that Big-5 encoding is equivalent to the Microsoft code-page format used on PCs for Traditional Chinese. You can therefore use
              this set of converters to convert Traditional Chinese text between the eucTW and PC code-page formats. For information about how the
              operating system supports PC code pages, see code_page(5).

       dechanyu_eucTW, eucTW_dechanyu
              Converting from and to the DEC Hanyu codeset: dechanyu(5).

       dechanzi_eucTW, eucTW_dechanzi
              Converting from and to the DEC Hanzi codeset: dechanzi(5).

       sbig5_eucTW, eucTW_sbig5
              Converting from and to the Shift Big-5 codeset: sbig5(5).

       telecode_eucTW, eucTW_telecode
              Converting from and to the Telecode codeset: telecode(5).

       UCS-2_eucTW, eucTW_UCS-2
              Converting from and to UCS-2 format: Unicode(5).

       UCS-4_eucTW, eucTW_UCS-4
              Converting from and to UCS-4 format: Unicode(5).

       UTF-8_eucTW, eucTW_UTF-8
              Converting from and to UTF-8 format: Unicode(5).

   Fonts for Taiwanese EUC
       For both display devices and printers, the operating system supports Taiwanese EUC through internal conversion to DEC Hanyu code and use of
       DEC Hanyu fonts (see dechanyu(5)).

       For general information on printing non-English text, refer to i18n_printing(5).

SEE ALSO
       Commands: locale(1)

       Others:	ascii(5),  big5(5), Chinese(5), code_page(5), dechanzi(5), GBK(5), iconv_intro(5), i18n_intro(5), i18n_printing(5), l10n_intro(5),
       sbig5(5), telecode(5), Unicode(5)

																	  eucTW(5)