euc(5) [osx man page]

EUC(5)							      BSD File Formats Manual							    EUC(5)

NAME

     euc -- EUC encoding of wide characters

SYNOPSIS

     ENCODING "EUC"

     VARIABLE len1 mask1 len2 mask2 len3 mask3 len4 mask4 mask

DESCRIPTION

     EUC implements a system of 4 multibyte codesets.  A multibyte character in the first codeset consists of len1 bytes starting with a byte in
     the range of 0x00 to 0x7f.  To allow use of ASCII, len1 is always 1.  A multibyte character in the second codeset consists of len2 bytes
     starting with a byte in the range of 0x80-0xff excluding 0x8e and 0x8f.  A multibyte character in the third codeset consists of len3 bytes
     starting with the byte 0x8e.  A multibyte character in the fourth codeset consists of len4 bytes starting with the byte 0x8f.
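
     The lead-byte rules above can be sketched as a small classifier.  This is an illustrative Python sketch, not code from any locale
     implementation; the function name euc_codeset is hypothetical, and codesets are numbered 1 to 4 as in the paragraph above.

```python
def euc_codeset(lead_byte: int) -> int:
    """Return the codeset number (1-4) selected by the first byte of a character."""
    if not 0x00 <= lead_byte <= 0xFF:
        raise ValueError("lead byte must be in the range 0x00..0xff")
    if lead_byte <= 0x7F:
        return 1          # ASCII range, always a single byte
    if lead_byte == 0x8E:
        return 3          # third codeset is introduced by 0x8e
    if lead_byte == 0x8F:
        return 4          # fourth codeset is introduced by 0x8f
    return 2              # any other byte in 0x80..0xff


if __name__ == "__main__":
    for b in (0x41, 0xA4, 0x8E, 0x8F):
        print(hex(b), "-> codeset", euc_codeset(b))
```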

     The wchar_t encoding of EUC multibyte characters is dependent on the len and mask arguments.  First, the bytes are moved into a wchar_t as
     follows:

     byte0 << ((lenN-1) * 8) | byte1 << ((lenN-2) * 8) | ... | byte(lenN-1)

     The result is then ANDed with ~mask and ORed with maskN.  Codesets 3 and 4 are special in that the leading byte (0x8e or 0x8f,
     respectively) is first removed and the lenN argument is reduced by 1.
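
     The packing and masking steps can be followed in a short Python sketch.  It is a reading of the description above, not the system's
     implementation: the name euc_to_wchar and the EUCJP_* constants are hypothetical, and the len and mask values are assumed to be those of
     a VARIABLE line reading "1 0x0000 2 0x8080 2 0x0080 3 0x8000 0x8080".

```python
# Assumed len/mask values, taken from a VARIABLE line of the form
#   1 0x0000 2 0x8080 2 0x0080 3 0x8000 0x8080
EUCJP_LENS = (1, 2, 2, 3)
EUCJP_MASKS = (0x0000, 0x8080, 0x0080, 0x8000)
EUCJP_GLOBAL_MASK = 0x8080


def euc_to_wchar(mb: bytes, lens=EUCJP_LENS, masks=EUCJP_MASKS,
                 global_mask=EUCJP_GLOBAL_MASK) -> int:
    """Convert one complete EUC multibyte character to its wide value."""
    if not mb:
        raise ValueError("empty multibyte sequence")
    lead = mb[0]
    # Pick the zero-based index of the codeset from the lead byte.
    if lead <= 0x7F:
        n = 0
    elif lead == 0x8E:
        n = 2
    elif lead == 0x8F:
        n = 3
    else:
        n = 1
    if len(mb) != lens[n]:
        raise ValueError(f"expected {lens[n]} bytes, got {len(mb)}")
    # For the 0x8e and 0x8f codesets the lead byte is dropped before packing.
    payload = mb[1:] if n >= 2 else mb
    value = 0
    for b in payload:
        value = (value << 8) | b       # byte0 ends up in the high position
    # Clear the global mask bits, then set this codeset's own mask bits.
    return (value & ~global_mask) | masks[n]


if __name__ == "__main__":
    for seq in (b"A", b"\xa4\xa2", b"\x8e\xb1", b"\x8f\xb0\xa1"):
        print(seq.hex(), "->", hex(euc_to_wchar(seq)))
```

     Under these assumptions the two-byte sequence a4 a2 maps to 0xa4a2, the sequence 8e b1 maps to 0x00b1, and 8f b0 a1 maps to 0xb021.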

     For example, the ja_JP.eucJP locale has the following VARIABLE line:

     VARIABLE	     1 0x0000 2 0x8080 2 0x0080 3 0x8000 0x8080

     Codeset 1 consists of the values 0x0000 - 0x007f.

     Codeset 2 consists of the values that have both bits of 0x8080 set.

     Codeset 3 consists of the values 0x0080 - 0x00ff.

     Codeset 4 consists of the values 0x8000 - 0xff7f excluding the values which have the 0x0080 bit set.

     Notice that the global mask is set to 0x8080; this implies that the codeset can be determined from those 2 bits.
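
     As an illustration of that last point, the sketch below recovers the codeset of a wide value by comparing its global mask bits with
     each maskN from the ja_JP.eucJP VARIABLE line.  The function name codeset_of_wchar is hypothetical, and the sketch assumes, as in this
     example, that the four maskN values are distinct under the global mask.

```python
MASKS = (0x0000, 0x8080, 0x0080, 0x8000)   # mask1..mask4 from the VARIABLE line
GLOBAL_MASK = 0x8080                        # final value on the VARIABLE line


def codeset_of_wchar(wc: int, masks=MASKS, global_mask=GLOBAL_MASK) -> int:
    """Return the codeset number (1-4) of a wide character value."""
    tag = wc & global_mask                  # keep only the two selector bits
    for number, mask in enumerate(masks, start=1):
        if tag == mask:
            return number
    raise ValueError(f"no codeset matches {wc:#06x}")


if __name__ == "__main__":
    for wc in (0x0041, 0xA4A2, 0x00B1, 0xB021):
        print(f"{wc:#06x} -> codeset {codeset_of_wchar(wc)}")
```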

SEE ALSO

     mklocale(1), setlocale(3)

BSD
								 November 8, 2003							       BSD

Check Out this Related Man Page

eucKR(5)							File Formats Manual							  eucKR(5)

NAME

       eucKR - A character encoding system (codeset) for Korean

DESCRIPTION

       EUC (Extended UNIX Code) is a codeset extended by AT&T Bell Laboratories for use in various countries in the world. Korean EUC is the EUC
       codeset for representing Korean data. The encoding of Korean EUC (eucKR) is currently identical to that of the DEC Korean codeset (see
       deckorean(5)).

   Codeset Conversion
       The following codeset converter pairs are available for converting Korean characters between eucKR and other encoding formats.  Refer to
       iconv_intro(5) for an introduction to codeset conversion. For more information about the other codeset for which eucKR is the input or
       output, see the reference page specified in the list item.

       deckorean_eucKR, eucKR_deckorean
              Converting from and to the DEC Korean codeset: deckorean(5).

              Because DEC Korean is currently identical to Korean Extended UNIX Code, this converter is not useful.

       ISO-2022-KR_eucKR, eucKR_ISO-2022-KR
              Converting from and to the ISO 2022 Korean codeset: iso2022(5).
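
       For readers without access to these converters, the same kind of round trip between Korean EUC and ISO 2022 Korean can be tried with
       Python's built-in euc_kr and iso2022_kr codecs.  This is only an illustration of the conversion; it does not use the converter pairs
       listed above, and the helper names are hypothetical.

```python
def euckr_to_iso2022kr(data: bytes) -> bytes:
    """Re-encode EUC-KR bytes as ISO-2022-KR using Python's codecs."""
    return data.decode("euc_kr").encode("iso2022_kr")


def iso2022kr_to_euckr(data: bytes) -> bytes:
    """Re-encode ISO-2022-KR bytes as EUC-KR using Python's codecs."""
    return data.decode("iso2022_kr").encode("euc_kr")


if __name__ == "__main__":
    euc = "한글".encode("euc_kr")          # two Hangul syllables, two bytes each
    iso = euckr_to_iso2022kr(euc)
    print(euc.hex(), "->", iso)
    print("round trip ok:", iso2022kr_to_euckr(iso) == euc)
```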

   Font Support for Korean EUC
       For both display devices and printers, the operating system supports the Korean EUC codeset through DEC Korean fonts. See deckorean(5) for
       information about Korean bitmap and PostScript fonts.

SEE ALSO

       Commands: locale(1)

       Others: ascii(5), deckorean(5), iconv_intro(5), i18n_intro(5), i18n_printing(5), l10n_intro(5), Korean(5)

																	  eucKR(5)
