Linux and UNIX Man Pages

sbig5(5) [osf1 man page]

sbig5(5)							File Formats Manual							  sbig5(5)

NAME
sbig5 - A character encoding system (codeset) for Traditional Chinese

DESCRIPTION
The Shift Big-5 (sbig5) codeset is a variant of the Big-5 codeset (see big5(5)). The only difference between these codesets is that the second byte of some Big-5 characters is mapped to a different value in the Shift Big-5 codeset. The remapping avoids having metacharacters such as \ and |, which have special meaning to UNIX commands, in the second byte of a 2-byte Big-5 character.

The mappings of Big-5 characters to Shift Big-5 characters are as follows:

------------------------------------------------------------------------
Big-5         Character     Shift Big-5     Character
(2nd byte)    Symbol        (2nd byte)      Symbol
------------------------------------------------------------------------
40            @             30              0
5B            [             31              1
5C            \             32              2
5D            ]             33              3
5E            ^             34              4
5F            _             35              5
60            `             36              6
7B            {             37              7
7C            |             38              8
7D            }             39              9
7E            ~             9F              nil
------------------------------------------------------------------------

The sbig5 codeset is not supported by a locale but only through codeset conversion.

Codeset Conversion

The following codeset converter pairs are available for converting Traditional Chinese characters between sbig5 and other encoding formats. Refer to iconv_intro(5) for an introduction to codeset conversion. For more information about the other codeset for which sbig5 is the input or output, see the reference page specified in the list item.

big5_sbig5, sbig5_big5
Converting from and to the Big-5 codeset: big5(5).

Note that Big-5 encoding is equivalent to the Microsoft code-page format used on PCs for Traditional Chinese. Therefore, you can use these converters to convert Traditional Chinese between PC code-page format and Shift Big-5 encoding.

eucTW_sbig5, sbig5_eucTW
Converting from and to Taiwanese Extended UNIX Code: eucTW(5).

SEE ALSO
Commands: locale(1)

Others: ascii(5), big5(5), Chinese(5), code_page(5), dechanyu(5), dechanzi(5), eucTW(5), GBK(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), l10n_intro(5), telecode(5)
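
The second-byte remapping described in the sbig5(5) DESCRIPTION can be illustrated with a short Python sketch. This is not the big5_sbig5 or sbig5_big5 converter itself; the function names are hypothetical, and the sketch assumes that any byte of 0x80 or above starts a 2-byte character, which the reference page does not state.

```python
# Second-byte remapping taken from the table in sbig5(5): Big-5 -> Shift Big-5.
BIG5_TO_SBIG5 = {
    0x40: 0x30, 0x5B: 0x31, 0x5C: 0x32, 0x5D: 0x33, 0x5E: 0x34,
    0x5F: 0x35, 0x60: 0x36, 0x7B: 0x37, 0x7C: 0x38, 0x7D: 0x39,
    0x7E: 0x9F,
}
# The table is one-to-one, so the reverse direction is its inverse.
SBIG5_TO_BIG5 = {new: old for old, new in BIG5_TO_SBIG5.items()}


def _remap_second_bytes(data: bytes, table: dict) -> bytes:
    """Copy data, remapping only the second byte of each 2-byte character."""
    out = bytearray()
    i = 0
    while i < len(data):
        lead = data[i]
        # Assumption: a byte >= 0x80 is the first byte of a 2-byte character.
        if lead >= 0x80 and i + 1 < len(data):
            second = data[i + 1]
            out.append(lead)
            out.append(table.get(second, second))  # unmapped bytes pass through
            i += 2
        else:
            out.append(lead)  # single-byte (ASCII) character, left unchanged
            i += 1
    return bytes(out)


def big5_to_sbig5(data: bytes) -> bytes:
    """Hypothetical helper: Big-5 bytes to Shift Big-5 bytes."""
    return _remap_second_bytes(data, BIG5_TO_SBIG5)


def sbig5_to_big5(data: bytes) -> bytes:
    """Hypothetical helper: Shift Big-5 bytes back to Big-5 bytes."""
    return _remap_second_bytes(data, SBIG5_TO_BIG5)
```

For example, a Big-5 character whose second byte is 5C (the backslash value) comes out with second byte 32, while a real single-byte backslash in the same stream is left alone.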

GBK(5)								File Formats Manual							    GBK(5)

NAME
GBK, gbk - A character encoding system (codeset) for Simplified Chinese

DESCRIPTION
The GBK character set is an extension to the GB 2312-80 character set. (The "K" in "GBK" is the first sound in the Chinese word "Kuo Zhan," which means "extension.") GBK includes all the Hanzi characters specified by the ISO 10646-1:1993 standard (characters also known as the GB 13000.1.93 character set) that are not included in GB 2312-80. GBK is therefore defined as a normative annex of GB13000.1-93.

GBK Value Ranges and Code Points

The GBK codeset is divided into five levels, as follows:

------------------------------------------------------------
Level    Encoding Range     Code Points    Characters
------------------------------------------------------------
GBK/1    0xA1A1-0xA9FE              846           717
GBK/2    0xB0A1-0xF7FE            6,768         6,763
GBK/3    0x8140-0xA0FE            6,080         6,080
GBK/4    0xAA40-0xFEA0            8,160         8,160
GBK/5    0xA840-0xA9A0              192           166
------------------------------------------------------------

In addition, GBK includes code points for user-defined characters, as follows:

-----------------------------
Encoding Range    Code Points
-----------------------------
0xAAA1-0xAFFE             564
0xF8A1-0xFEFE             658
0xA140-0xA7A0             672
-----------------------------

GBK therefore provides a total of 23,940 code points, 21,886 of which are assigned.

Each row in the GBK code table consists of 190 characters. ASCII characters, which are single-byte characters, are defined in the range 0x21-0x7E. Encoding ranges for two-byte characters are as follows:

Encoding range for the first byte: 0x81-0xFE

Encoding ranges for the second byte: 0x40-0x7E and 0x80-0xFE

Note

In terms of character-to-code allocation, the sub-range for GB 2312-80 characters (0xA1A1-0xFEFE) in GBK is the same encoding range defined for these characters in Extended UNIX Code (EUC). GBK is therefore backward compatible with Chinese EUC encoding as well as forward compatible with the encoding as defined by ISO 10646-1:1993.

GBK is the standard character set and encoding used in the Simplified Chinese version of Windows 95.
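
The code-point counts above follow from the byte ranges, and the arithmetic can be checked with a short Python sketch. The helper names are hypothetical. GBK/4 is taken here as 0xAA40-0xFEA0, the range that is consistent with its 8,160 code points (85 rows of 96 second-byte values).

```python
def second_bytes(lo: int, hi: int) -> int:
    """Count valid GBK second bytes in lo..hi (0x40-0x7E and 0x80-0xFE)."""
    return sum(1 for b in range(lo, hi + 1) if 0x40 <= b <= 0xFE and b != 0x7F)


def code_points(start: int, end: int) -> int:
    """Code points in a rectangular range: rows times second bytes per row."""
    rows = (end >> 8) - (start >> 8) + 1
    return rows * second_bytes(start & 0xFF, end & 0xFF)


# Level ranges from the first table.
LEVELS = {
    "GBK/1": (0xA1A1, 0xA9FE),
    "GBK/2": (0xB0A1, 0xF7FE),
    "GBK/3": (0x8140, 0xA0FE),
    "GBK/4": (0xAA40, 0xFEA0),
    "GBK/5": (0xA840, 0xA9A0),
}
# User-defined ranges from the second table.
USER_DEFINED = [(0xAAA1, 0xAFFE), (0xF8A1, 0xFEFE), (0xA140, 0xA7A0)]

LEVEL_COUNTS = {name: code_points(*rng) for name, rng in LEVELS.items()}
USER_COUNTS = [code_points(*rng) for rng in USER_DEFINED]
TOTAL = sum(LEVEL_COUNTS.values()) + sum(USER_COUNTS)


def is_gbk_pair(first: int, second: int) -> bool:
    """True if (first, second) lies in the two-byte encoding ranges."""
    return 0x81 <= first <= 0xFE and 0x40 <= second <= 0xFE and second != 0x7F


if __name__ == "__main__":
    for name, count in LEVEL_COUNTS.items():
        print(name, count)
    print("user-defined", USER_COUNTS)
    print("total", TOTAL)
```

The 190 characters per row is the same calculation applied to a full row: 63 second-byte values in 0x40-0x7E plus 127 in 0x80-0xFE.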
Codeset Converters for GBK

The following codeset converters are available for GBK:

GBK_UCS-2
GBK_UCS-4
GBK_UTF-8
UCS-2_GBK
UCS-4_GBK
UTF-8_GBK

See iconv_intro(5) for more information about codeset converters and Unicode(5) for information about the UCS-2, UCS-4, and UTF-8 encoding formats.

Fonts for GBK

The operating system provides the following TrueType fonts for GBK:

-huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-fangsong-medium-r-normal--0-0-0-0-m-0-iso8859-1

-huatian-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-heiti-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-heiti-medium-r-normal--0-0-0-0-m-0-iso8859-1

-huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-kaiti-medium-r-normal--0-0-0-0-m-0-iso8859-1

-huatian-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-songti-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-songti-medium-r-normal--0-0-0-0-m-0-iso8859-1

These fonts can be used for printing only with Chinese text printers. The SongTi fonts are the default screen fonts for the GBK codeset.

SEE ALSO
Commands: locale(1)

Others: ascii(5), big5(5), Chinese(5), dechanyu(5), dechanzi(5), eucTW(5), i18n_intro(5), i18n_printing(5), l10n_intro(5), sbig5(5), telecode(5)
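
As a rough illustration of what a GBK-to-UTF-8 conversion does, the following Python sketch uses Python's built-in gbk codec as a stand-in. It is not the GBK_UTF-8 converter named in this reference page, and the function name is hypothetical.

```python
def gbk_to_utf8(data: bytes) -> bytes:
    """Decode GBK bytes to text, then re-encode the text as UTF-8."""
    return data.decode("gbk").encode("utf-8")


if __name__ == "__main__":
    gbk_bytes = b"\xc4\xe3\xba\xc3"  # two 2-byte GBK characters
    print(gbk_to_utf8(gbk_bytes).hex())
```

Each 2-byte GBK character in the sample becomes a 3-byte UTF-8 sequence, while ASCII bytes pass through unchanged.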