telecode(5) [osf1 man page]

telecode(5)							File Formats Manual						       telecode(5)

NAME
       telecode - A character encoding system (codeset) for Traditional Chinese

DESCRIPTION
       The Telecode codeset (called Mitac Telex in early versions of the
       operating system) consists of 2 character planes. Each character plane
       has 8836 character positions. In plane 1, standard characters occupy
       positions 0001 to 8045; the remaining 791 positions are for
       user-defined characters. In plane 2, standard characters occupy
       positions 0001 to 8489; the remaining 346 positions are for
       user-defined characters. Telecode uses 2-byte values to represent
       characters on both planes.

   Plane 1 Character Encoding
       To differentiate plane 1 code from plane 2 code, the most significant
       bit (MSB) is set on in both bytes of a plane 1 character code. The
       following formula calculates the value of a plane 1 character from its
       position on the plane:

            1st byte = M + 161
            2nd byte = N + 161 - M x 94

       In this formula, N is the position of the character and M = N / 94,
       using integer division (the remainder is discarded). For example, if a
       character is at position 2502 on plane 1, its encoding value is BBDB
       (hex), which is calculated as follows:

            N = 2502, M = 2502/94 = 26
            1st byte = 26 + 161 = 187 (BB hex)
            2nd byte = 2502 + 161 - 26 x 94 = 219 (DB hex)

   Plane 2 Character Encoding
       To differentiate plane 2 code from plane 1 code, the MSB of the first
       byte is set on and that of the second byte is set off for each plane 2
       character code. The following formula calculates the value of a plane 2
       character from its position:

            1st byte = M + 161
            2nd byte = N + 33 - M x 94

       In this formula, N is the position of the character on the plane and
       M = N / 94, using integer division. For example, if a character is at
       position 2502 on plane 2, its encoding value is BB5B (hex), which is
       calculated as follows:

            N = 2502, M = 2502/94 = 26
            1st byte = 26 + 161 = 187 (BB hex)
            2nd byte = 2502 + 33 - 26 x 94 = 91 (5B hex)

   Codeset Conversion
       The following codeset converter pairs are available for converting
       Traditional Chinese characters between telecode and other encoding
       formats. Refer to iconv_intro(5) for an introduction to codeset
       conversion. For more information about the other codeset for which
       telecode is the input or output, see the reference page specified in
       the list item.

       big5_telecode, telecode_big5
              Converts from and to the Big-5 codeset. See big5(5).

              Note that Big-5 encoding is equivalent to the Microsoft
              code-page format used on PCs for Traditional Chinese. You can
              therefore use these converters to convert Traditional Chinese
              characters between PC code-page format and Telecode encoding
              format. For more information on how the operating system
              supports PC code pages, see code_page(5).

       dechanyu_telecode, telecode_dechanyu
              Converts from and to the DEC Hanyu codeset. See dechanyu(5).

       eucTW_telecode, telecode_eucTW
              Converts from and to Taiwanese Extended UNIX Code. See
              eucTW(5).

   Font Support for Telecode
       The operating system supports Telecode only through conversion to
       another codeset.

SEE ALSO
       Commands: locale(1)

       Others: ascii(5), big5(5), Chinese(5), code_page(5), dechanyu(5),
       dechanzi(5), eucTW(5), GBK(5), i18n_intro(5), i18n_printing(5),
       iconv_intro(5), l10n_intro(5), sbig5(5)
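
EXAMPLES
       The plane 1 and plane 2 formulas in the DESCRIPTION section can be
       sketched in Python as shown below. This is an illustration only: the
       names telecode_encode and telecode_decode are hypothetical and are not
       provided by the operating system, and the accepted position range
       (0 to 8835) is an assumption, because the formulas themselves do not
       state bounds.

```python
from typing import Tuple

POSITIONS_PER_ROW = 94   # divisor used in the formula M = N / 94
PLANE_SIZE = 8836        # character positions per plane


def telecode_encode(position: int, plane: int) -> bytes:
    """Return the 2-byte Telecode value for a position on plane 1 or 2."""
    if plane not in (1, 2):
        raise ValueError("plane must be 1 or 2")
    # Assumption: the bounds are not stated alongside the formulas;
    # 0..8835 keeps the first byte at or below FE hex.
    if not 0 <= position < PLANE_SIZE:
        raise ValueError("position out of range")
    m, remainder = divmod(position, POSITIONS_PER_ROW)  # M = N / 94
    first = m + 161                                     # MSB is always on
    # Plane 1: N + 161 - M x 94 (MSB on). Plane 2: N + 33 - M x 94 (MSB off).
    second = remainder + (161 if plane == 1 else 33)
    return bytes([first, second])


def telecode_decode(code: bytes) -> Tuple[int, int]:
    """Return (plane, position) for a 2-byte Telecode value."""
    if len(code) != 2 or not code[0] & 0x80:
        raise ValueError("expected 2 bytes with the MSB of the first byte set")
    first, second = code
    plane = 1 if second & 0x80 else 2   # MSB of the second byte selects the plane
    remainder = second - (161 if plane == 1 else 33)
    return plane, (first - 161) * POSITIONS_PER_ROW + remainder


if __name__ == "__main__":
    print(telecode_encode(2502, 1).hex().upper())  # BBDB
    print(telecode_encode(2502, 2).hex().upper())  # BB5B
```

       Both printed values match the worked examples for position 2502, and
       telecode_decode inverts the calculation by testing the MSB of the
       second byte to select the plane.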

dechanzi(5)							File Formats Manual						       dechanzi(5)

NAME
       dechanzi - A character encoding system (codeset) for Simplified Chinese

DESCRIPTION
       The DEC Hanzi (dechanzi) codeset consists of the following character
       sets:

            ASCII
            GB2312-80
            Extended GB

       DEC Hanzi uses a 2-byte data representation for symbols and
       ideographic characters that are defined in GB2312-80.

   ASCII Characters
       All ASCII characters are represented in the form of single-byte, 7-bit
       data in the DEC Hanzi codeset; that is, the most significant bit (MSB)
       of the byte that represents an ASCII character is always set off. For
       more information on ASCII characters, refer to ascii(5).

   GB2312-80 Characters
       The code table for GB2312-80 characters is divided into 94 rows (Qu),
       numbered from 1 to 94. Each row has 94 columns (Wei), also numbered
       from 1 to 94. The code table defines a total of 7445 characters, of
       which 6763 are Chinese characters. The characters in the code table
       are grouped as follows:

       Graphic symbols
              There are 682 graphic symbols, which occupy rows 1 to 9 in the
              code table.

       Frequently used (Level 1) characters
              There are 3755 frequently used characters, which occupy rows 16
              to 55 in the code table.

       Less frequently used (Level 2) characters
              There are 3008 less frequently used characters, which occupy
              rows 56 to 87 in the code table.

       To differentiate GB2312-80 character codes from ASCII and Extended GB
       character codes, the most significant bit (MSB) of both the first byte
       and the second byte is set on. The following formulas show how to
       calculate the value for a GB2312-80 character from its row and column
       numbers:

            1st byte = A0(hex) + Row number
            2nd byte = A0(hex) + Column number

       For example, if a GB2312-80 character is in the first column of the
       16th row, the character's value is B0A1, which is calculated as
       follows:

            1st byte = A0(hex) + 16 = B0(hex)
            2nd byte = A0(hex) + 01 = A1(hex)

   Extended GB Characters
       The Extended GB code table is similar to the GB2312-80 code table and
       is divided into 94 rows and 94 columns (8836 code points). However,
       the Extended GB code table provides code points for user-defined
       characters (UDC). The 8836 code points in this table are divided into
       two areas:

       User-defined area
              This area spans rows 1 to 87 and provides 8178 code points.

       User-defined (reserved) area
              This area spans rows 88 to 94 and provides 658 code points.
              This area is where users can define special and long-lasting
              user-defined characters.

       To differentiate Extended GB codes from ASCII codes and GB2312-80
       codes, the most significant bit (MSB) of the first byte is set on
       while that of the second byte is set off. The following formulas show
       how the code value of an Extended GB character is calculated from its
       row and column numbers:

            1st byte = A0(hex) + Row number
            2nd byte = 20(hex) + Column number

       For example, if a character is positioned at the first column of the
       16th row in the Extended GB code table, the character's value is B021,
       which is calculated as follows:

            1st byte = A0(hex) + 16 = B0(hex)
            2nd byte = 20(hex) + 01 = 21(hex)

   Codeset Conversion
       The following codeset converter pairs are available for converting
       Simplified Chinese characters between dechanzi and other encoding
       formats. Refer to iconv_intro(5) for an introduction to codeset
       conversion. For more information about the other codeset for which
       dechanzi is the input or output, see the reference page specified in
       the list item.

       big5_dechanzi, dechanzi_big5
              Converts from and to the Big-5 codeset. See big5(5).

       dechanyu_dechanzi, dechanzi_dechanyu
              Converts from and to the DEC Hanyu codeset. See dechanyu(5).

       eucTW_dechanzi, dechanzi_eucTW
              Converts from and to Taiwanese Extended UNIX Code. See
              eucTW(5).

       UCS-2_dechanzi, dechanzi_UCS-2
              Converts from and to UCS-2 format. See Unicode(5).

       UCS-4_dechanzi, dechanzi_UCS-4
              Converts from and to UCS-4 format. See Unicode(5).

       UTF-8_dechanzi, dechanzi_UTF-8
              Converts from and to UTF-8 format. See Unicode(5).

       DEC Hanzi encoding is identical to the Microsoft code-page format
       (cp936) used for Simplified Chinese characters on PC systems. However,
       DEC Hanzi supports fewer characters than the code page supports.
       Therefore, using converters with dechanzi in the converter name to
       convert between cp936 and other formats can result in some data loss.
       Refer to code_page(5) for more information about PC code pages.

   DEC Hanzi Fonts
       The operating system provides both screen and printer fonts for DEC
       Hanzi characters. The following bitmap fonts are grouped according to
       family and reflect various sizes and typefaces for 75dpi and 100dpi
       display devices:

       Fangsongti Family:
           -adecw-fangsongti-medium-r-normal--24-240-75-75-m-240-gb2312.1980-1
           -adecw-fangsongti-medium-r-normal--34-340-75-75-m-340-gb2312.1980-1
           -adecw-fangsongti-medium-r-normal--24-240-100-100-m-240-gb2312.1980-1
           -adecw-fangsongti-medium-r-normal--34-340-100-100-m-340-gb2312.1980-1

       Heiti Family:
           -adecw-heiti-medium-r-normal--16-160-75-75-m-160-gb2312.1980-1
           -adecw-heiti-medium-r-normal--24-240-75-75-m-240-gb2312.1980-1
           -adecw-heiti-medium-r-normal--34-340-75-75-m-340-gb2312.1980-1
           -adecw-heiti-medium-r-normal--16-160-100-100-m-160-gb2312.1980-1
           -adecw-heiti-medium-r-normal--24-240-100-100-m-240-gb2312.1980-1
           -adecw-heiti-medium-r-normal--34-340-100-100-m-340-gb2312.1980-1

       Kaiti Family:
           -adecw-kaiti-medium-r-normal--24-240-75-75-m-240-gb2312.1980-1
           -adecw-kaiti-medium-r-normal--34-340-75-75-m-340-gb2312.1980-1
           -adecw-kaiti-medium-r-normal--24-240-100-100-m-240-gb2312.1980-1
           -adecw-kaiti-medium-r-normal--34-340-100-100-m-340-gb2312.1980-1

       Screen Family:
           -adecw-screen-medium-r-normal--18-180-75-75-m-160-gb2312.1980-1
           -adecw-screen-medium-r-normal--24-240-75-75-m-240-gb2312.1980-1
           -adecw-screen-medium-r-normal--18-180-100-100-m-160-gb2312.1980-1
           -adecw-screen-medium-r-normal--24-240-100-100-m-240-gb2312.1980-1
           -adecw-screen-medium-r-normal--18-180-100-100-m-160-gb2312.1980-UDC
           -adecw-screen-medium-r-normal--24-240-100-100-m-240-gb2312.1980-UDC

       Songti Family:
           -adecw-songti-medium-r-normal--16-160-75-75-m-160-gb2312.1980-1
           -adecw-songti-medium-r-normal--24-240-75-75-m-240-gb2312.1980-1
           -adecw-songti-medium-r-normal--34-340-75-75-m-340-gb2312.1980-1
           -adecw-songti-medium-r-normal--16-160-100-100-m-160-gb2312.1980-1
           -adecw-songti-medium-r-normal--24-240-100-100-m-240-gb2312.1980-1
           -adecw-songti-medium-r-normal--34-340-100-100-m-340-gb2312.1980-1

       The operating system provides the following PostScript printer fonts
       for DEC Hanzi characters:

            Hei-GB2312-80
            XiSong-GB2312-80

       For general information on printing Asian language text, refer to
       i18n_printing(5).

SEE ALSO
       Commands: locale(1)

       Others: ascii(5), big5(5), Chinese(5), code_page(5), dechanyu(5),
       eucTW(5), GBK(5), i18n_intro(5), i18n_printing(5), iconv_intro(5),
       l10n_intro(5), sbig5(5), telecode(5), Unicode(5)
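
EXAMPLES
       The row and column formulas for GB2312-80 and Extended GB characters
       can be sketched in Python as shown below. This is an illustration
       only: the names dechanzi_encode and dechanzi_classify are hypothetical
       and are not provided by the operating system. The sketch handles
       2-byte values only; single-byte ASCII data is out of scope.

```python
from typing import Tuple


def dechanzi_encode(row: int, column: int, extended: bool = False) -> bytes:
    """Return the 2-byte DEC Hanzi value for a row (Qu) and column (Wei).

    extended=False applies the GB2312-80 formula (both MSBs on);
    extended=True applies the Extended GB formula (second MSB off).
    """
    if not (1 <= row <= 94 and 1 <= column <= 94):
        raise ValueError("row and column must be in the range 1 to 94")
    first = 0xA0 + row                               # 1st byte = A0(hex) + row
    second = (0x20 if extended else 0xA0) + column   # 20(hex) or A0(hex) + column
    return bytes([first, second])


def dechanzi_classify(code: bytes) -> Tuple[str, int, int]:
    """Return (character set, row, column) for a 2-byte DEC Hanzi value."""
    if len(code) != 2 or not code[0] & 0x80:
        raise ValueError("expected 2 bytes with the MSB of the first byte set")
    first, second = code
    if second & 0x80:                                # both MSBs on
        return "GB2312-80", first - 0xA0, second - 0xA0
    return "Extended GB", first - 0xA0, second - 0x20


if __name__ == "__main__":
    print(dechanzi_encode(16, 1).hex().upper())                 # B0A1
    print(dechanzi_encode(16, 1, extended=True).hex().upper())  # B021
```

       Both printed values match the worked examples for row 16, column 1.
       For the GB2312-80 case, the byte values are the same ones that
       Python's built-in gb2312 codec uses, so
       dechanzi_encode(16, 1).decode("gb2312") yields the first Level 1
       character.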