mskanji(5) [osx man page]

MSKANJI(5)						      BSD File Formats Manual							MSKANJI(5)

NAME
mskanji -- Shift-JIS (MS Kanji) encoding for Japanese text

SYNOPSIS
ENCODING "MSKanji"

DESCRIPTION
Shift-JIS, also known as MS Kanji or SJIS, is an encoding system for Japanese characters, developed by Microsoft Corporation. It encodes the characters from the JIS X 0201 (ASCII/JIS-Roman) and JIS X 0208 (Japanese) character sets as sequences of either one or two bytes.

Characters from the ASCII/JIS-Roman character set are encoded as single bytes between 0x00 and 0x7F (ASCII) or 0xA1 and 0xDF (half-width katakana).

Characters from the JIS X 0208 character set are encoded as two bytes. The first byte ranges from 0x81 - 0x9F, 0xE0 - 0xEA, 0xED - 0xEE (not JIS: NEC-selected IBM extended characters), 0xF0 - 0xF9 (not JIS: user defined), or 0xFA - 0xFC (not JIS: IBM extended characters). The second byte ranges from 0x40 - 0xFC, excluding 0x7F (delete).

SEE ALSO
euc(5), utf8(5)

BSD							 August 7, 2003							       BSD
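
EXAMPLES
The byte ranges given in the mskanji(5) DESCRIPTION can be sketched as a small classifier. This is a minimal illustration in Python, not the locale implementation itself. The names is_lead_byte, is_trail_byte and split_chars are hypothetical, and treating every byte outside the listed ranges as an error is an assumption made for the sketch.

```python
from typing import List


def is_lead_byte(b: int) -> bool:
    """True if b may start a two-byte sequence (ranges from DESCRIPTION)."""
    return (
        0x81 <= b <= 0x9F
        or 0xE0 <= b <= 0xEA
        or 0xED <= b <= 0xEE   # not JIS: NEC-selected IBM extended characters
        or 0xF0 <= b <= 0xF9   # not JIS: user defined
        or 0xFA <= b <= 0xFC   # not JIS: IBM extended characters
    )


def is_trail_byte(b: int) -> bool:
    """True if b may be the second byte: 0x40 - 0xFC, excluding 0x7F."""
    return 0x40 <= b <= 0xFC and b != 0x7F


def split_chars(data: bytes) -> List[bytes]:
    """Split a byte string into one-byte and two-byte character units."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b <= 0x7F or 0xA1 <= b <= 0xDF:
            # ASCII/JIS-Roman or half-width katakana: a single byte.
            units.append(data[i:i + 1])
            i += 1
        elif is_lead_byte(b):
            if i + 1 >= len(data) or not is_trail_byte(data[i + 1]):
                raise ValueError("bad or missing second byte at offset %d" % i)
            units.append(data[i:i + 2])
            i += 2
        else:
            # Assumption: bytes not listed in DESCRIPTION are rejected.
            raise ValueError("byte 0x%02X at offset %d is not assigned" % (b, i))
    return units
```

For instance, split_chars(b"A\xb1\x93\xfa") separates the input into an ASCII byte, a half-width katakana byte, and one two-byte unit.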


PCK(5)							Standards, Environments, and Macros						    PCK(5)

NAME
PCK, pck - map between PCK and character set

DESCRIPTION
PCK stands for PC Kanji code. It is also known as Shift-JIS (MS Kanji) code. This manual page shows the mapping between PCK and the character sets it encodes.

PCK assigns the first byte of each kanji code (corresponding to JIS X 0208-1990 and the like) to values outside the ranges 0x00 - 0x7f and 0xa1 - 0xdf.

The following is a map table for PCK and character set.

+------------------------------------------------------------------+
|PCK               character set                                   |
|0x00 - 0x1f       JIS X 0201-1976 function character set          |
|0x20              JIS X 0201-1976 space character                 |
|0x21 - 0x7e       JIS X 0201-1976 figure character set for roman  |
|                  character                                       |
|0x7f              JIS X 0201-1976 erase character                 |
|0xa1 - 0xdf       JIS X 0201-1976 figure character set for        |
|                  katakana (except for an area of undefined       |
|                  character E/0 - F/14)                           |
|0x8140 - 0x817e   JIS X 0208-1990 (1 ku 1 ten - 1 ku 63 ten)      |
|0x8180 - 0x819e   JIS X 0208-1990 (1 ku 64 ten - 1 ku 94 ten)     |
|0x819f - 0x81fc   JIS X 0208-1990 (2 ku 1 ten - 2 ku 94 ten)      |
|      :                 :                                         |
|      :                 :                                         |
|0x9f9f - 0x9ffc   JIS X 0208-1990 (62 ku 1 ten - 62 ku 94 ten)    |
|0xe040 - 0xe07e   JIS X 0208-1990 (63 ku 1 ten - 63 ku 63 ten)    |
|      :                 :                                         |
|      :                 :                                         |
|0xea40 - 0xea7e   JIS X 0208-1990 (83 ku 1 ten - 83 ku 63 ten)    |
|0xea80 - 0xea9e   JIS X 0208-1990 (83 ku 64 ten - 83 ku 94 ten)   |
|0xea9f - 0xeafc   JIS X 0208-1990 (84 ku 1 ten - 84 ku 94 ten)    |
+------------------------------------------------------------------+

The differences from the character set of Japanese EUC are:

o  The character set of Japanese EUC codeset 3 (JIS X 0212-1990) is not assigned, except for a part of the Vendor Defined Characters (IBM extension characters) (see NOTES).

o  The character set which corresponds to the Japanese EUC C1 control codes is not assigned.

NOTES
The PCK areas below have special meanings.

+------------------------------------------------------------------------+
|PCK               area of character set   meaning                       |
|0x8740 - 0x879e   JIS X 0208-1990 13 ku   Vendor Defined Character      |
|                                          (special symbols)             |
|0xeb40 - 0xecfc                           Invalid                       |
|0xed40 - 0xeffc                           Vendor Defined Character      |
|                                          (IBM extension character NEC  |
|                                          selected)                     |
|0xf040 - 0xf9fc                           User Defined Character        |
|                                          1 ku - 20 ku                  |
|0xfa40 - 0xfcfc                           Vendor Defined Character      |
|                                          (IBM extension character)     |
+------------------------------------------------------------------------+

SEE ALSO
eucJP(5)

SunOS 5.10							  6 Jan 1999							    PCK(5)
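
EXAMPLES
The regular layout of the JIS X 0208-1990 rows in the PCK(5) map table suggests a simple arithmetic conversion from a two-byte PCK code to its ku and ten numbers. The Python sketch below is one way to write it, under stated assumptions: pck_to_kuten is a hypothetical name, only first bytes 0x81 - 0x9f and 0xe0 - 0xea are handled, and all other codes are rejected. Codes in the 13 ku vendor-defined area fall inside that range and are converted like any other row.

```python
def pck_to_kuten(code: int) -> tuple:
    """Convert a two-byte PCK code such as 0x8140 to a (ku, ten) pair."""
    first, second = code >> 8, code & 0xFF

    # Each first byte covers two consecutive ku; number the pairs from 0.
    if 0x81 <= first <= 0x9F:
        pair = first - 0x81          # 0x81 -> ku 1-2, ..., 0x9f -> ku 61-62
    elif 0xE0 <= first <= 0xEA:
        pair = first - 0xC1          # 0xe0 -> ku 63-64, ..., 0xea -> ku 83-84
    else:
        raise ValueError("first byte 0x%02X is outside the handled range" % first)

    if 0x40 <= second <= 0x7E:       # odd ku, ten 1 - 63
        return 2 * pair + 1, second - 0x3F
    if 0x80 <= second <= 0x9E:       # odd ku, ten 64 - 94 (0x7f is skipped)
        return 2 * pair + 1, second - 0x40
    if 0x9F <= second <= 0xFC:       # even ku, ten 1 - 94
        return 2 * pair + 2, second - 0x9E
    raise ValueError("second byte 0x%02X is not valid" % second)
```

Checked against the table, pck_to_kuten(0x819f) gives 2 ku 1 ten and pck_to_kuten(0xeafc) gives 84 ku 94 ten.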