GB18030(5) BSD File Formats Manual GB18030(5)
NAME
gb18030 -- GB 18030 encoding method for Chinese text
SYNOPSIS
ENCODING "GB18030"
DESCRIPTION
The GB18030 encoding implements GB 18030-2000, a PRC national standard for the encoding of Chinese characters. It is a superset of the older
GB 2312-1980 and GBK encodings, and incorporates Unicode's Unihan Extension A completely. It also provides code space for all Unicode 3.0
code points.
Multibyte characters in the GB18030 encoding can be one byte, two bytes, or four bytes long. There are a total of over 1.5 million code
positions.
GB 11383-1981 (ASCII) characters are represented by single bytes in the range 0x00 to 0x7F.
Chinese characters are represented as either two bytes or four bytes. Characters that are represented by two bytes begin with a byte in the
range 0x81-0xFE and end with a byte either in the range 0x40-0x7E or 0x80-0xFE.
Characters that are represented by four bytes begin with a byte in the range 0x81-0xFE, have a second byte in the range 0x30-0x39, a third
byte in the range 0x81-0xFE and a fourth byte in the range 0x30-0x39.
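The byte ranges above are enough to determine where each character ends. The following Python fragment is a minimal sketch of that rule, not the system's implementation: the function names gb18030_char_length and split_chars are hypothetical, and treating any byte pattern not described above (for example a lone 0x80 or 0xFF) as an error is an assumption. It also tallies the code positions the ranges imply, which comes to a little over 1.6 million and agrees with the "over 1.5 million" figure.

```python
def gb18030_char_length(data: bytes, pos: int = 0) -> int:
    """Return the length (1, 2, or 4) of the character starting at data[pos].

    Raises ValueError for truncated input or for byte patterns outside the
    ranges given in the description (an assumption of this sketch).
    """
    b0 = data[pos]
    if b0 <= 0x7F:                      # single-byte (ASCII) range
        return 1
    if not 0x81 <= b0 <= 0xFE:          # 0x80 and 0xFF are not valid lead bytes here
        raise ValueError(f"invalid lead byte 0x{b0:02X} at offset {pos}")
    if pos + 1 >= len(data):
        raise ValueError(f"truncated character at offset {pos}")
    b1 = data[pos + 1]
    if 0x40 <= b1 <= 0x7E or 0x80 <= b1 <= 0xFE:
        return 2                        # two-byte character
    if 0x30 <= b1 <= 0x39:              # second byte is a digit: four-byte form
        if pos + 3 >= len(data):
            raise ValueError(f"truncated character at offset {pos}")
        b2, b3 = data[pos + 2], data[pos + 3]
        if 0x81 <= b2 <= 0xFE and 0x30 <= b3 <= 0x39:
            return 4
    raise ValueError(f"invalid byte sequence at offset {pos}")


def split_chars(data: bytes) -> list:
    """Split a GB18030 byte string into one bytes object per character."""
    chars = []
    pos = 0
    while pos < len(data):
        n = gb18030_char_length(data, pos)
        chars.append(data[pos:pos + n])
        pos += n
    return chars


# Code positions implied by the ranges in the description.
LEAD = 0xFE - 0x81 + 1                                # 126 lead bytes
ONE_BYTE = 0x7F - 0x00 + 1                            # 128
TWO_BYTE = LEAD * ((0x7E - 0x40 + 1) + (0xFE - 0x80 + 1))  # 126 * 190
FOUR_BYTE = LEAD * 10 * LEAD * 10                     # 126 * 10 * 126 * 10
TOTAL_POSITIONS = ONE_BYTE + TWO_BYTE + FOUR_BYTE
```

For instance, split_chars(b"A\xd6\xd0\x81\x30\x81\x30") separates the input into a one-byte, a two-byte, and a four-byte character. Because the second byte of a four-byte character is always an ASCII digit and never falls in a two-byte trail range, the length can be decided by looking at no more than the first two bytes before validating the rest.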
SEE ALSO
euc(5), gb2312(5), gbk(5), utf8(5)
Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange --
Extension for the basic set, March 2000.
The Unicode Standard, Version 3.0, The Unicode Consortium, 2000.
STANDARDS
The GB18030 encoding is believed to be compatible with GB 18030-2000.
BSD August 10, 2003 BSD