GBK(5) BSD File Formats Manual GBK(5)
NAME
gbk -- Guojia biaozhun kuozhan (GBK) encoding method for Chinese text
SYNOPSIS
ENCODING "GBK"
DESCRIPTION
GBK is a backwards-compatible extension of the GB 2312-1980 encoding method for Chinese text, which adds the characters defined in the
Unified Han portion of the Unicode 2.1 standard.
Multibyte characters in the GBK encoding can be one byte or two bytes long. GB 11383-1981 (ASCII) characters are represented by single bytes
in the range 0x00 to 0x7F. Chinese characters are represented by two bytes, beginning with a byte in the range 0x81-0xFE and ending with a
byte in the range 0x40-0x7E or 0x80-0xFE.
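The one-byte/two-byte structure can be illustrated with a short Python sketch. It is not part of the BSD implementation; the function name gbk_char_lengths is hypothetical, and the sketch assumes the lead-byte range 0x81-0xFE and the trail-byte ranges 0x40-0x7E and 0x80-0xFE used by common GBK code tables. It only checks byte ranges and does not verify that a code position is actually assigned.

```python
def gbk_char_lengths(data: bytes) -> list:
    """Return the byte length (1 or 2) of each character in GBK-encoded data."""
    lengths = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # Single-byte (ASCII) character.
            lengths.append(1)
            i += 1
            continue
        if not 0x81 <= lead <= 0xFE:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
        if i + 1 >= len(data):
            raise ValueError(f"truncated two-byte character at offset {i}")
        trail = data[i + 1]
        # Assumed trail-byte ranges: 0x40-0x7E or 0x80-0xFE (0x7F excluded).
        if not (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE):
            raise ValueError(f"invalid trail byte 0x{trail:02X} at offset {i + 1}")
        lengths.append(2)
        i += 2
    return lengths


sample = "A中文".encode("gbk")
print(sample.hex(" "))           # 41 d6 d0 ce c4
print(gbk_char_lengths(sample))  # [1, 2, 2]
```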
SEE ALSO
euc(5), gb18030(5), gb2312(5), utf8(5)
The Unicode Standard, Version 2.1, The Unicode Consortium, 1999.
Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange --
Extension for the basic set, March 2000.
STANDARDS
GBK is not a formal standard, and it has been superseded by GB 18030-2000.
BSD August 10, 2003 BSD
GB18030(5) BSD File Formats Manual GB18030(5)
NAME
gb18030 -- GB 18030 encoding method for Chinese text
SYNOPSIS
ENCODING "GB18030"
DESCRIPTION
The GB18030 encoding implements GB 18030-2000, a PRC national standard for the encoding of Chinese characters. It is a superset of the older
GB 2312-1980 and GBK encodings, and incorporates Unicode's Unihan Extension A completely. It also provides code space for all Unicode 3.0
code points.
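The superset relationship can be spot-checked with the gb2312, gbk, and gb18030 codecs bundled with Python, which stand in here for the BSD locale encodings (an assumption; they are separate implementations). The helper name encodes_in and the sample characters are chosen purely for illustration, and the check covers only those samples, not the whole repertoire.

```python
def encodes_in(text: str, codec: str) -> bool:
    """Return True if every character of text is encodable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Characters already present in GB 2312-1980 keep the same bytes in all three.
common = "中文"
print(common.encode("gb2312") == common.encode("gbk") == common.encode("gb18030"))  # True

# A traditional-form character that GBK added on top of GB 2312-1980.
extra = "龍"
print(encodes_in(extra, "gb2312"))                       # False
print(extra.encode("gbk") == extra.encode("gb18030"))    # True
```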
Multibyte characters in the GB18030 encoding can be one byte, two bytes, or four bytes long. There are a total of over 1.5 million code
positions.
GB 11383-1981 (ASCII) characters are represented by single bytes in the range 0x00 to 0x7F.
Chinese characters are represented as either two bytes or four bytes. Characters that are represented by two bytes begin with a byte in the
range 0x81-0xFE and end with a byte either in the range 0x40-0x7E or 0x80-0xFE.
Characters that are represented by four bytes begin with a byte in the range 0x81-0xFE, have a second byte in the range 0x30-0x39, a third
byte in the range 0x81-0xFE and a fourth byte in the range 0x30-0x39.
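The byte ranges above are enough to split a GB18030 byte string into characters and to count the available code positions. The following Python sketch is illustrative only; the names gb18030_char_lengths and code_positions are hypothetical, and the sketch checks byte ranges without verifying that a position is assigned to a character.

```python
def gb18030_char_lengths(data: bytes) -> list:
    """Return the byte length (1, 2, or 4) of each character in GB18030 data."""
    lengths = []
    i, n = 0, len(data)
    while i < n:
        b1 = data[i]
        if b1 <= 0x7F:
            size = 1                      # single-byte (ASCII) character
        elif 0x81 <= b1 <= 0xFE and i + 1 < n:
            b2 = data[i + 1]
            if 0x40 <= b2 <= 0x7E or 0x80 <= b2 <= 0xFE:
                size = 2                  # two-byte character
            elif 0x30 <= b2 <= 0x39:
                # Four-byte character: check the third and fourth bytes too.
                if i + 3 >= n:
                    raise ValueError(f"truncated four-byte character at offset {i}")
                b3, b4 = data[i + 2], data[i + 3]
                if not (0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
                    raise ValueError(f"invalid four-byte sequence at offset {i}")
                size = 4
            else:
                raise ValueError(f"invalid second byte 0x{b2:02X} at offset {i + 1}")
        else:
            raise ValueError(f"invalid or truncated sequence at offset {i}")
        lengths.append(size)
        i += size
    return lengths


def code_positions() -> int:
    """Count the byte sequences permitted by the ranges described above."""
    lead = 0xFE - 0x81 + 1                              # 126 lead bytes
    trail = (0x7E - 0x40 + 1) + (0xFE - 0x80 + 1)       # 63 + 127 = 190 trail bytes
    digit = 0x39 - 0x30 + 1                             # 10 digit bytes
    return 128 + lead * trail + (lead * digit) ** 2


print(gb18030_char_lengths(b"A\xd6\xd0\x90\x30\x81\x30"))  # [1, 2, 4]
print(code_positions())                                    # 1611668
```

The count of 1,611,668 is consistent with the figure of over 1.5 million code positions: 128 single-byte, 23,940 two-byte, and 1,587,600 four-byte positions.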
SEE ALSO
euc(5), gb2312(5), gbk(5), utf8(5)
Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange --
Extension for the basic set, March 2000.
The Unicode Standard, Version 3.0, The Unicode Consortium, 2000.
STANDARDS
The GB18030 encoding is believed to be compatible with GB 18030-2000.
BSD August 10, 2003 BSD