BIG5(5) BSD File Formats Manual BIG5(5)

NAME
big5 -- ``Big Five'' encoding for Traditional Chinese text

DESCRIPTION
``Big Five'' is the de facto standard for encoding Traditional Chinese text. Each character is represented by either one or two bytes.
Characters from the ASCII character set are represented as single bytes in the range 0x00 - 0x7F. Traditional Chinese characters are
represented by two bytes: the first in the range 0xA1 - 0xFE, the second in the range 0x40 - 0xFE.
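
The byte ranges above can be turned into a small scanner. The following is a minimal sketch in Python, not part of any system library: the function name split_big5 is hypothetical, and it checks only the ranges stated in this page, not whether a given two-byte value is actually assigned to a character.

```python
def split_big5(data: bytes) -> list:
    """Split a byte string into one- and two-byte units.

    Uses only the ranges given in this page:
      single byte: 0x00 - 0x7F
      two bytes:   first 0xA1 - 0xFE, second 0x40 - 0xFE
    """
    units = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # ASCII: one byte per character.
            units.append(data[i:i + 1])
            i += 1
        elif 0xA1 <= lead <= 0xFE:
            # First byte of a two-byte character; a second byte must follow.
            if i + 1 >= len(data):
                raise ValueError(f"truncated two-byte character at offset {i}")
            trail = data[i + 1]
            if not 0x40 <= trail <= 0xFE:
                raise ValueError(
                    f"invalid second byte 0x{trail:02X} at offset {i + 1}")
            units.append(data[i:i + 2])
            i += 2
        else:
            # 0x80 - 0xA0 and 0xFF are outside both ranges.
            raise ValueError(f"invalid first byte 0x{lead:02X} at offset {i}")
    return units
```

For example, split_big5(b"A\xa4\xa4B") returns [b"A", b"\xa4\xa4", b"B"]. Because the second-byte range includes 0x40 - 0x7E, a byte-wise search for an ASCII character can match the second half of a two-byte character, so text is safer to scan unit by unit as sketched here.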

SEE ALSO
euc(5), gb18030(5), utf8(5)

BSD August 7, 2003 BSD
GB18030(5) BSD File Formats Manual GB18030(5)

NAME
gb18030 -- GB 18030 encoding method for Chinese text

DESCRIPTION
The GB18030 encoding implements GB 18030-2000, a PRC national standard for the encoding of Chinese characters. It is a superset of the older
GB 2312-1980 and GBK encodings, and incorporates Unicode's Unihan Extension A completely. It also provides code space for all Unicode 3.0
code points.
Multibyte characters in the GB18030 encoding can be one byte, two bytes, or four bytes long. There are a total of over 1.5 million code
positions.
GB 11383-1981 (ASCII) characters are represented by single bytes in the range 0x00 to 0x7F.
Chinese characters are represented as either two bytes or four bytes. Characters that are represented by two bytes begin with a byte in the
range 0x81-0xFE and end with a byte either in the range 0x40-0x7E or 0x80-0xFE.
Characters that are represented by four bytes begin with a byte in the range 0x81-0xFE, have a second byte in the range 0x30-0x39, a third
byte in the range 0x81-0xFE and a fourth byte in the range 0x30-0x39.
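
The rules above are enough to decide how long a character is from its first two bytes. The following Python sketch illustrates this under the ranges stated in this page only; the names gb18030_char_length and count_code_positions are hypothetical, and the code does not check whether a given code position is assigned.

```python
def gb18030_char_length(data: bytes, offset: int = 0) -> int:
    """Return 1, 2 or 4: the length of the character starting at offset."""
    first = data[offset]
    if first <= 0x7F:
        return 1  # single byte, 0x00 to 0x7F
    if not 0x81 <= first <= 0xFE:
        raise ValueError(f"invalid first byte 0x{first:02X}")
    if offset + 1 >= len(data):
        raise ValueError("truncated character")
    second = data[offset + 1]
    if 0x40 <= second <= 0x7E or 0x80 <= second <= 0xFE:
        return 2  # two-byte form
    if 0x30 <= second <= 0x39:
        # Four-byte form: third byte 0x81-0xFE, fourth byte 0x30-0x39.
        if offset + 3 >= len(data):
            raise ValueError("truncated character")
        third, fourth = data[offset + 2], data[offset + 3]
        if 0x81 <= third <= 0xFE and 0x30 <= fourth <= 0x39:
            return 4
    raise ValueError("byte sequence does not match any GB18030 form")


def count_code_positions() -> int:
    """Count the code positions implied by the byte ranges above."""
    lead = 0xFE - 0x81 + 1                           # 126 lead bytes
    one_byte = 0x7F - 0x00 + 1                       # 128
    two_byte = lead * ((0x7E - 0x40 + 1) + (0xFE - 0x80 + 1))  # 126 * 190
    digit = 0x39 - 0x30 + 1                          # 10
    four_byte = lead * digit * lead * digit          # 126 * 10 * 126 * 10
    return one_byte + two_byte + four_byte
```

With these ranges, count_code_positions() returns 1611668 (128 one-byte, 23940 two-byte, and 1587600 four-byte positions), which agrees with the "over 1.5 million" figure given above.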

SEE ALSO
euc(5), gb2312(5), gbk(5), utf8(5)
Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange --
Extension for the basic set, March 2000.
The Unicode Standard, Version 3.0, The Unicode Consortium, 2000.

STANDARDS
The GB18030 encoding is believed to be compatible with GB 18030-2000.
BSD August 10, 2003 BSD