big5(5) [osx man page]

BIG5(5) 						      BSD File Formats Manual							   BIG5(5)

NAME
     big5 -- ``Big Five'' encoding for Traditional Chinese text

SYNOPSIS
     ENCODING "BIG5"

DESCRIPTION
     ``Big Five'' is the de facto standard for encoding Traditional Chinese
     text.  Each character is represented by either one or two bytes.
     Characters from the ASCII character set are represented as single bytes
     in the range 0x00 - 0x7F.  Traditional Chinese characters are
     represented by two bytes: the first in the range 0xA1 - 0xFE, the second
     in the range 0x40 - 0xFE.

SEE ALSO
     euc(5), gb18030(5), utf8(5)

BSD                              August 7, 2003                              BSD
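
The byte ranges given in the big5(5) DESCRIPTION can be checked mechanically. The Python sketch below is illustrative only and is not part of the system locale implementation. The function name big5_char_lengths is hypothetical, and the code validates only the ranges stated on this page; it does not check whether a given byte pair is actually assigned to a character.

```python
def big5_char_lengths(data: bytes) -> list[int]:
    """Split data into characters and return each character's byte length.

    Uses only the ranges stated in big5(5):
      single byte: 0x00 - 0x7F
      two bytes:   first 0xA1 - 0xFE, second 0x40 - 0xFE
    """
    lengths = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # ASCII character, one byte.
            lengths.append(1)
            i += 1
        elif 0xA1 <= lead <= 0xFE:
            # Two-byte character: the second byte must exist and be in range.
            if i + 1 >= len(data) or not (0x40 <= data[i + 1] <= 0xFE):
                raise ValueError(f"truncated or invalid second byte after offset {i}")
            lengths.append(2)
            i += 2
        else:
            # 0x80 - 0xA0 and 0xFF are not valid first bytes per this page.
            raise ValueError(f"invalid first byte 0x{lead:02X} at offset {i}")
    return lengths
```

As a usage example, assuming Python's bundled "big5" codec, big5_char_lengths(b"A" + "中文".encode("big5")) returns [1, 2, 2].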


GB18030(5)						      BSD File Formats Manual							GB18030(5)

NAME
     gb18030 -- GB 18030 encoding method for Chinese text

SYNOPSIS
     ENCODING "GB18030"

DESCRIPTION
     The GB18030 encoding implements GB 18030-2000, a PRC national standard
     for the encoding of Chinese characters.  It is a superset of the older
     GB 2312-1980 and GBK encodings, and incorporates Unicode's Unihan
     Extension A completely.  It also provides code space for all Unicode 3.0
     code points.

     Multibyte characters in the GB18030 encoding can be one byte, two bytes,
     or four bytes long.  There are a total of over 1.5 million code
     positions.

     GB 11383-1981 (ASCII) characters are represented by single bytes in the
     range 0x00 to 0x7F.

     Chinese characters are represented as either two bytes or four bytes.
     Characters that are represented by two bytes begin with a byte in the
     range 0x81-0xFE and end with a byte either in the range 0x40-0x7E or
     0x80-0xFE.

     Characters that are represented by four bytes begin with a byte in the
     range 0x81-0xFE, have a second byte in the range 0x30-0x39, a third byte
     in the range 0x81-0xFE and a fourth byte in the range 0x30-0x39.

SEE ALSO
     euc(5), gb2312(5), gbk(5), utf8(5)

     Chinese National Standard GB 18030-2000: Information Technology --
     Chinese ideograms coded character set for information interchange --
     Extension for the basic set, March 2000.

     The Unicode Standard, Version 3.0, The Unicode Consortium, 2000.

STANDARDS
     The GB18030 encoding is believed to be compatible with GB 18030-2000.

BSD                              August 10, 2003                             BSD
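
The one-, two-, and four-byte forms described in gb18030(5) can be told apart by inspecting the first two bytes. The Python sketch below is illustrative only and is not the system's implementation. The names gb18030_char_length and gb18030_code_positions are hypothetical, and the code checks only the byte ranges stated on this page, not whether a sequence is assigned to a character. The second function multiplies out those ranges to show where the "over 1.5 million code positions" figure comes from.

```python
def gb18030_char_length(data: bytes, i: int = 0) -> int:
    """Return the byte length (1, 2, or 4) of the character starting at data[i]."""
    b0 = data[i]
    if b0 <= 0x7F:
        return 1  # single-byte (ASCII range)
    if not 0x81 <= b0 <= 0xFE:
        raise ValueError(f"invalid first byte 0x{b0:02X}")
    if i + 1 >= len(data):
        raise ValueError("truncated sequence")
    b1 = data[i + 1]
    if 0x40 <= b1 <= 0x7E or 0x80 <= b1 <= 0xFE:
        return 2  # two-byte form
    if 0x30 <= b1 <= 0x39:
        # Four-byte form: third byte 0x81-0xFE, fourth byte 0x30-0x39.
        if i + 3 >= len(data):
            raise ValueError("truncated sequence")
        b2, b3 = data[i + 2], data[i + 3]
        if 0x81 <= b2 <= 0xFE and 0x30 <= b3 <= 0x39:
            return 4
    raise ValueError("invalid byte sequence")


def gb18030_code_positions() -> int:
    """Count the code positions implied by the byte ranges on this page."""
    lead = 0xFE - 0x81 + 1                                  # 126 first-byte values
    two_byte_trail = (0x7E - 0x40 + 1) + (0xFE - 0x80 + 1)  # 63 + 127 = 190
    digit = 0x39 - 0x30 + 1                                 # 10 values
    one = 0x7F + 1                                          # 128 single-byte positions
    two = lead * two_byte_trail                             # 23,940
    four = lead * digit * lead * digit                      # 1,587,600
    return one + two + four
```

Under these ranges gb18030_code_positions() evaluates to 1,611,668, which is consistent with the "over 1.5 million" figure in the DESCRIPTION.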