04-25-2014
Traditional Chinese Multi-Byte issue
I am trying to load a specific Traditional Chinese character on an HP-UX platform for a Taiwan database, but the data is converted to ?. The database has existed since 2008, and this is the first case I've seen where a user is unable to save a multi-byte character. I think this may be an issue with the character set file and wonder whether a conversion map needs to be modified. The database uses the Big-5 code page.
User keys in
陳 欩 仰
But the data is converting and storing as 陳 ? 仰 (second symbol changes to ?)
NOTE: I added spaces above to better visualize the data
---------- Post updated 04-25-14 at 01:55 PM ---------- Previous update was 04-24-14 at 03:57 PM ----------
Received an email today from a translation specialist at our ERP vendor, who stated:
"I searched for the character "欩" on a BIG-5 character code table (http://ash.jp/code/cn/big5tbl.htm) and also on a BIG-5 to Unicode table (http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT) and was unable to find it in these two tables.
So, the conclusion is that the character "欩" is not one of BIG-5 characters. The only way to display the character correctly is by using Unicode encoding to import/export."
In other words, the particular character we are trying to enter is not supported by the Big-5 code page.
Please consider this issue closed.
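For anyone who hits the same symptom, one quick way to check whether a character is representable in a code page is to try encoding it. The following is a minimal Python sketch, assuming Python's built-in `big5` codec as a stand-in for the database's code page; its mapping table may differ from the one used on HP-UX, so treat the result as a hint rather than proof. `unencodable` is a hypothetical helper name.

```python
def unencodable(text, encoding="big5"):
    """Return the characters in text that the given codec cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # The codec has no mapping for this character.
            missing.append(ch)
    return missing


if __name__ == "__main__":
    name = "陳欩仰"
    print("Not representable:", unencodable(name))
    # errors="replace" substitutes "?" for anything the codec cannot map,
    # which resembles the symptom seen in the database.
    print(name.encode("big5", errors="replace"))
```

If the character shows up in the returned list, no conversion map change within Big-5 will help, and a Unicode encoding is the likely way out, as the vendor suggested.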
telecode(5) File Formats Manual telecode(5)
NAME
telecode - A character encoding system (codeset) for Traditional Chinese
DESCRIPTION
The Telecode codeset (called Mitac Telex in early versions of the operating system) consists of 2 character planes. Each character plane
has 8836 character positions. In plane 1, standard characters occupy positions 0001 to 8045; the remaining 791 positions are for
user-defined characters. In plane 2, standard characters occupy positions 0001 to 8489; the remaining 346 positions are for user-defined
characters. Telecode uses 2-byte values to represent characters on both planes.
Plane 1 Character Encoding
To differentiate plane 1 code from plane 2 code, the most significant bit (MSB) is set on in both bytes of a plane 1 character code. The
following formula calculates the value of a plane 1 character from its position on the plane:
1st byte = M + 161
2nd byte = N + 161 - M x 94
In this formula, N is the position of the character on the plane and M = N / 94, using integer division (the remainder is discarded).
For example, if a character is at position 2502 on plane 1, its encoding value is BBDB, which is calculated as follows:
N = 2502, M = 2502/94 = 26
1st byte = 26 + 161 = 187 (hex BB)
2nd byte = 2502 + 161 - 26 x 94 = 219 (hex DB)
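The plane 1 formula can be sketched in a few lines of Python. This is an illustration only, not part of the reference page; `telecode_plane1` is a hypothetical name, and the function does no range checking beyond what `bytes()` enforces.

```python
def telecode_plane1(position):
    """Encode a plane 1 character position as two bytes."""
    m = position // 94                   # integer division, remainder discarded
    first = m + 161                      # MSB set in the first byte
    second = position + 161 - m * 94     # MSB set in the second byte
    return bytes([first, second])


if __name__ == "__main__":
    print(telecode_plane1(2502).hex().upper())  # BBDB
```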
Plane 2 Character Encoding
To differentiate plane 2 code from plane 1 code, the MSB of the first byte is set on and that of the second byte is set off for each plane
2 character code. The following formula calculates the value of a plane 2 character from its position:
1st byte = M + 161
2nd byte = N + 33 - M x 94
In this formula, N is the position of the character on the plane and M = N / 94, using integer division (the remainder is discarded).
For example, if a character is at position 2502 on plane 2, its encoding value is BB5B, which is calculated as follows:
N = 2502, M = 2502/94 = 26
1st byte = 26 + 161 = 187 (hex BB)
2nd byte = 2502 + 33 - 26 x 94 = 91 (hex 5B)
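Because the two planes differ only in the MSB of the second byte, a 2-byte code can be mapped back to its plane and position by inverting the formulas. The sketch below is an illustration under that reading of the text; `telecode_plane2` and `telecode_decode` are hypothetical names, and no validation of the input bytes is attempted.

```python
def telecode_plane2(position):
    """Encode a plane 2 character position as two bytes."""
    m = position // 94                   # integer division, remainder discarded
    return bytes([m + 161, position + 33 - m * 94])


def telecode_decode(code):
    """Return (plane, position) for a 2-byte Telecode value."""
    first, second = code
    m = first - 161
    if second & 0x80:
        # MSB set in the second byte: plane 1
        return 1, second - 161 + m * 94
    # MSB clear in the second byte: plane 2
    return 2, second - 33 + m * 94


if __name__ == "__main__":
    print(telecode_plane2(2502).hex().upper())      # BB5B
    print(telecode_decode(bytes.fromhex("BBDB")))   # (1, 2502)
    print(telecode_decode(bytes.fromhex("BB5B")))   # (2, 2502)
```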
Codeset Conversion
The following codeset converter pairs are available for converting Traditional Chinese characters between Telecode and other encoding
formats. Refer to iconv_intro(5) for an introduction to codeset conversion. For more information about the other codeset for which Telecode
is the input or output, see the reference page specified in the list item.
big5_telecode, telecode_big5
    Converting from and to the Big-5 codeset: big5(5).
    Note that Big-5 encoding is equivalent to the Microsoft code-page format used on PCs for Traditional Chinese. You can therefore use
    these converters to convert Traditional Chinese characters between PC code page format and Telecode encoding format. For more
    information on how the operating system supports PC code pages, see code_page(5).
dechanyu_telecode, telecode_dechanyu
    Converting from and to the DEC Hanyu codeset: dechanyu(5).
eucTW_telecode, telecode_eucTW
    Converting from and to Taiwanese Extended UNIX Code: eucTW(5).
Font Support for Telecode
The operating system supports Telecode only through conversion to another codeset.
SEE ALSO
Commands: locale(1)
Others: ascii(5), big5(5), Chinese(5), code_page(5), dechanyu(5), dechanzi(5), eucTW(5), GBK(5), i18n_intro(5), i18n_printing(5),
iconv_intro(5), l10n_intro(5), sbig5(5)
telecode(5)