Real UNICODE back to string


# 1  
Old 03-01-2011

I'm looking for the proper NLS_LANG setting for a delimited string of real Unicode hex code points that also contains multibyte characters. A small Java program converts them back to the local encoding.

e.g.: '0056;0065;006E;0064;006F;0072;0020;0054;0065;0078;0074;003A;5355;8EAB;6D4B;8BD5;4E2D;6587;5B57;7B26;FF0C;9884;795D;5927;8FD0;4F1A;987A;5229;53EC;5F00;'

I've tried:

export NLS_LANG=Japanese_Japan.UTF8
export NLS_LANG=American_America.UTF8

Neither works for the multibyte characters, since both point to UTF8 rather than to real Unicode code points.
In UTF8 the multibyte chars also seem to be expected to start with '00..'.
Somehow I can't get the hex code points handled; it always expects UTF8 code units.

Last edited by strolchFX; 03-01-2011 at 08:45 AM..
# 2  
Old 03-01-2011
First off, UTF-8 is a variable-width encoding: each character takes from 1 to 4 bytes.

To set unicode correctly check out:
A Quick Primer On Unicode and Software Internationalization Under Linux and UNIX
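To make the 1 to 4 byte point concrete, here is a minimal Java sketch that encodes a few sample code points and reports how many UTF-8 bytes each one needs. The class name `Utf8Lengths`, the helper `utf8Length`, and the sample code points other than U+0056 and U+5355 are picked for illustration only and do not come from the original program.

```java
import java.nio.charset.StandardCharsets;

class Utf8Lengths {
    // Number of UTF-8 bytes needed to encode a single Unicode code point.
    // (Hypothetical helper, for illustration only.)
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII letter, accented Latin letter, CJK ideograph, supplementary-plane character
        int[] samples = {0x0056, 0x00E9, 0x5355, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %d byte(s)%n", cp, utf8Length(cp));
        }
    }
}
```

The code point value itself (e.g. 5355) never appears literally in the UTF-8 byte sequence, which may be why the hex code points and the UTF-8 code units do not line up.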
# 3  
Old 03-01-2011
It seems our issue is that Java's getBytes() does not return hexadecimal code points; on our Solaris box it always returns UTF8 code units.
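For what it's worth, `String.getBytes()` with no argument encodes the string in the platform default charset, so it returns encoded bytes rather than code point values. One possible approach, sketched below under the assumption that the input is always a `;`-delimited list of hex code points as in the first post, is to parse each field as a code point and only then pick an explicit charset for output. The class `CodePointDecoder` and its `decode`/`encode` methods are hypothetical names, not the poster's actual program.

```java
import java.nio.charset.StandardCharsets;

class CodePointDecoder {
    // Turn "0056;0065;...;" into a String, one hex code point per field.
    static String decode(String delimited) {
        StringBuilder sb = new StringBuilder();
        for (String field : delimited.split(";")) {
            String hex = field.trim();
            if (hex.isEmpty()) {
                continue; // skip empty fields and whitespace-only fields
            }
            sb.appendCodePoint(Integer.parseInt(hex, 16));
        }
        return sb.toString();
    }

    // Render a String back as ';'-terminated hex code points (at least 4 digits each).
    static String encode(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> sb.append(String.format("%04X;", cp)));
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "0056;0065;006E;0064;006F;0072;0020;0054;0065;0078;0074;003A;5355;8EAB;";
        String text = decode(input);

        // Choose the charset explicitly instead of relying on the platform default.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);

        System.out.println(encode(text));
        System.out.println("UTF-8 bytes: " + utf8.length + ", UTF-16BE bytes: " + utf16.length);
    }
}
```

With this split, the locale or NLS_LANG setting only matters at the point where the decoded string is written out or stored, not while the hex code points are being parsed.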