Concerned about C and UNICODE
Dear experts,
While developing a C UNICODE application under AIX 5.3, I encountered the following problem, and after days of investigation I still could not find a solution. Please note that the application is fully wchar_t based (not UTF-8) and that I could compile and run it without any problem on SunOS. I managed to isolate the problem in a simple C program: Code:
#include <stdio.h>
#include <locale.h>
#include <wchar.h>

int main()
{
    wchar_t arab[4] = { 1583, 1575, 1605, 0 };  /* U+062F U+0627 U+0645 */
    wchar_t engl[4] = { 65, 66, 67, 0 };        /* "ABC" */
    wchar_t temp[4] = { 0, 0, 0, 0 };

    printf("\n#1 copy arab into temp");
    wsprintf(temp, "%S", arab);
    printf("\narab bytes : ");
    for (int i=0; i<4; i++) printf("%d ", (int) arab[i]);
    printf("\ntemp bytes : ");
    for (int i=0; i<4; i++) printf("%d ", (int) temp[i]);
    printf("\n");

    printf("\n#2 copy engl into temp");
    wsprintf(temp, "%S", engl);
    printf("\nengl bytes : ");
    for (int i=0; i<4; i++) printf("%d ", (int) engl[i]);
    printf("\ntemp bytes : ");
    for (int i=0; i<4; i++) printf("%d ", (int) temp[i]);
    printf("\n\n");

    return 0;
}
Code:
#1 copy arab into temp
arab bytes : 1583 1575 1605 0
temp bytes : 32 0 0 0

#2 copy engl into temp
engl bytes : 65 66 67 0
temp bytes : 65 66 67 0

It seems to be related to the installed character sets and to the locale configuration... but then, why use wchar_t strings? This data type and the C functions using it (like wsprintf) are supposed to work whatever the language.
So, my questions are:
- What should I do to make this example work?
- Are the C wchar_t functions really character set independent? (That is the case on Windows and SunOS.)
- If necessary, how do I install and use additional character sets?
Thank you very much for your input about this,
Best regards,
Thomas Gilbert
I cannot find any manpage for wsprintf on my system at all, and when compiling your example the linker cannot find it. Is it possible that this is not a standard function, and therefore it varies from system to system?
My compiler goes bananas when you declare variables in a for statement like that, too. Apparently that syntax is only valid from C99 onwards (and in C++), so a compiler in an older C mode rejects it.
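On the question of whether wsprintf is standard: the wide-character formatting function defined by ISO C is swprintf, which takes a buffer size and a wide format string. The following is only a sketch of how the copy step from the example might look with it. The helper name copy_wide is hypothetical and does not appear anywhere in the thread.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: copy the wide string src into dst using the
   standard swprintf. dst_len is the capacity of dst in wide characters,
   including the terminating L'\0'.
   Returns 0 on success, -1 if the result did not fit or formatting failed. */
int copy_wide(wchar_t *dst, size_t dst_len, const wchar_t *src)
{
    /* The format string is itself wide (L"..."), and %ls expects a
       wchar_t * argument. */
    int n = swprintf(dst, dst_len, L"%ls", src);
    return (n < 0) ? -1 : 0;
}
```

With the arrays from the example, the call would be copy_wide(temp, 4, arab). Unlike the wsprintf call, the buffer size is passed explicitly, so an oversized source is reported as an error instead of overrunning temp.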
You are right, declaring the i variable inside the for statement is not standard in pre-C99 C... but I use cpp :-)
Here is the code without it: Code:
#include <stdio.h>
#include <locale.h>
#include <wchar.h>

int main()
{
    wchar_t arab[4] = { 1583, 1575, 1605, 0 };  /* U+062F U+0627 U+0645 */
    wchar_t engl[4] = { 65, 66, 67, 0 };        /* "ABC" */
    wchar_t temp[4] = { 0, 0, 0, 0 };
    int i = 0;

    printf("\n#1 copy arab into temp");
    wsprintf(temp, "%S", arab);
    printf("\narab bytes : ");
    for (i=0; i<4; i++) printf("%d ", (int) arab[i]);
    printf("\ntemp bytes : ");
    for (i=0; i<4; i++) printf("%d ", (int) temp[i]);
    printf("\n");

    printf("\n#2 copy engl into temp");
    wsprintf(temp, "%S", engl);
    printf("\nengl bytes : ");
    for (i=0; i<4; i++) printf("%d ", (int) engl[i]);
    printf("\ntemp bytes : ");
    for (i=0; i<4; i++) printf("%d ", (int) temp[i]);
    printf("\n\n");

    return 0;
}
Thomas
Corona - see mbstowcs, which is ANSI C99...
tgilbert - try mbstowcs, but mind which locale you are set to. These functions are sensitive to that, and I don't see where you called setlocale(). Also, the conversion of characters is subject to change - see the man page warning. If you were on an old AIX system, wide chars/data written to files from that system may have problems on a newer system. YMMV. I'm not an AIX expert...
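As a rough illustration of the mbstowcs suggestion, here is a minimal sketch of a conversion wrapper. The name to_wide is hypothetical. The result depends on the LC_CTYPE locale in effect when it is called, which is why a prior setlocale() call matters.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert the multibyte string src to wide
   characters according to the current LC_CTYPE locale.
   dst_len is the capacity of dst in wide characters, including the
   terminator. Returns the number of wide characters written (not
   counting the terminator), or (size_t)-1 if dst has no room or src
   contains a byte sequence that is invalid in the current locale. */
size_t to_wide(wchar_t *dst, size_t dst_len, const char *src)
{
    size_t n;

    if (dst_len == 0)
        return (size_t)-1;

    /* Leave one slot free: mbstowcs does not terminate the output
       when it fills the whole buffer. */
    n = mbstowcs(dst, src, dst_len - 1);
    if (n == (size_t)-1)
        return n;

    dst[n] = L'\0';
    return n;
}
```

A program would typically call setlocale(LC_ALL, "") once at startup and only then call to_wide. Without that call the program runs in the "C" locale, where the treatment of non-ASCII bytes is implementation-dependent.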
Ok, I eventually managed to make it work.
Actually, I had several problems:
- I had to install the UTF-8 codepage sets.
- I had to call setlocale(LC_ALL, "en_US.UTF-8") in my program (or call it with "" and use the LANG environment variable). I did not have to under SunOS and Windows.
- I had to get rid of my own libiconv.a library and use the one provided by AIX (mine was OK under SunOS and Windows).
Thomas
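The setlocale step from the list above could be sketched roughly as follows. The helper name init_locale and the order of the candidates are assumptions made for illustration. The thread only states that either "en_US.UTF-8" or "" (with LANG set) worked on AIX.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: select a locale for the whole program.
   Tries the environment first (LANG / LC_* variables), then the
   explicit name used in the thread, and finally "C", which every
   conforming implementation provides.
   Returns the name of the locale that was set, or NULL if every
   attempt failed. */
const char *init_locale(void)
{
    static const char *const candidates[] = { "", "en_US.UTF-8", "C" };
    size_t i;

    for (i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        /* setlocale returns NULL when the requested locale is not
           installed or cannot be honoured. */
        const char *name = setlocale(LC_ALL, candidates[i]);
        if (name != NULL)
            return name;
    }
    return NULL;
}
```

Calling init_locale() at the top of main and printing the returned name is a quick way to see which locale the program actually ended up with, for instance when the UTF-8 locale is not installed on the machine.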