Convert UTF-8 encoded hex value to a character
Post 302252168 by cbkihong on Tuesday 28th of October 2008 09:52:57 PM
That means the decoding process is treating the message as ISO-8859-1 (or ASCII) rather than UTF-8. There is no "conversion" going on here; the decoding simply fails because it assumes the wrong encoding.

Have you investigated what is preventing the message from being interpreted as UTF-8? For instance, did you check the encoding in the mail header? Was it erroneously specified as something other than UTF-8? You could also try other UTF-8 mail to see whether the problem is limited to one particular message (sometimes a misconfigured mail user agent is the culprit) or is a wider issue. Try switching mail clients and see whether you can still reproduce it.
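To make the point concrete, here is a minimal Python sketch of both checks: reading the charset declared in the mail header, and seeing what the same body bytes look like when decoded under a wrong assumption. The message in `raw` is a made-up example, not the mail from this thread, and the variable names are only for illustration.

```python
from email import message_from_bytes

# Hypothetical raw message (an assumption for this sketch):
# the body is UTF-8 and the header declares it correctly.
raw = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"f\xc3\xbcr\r\n"
)

msg = message_from_bytes(raw)
declared = msg.get_content_charset()   # charset named in the Content-Type header
body = msg.get_payload(decode=True)    # body as raw bytes, not yet decoded to text

right = body.decode(declared)          # decoded with the declared charset
wrong = body.decode("iso-8859-1")      # decoded with a wrong assumption

print(declared)        # utf-8
print(right.strip())   # für
print(wrong.strip())   # fÃ¼r
```

The bytes are identical in both cases; only the assumed encoding differs. If the header of the problem mail names something other than UTF-8, or names nothing at all, a mail client may well fall back to the second behaviour.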
 

10 More Discussions You Might Find Interesting

1. Programming

Howto convert Ascii -> UTF-8 & back C++

While working with Russian text under FreeBSD and MySQL, I need to convert a string from MySQL to Unicode. I've just started with C++ under FreeBSD, so please explain how I can get the ASCII code of a char variable, and also how I can put a character into a variable with the specified ascii... (3 Replies)
Discussion started by: macron
3 Replies

2. Red Hat

Can't convert 7bit ASCII to UTF-8

Hello, I am trying to convert a 7-bit ASCII file to UTF-8. I have used iconv before, but for some reason it can't recognize the file and reports an unknown file encoding. When I used the ascii2uni package with a different package, ./ascii2uni -a K -a I -a J -a X test_file > new_test_file It still... (2 Replies)
Discussion started by: rockf1bull
2 Replies

3. Shell Programming and Scripting

How to modify character to UTF-8 in shell script?

I have a shell script that loads data from a text file into a database. The text file contains some non-ASCII characters like ü. How can I convert these characters to UTF-8 codes before loading them into the DB? (5 Replies)
Discussion started by: vel4ever
5 Replies

4. Shell Programming and Scripting

Convert hex to decimal

Can someone help me convert hex streams to decimal values using a Perl script? Hex value: $my_hex_stream="0c07ac14001676"; Every hex value in the above stream should be converted to decimal and separated by commas. The output should be: 12,07,172,20,00,22,118 (2 Replies)
Discussion started by: Arun_Linux
2 Replies

5. UNIX for Dummies Questions & Answers

Issue with UTF-8 BOM character in text file

Sometimes we receive Excel files containing French/Japanese characters by mail, and these files are manually transferred to the server using SFTP (security is not a huge concern here). The data is changed to text format using Notepad before it is transferred. Problem is: When saving... (4 Replies)
Discussion started by: jawsnnn
4 Replies

6. Linux

Help to Convert file from UNIX UTF-8 to Windows UTF-16

Hi, I have tried to convert a UTF-8 file to a Windows UTF-16 file from a Unix machine as below: unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt and I am getting some Chinese characters, as below, when I opened the converted file on a Windows machine. LANG=en_US.UTF-8... (3 Replies)
Discussion started by: phanidhar6039
3 Replies

7. Shell Programming and Scripting

Trying to convert utf-8 to WINDOWS-1251

Hello all, I have a UTF-8 file that I am trying to convert to WINDOWS-1251 on Linux, without any success. The file is reported as UTF-8; when I run: file -bi test.txt it gives me: text/plain; charset=utf-8. When I try to convert the file I run: /usr/bin/iconv -f UTF-8 -t WINDOWS-1251 test.txt >... (1 Reply)
Discussion started by: umen
1 Reply

8. UNIX for Advanced & Expert Users

UTF-8,16,32 character lengths using awk

Hi All, I am trying to obtain a count of characters using awk, but the "length" function returns 1 for 2-byte or 3-byte characters as well, unlike the wc -c command. I have tried the commands below within an awk function, but they do not seem to work: { cmd="wc -c "stringtocheck ( cmd )... (6 Replies)
Discussion started by: tostay2003
6 Replies

9. Shell Programming and Scripting

Convert UTF-8 file to ASCII/ISO8859-1 OR replace characters

I am trying to develop a script which will work on a source UTF-8 file and perform one or more of the following. It will accept the target encoding as an argument, e.g. US-ASCII or ISO-8859-1, etc. 1. It should replace all occurrences of characters outside the target character set by " " (space) or... (3 Replies)
Discussion started by: hemkiran.s
3 Replies

10. UNIX for Beginners Questions & Answers

Convert files to UTF-8 on AIX 7.1

Dears, I have a shell script - working perfectly on Oracle Linux - that detects the encoding (the charset to be exact) of the files in a specified directory using the "file" command (The file command outputs the charset in Linux, but doesn't do that in AIX), then if the file isn't a UTF-8 text... (4 Replies)
Discussion started by: JeanM-1
4 Replies
libcaca character set conversions(3caca)			      libcaca				  libcaca character set conversions(3caca)

NAME
       libcaca character set conversions -

   Functions
       __extern uint32_t caca_utf8_to_utf32 (char const *, size_t *)
           Convert a UTF-8 character to UTF-32.
       __extern size_t caca_utf32_to_utf8 (char *, uint32_t)
           Convert a UTF-32 character to UTF-8.
       __extern uint8_t caca_utf32_to_cp437 (uint32_t)
           Convert a UTF-32 character to CP437.
       __extern uint32_t caca_cp437_to_utf32 (uint8_t)
           Convert a CP437 character to UTF-32.
       __extern char caca_utf32_to_ascii (uint32_t)
           Convert a UTF-32 character to ASCII.
       __extern int caca_utf32_is_fullwidth (uint32_t)
           Tell whether a UTF-32 character is fullwidth.

Detailed Description
       These functions perform conversions between usual character sets.

Function Documentation
   __extern uint32_t caca_utf8_to_utf32 (char const *s, size_t *bytes)
       Convert a UTF-8 character read from a string and return its value in
       the UTF-32 character set. If the second argument is not null, the total
       number of read bytes is written in it.

       If a null byte was reached before the expected end of the UTF-8
       sequence, this function returns zero and the number of read bytes is
       set to zero.

       This function never fails, but its behaviour with illegal UTF-8
       sequences is undefined.

       Parameters:
           s      A string containing the UTF-8 character.
           bytes  A pointer to a size_t to store the number of bytes in the
                  character, or NULL.

       Returns:
           The corresponding UTF-32 character, or zero if the character is
           incomplete.

       Referenced by caca_put_str().

   __extern size_t caca_utf32_to_utf8 (char *buf, uint32_t ch)
       Convert a UTF-32 character read from a string and write its value in
       the UTF-8 character set into the given buffer.

       This function never fails, but its behaviour with illegal UTF-32
       characters is undefined.

       Parameters:
           buf  A pointer to a character buffer where the UTF-8 sequence will
                be written.
           ch   The UTF-32 character.

       Returns:
           The number of bytes written.

   __extern uint8_t caca_utf32_to_cp437 (uint32_t ch)
       Convert a UTF-32 character read from a string and return its value in
       the CP437 character set, or '?' if the character has no equivalent.

       This function never fails.

       Parameters:
           ch  The UTF-32 character.

       Returns:
           The corresponding CP437 character, or '?' if not representable.

   __extern uint32_t caca_cp437_to_utf32 (uint8_t ch)
       Convert a CP437 character read from a string and return its value in
       the UTF-32 character set, or zero if the character is a CP437 control
       character.

       This function never fails.

       Parameters:
           ch  The CP437 character.

       Returns:
           The corresponding UTF-32 character, or zero if not representable.

   __extern char caca_utf32_to_ascii (uint32_t ch)
       Convert a UTF-32 character into an ASCII character. When no equivalent
       exists, a graphically close equivalent is sought.

       This function never fails, but its behaviour with illegal UTF-32
       characters is undefined.

       Parameters:
           ch  The UTF-32 character.

       Returns:
           The corresponding ASCII character, or a graphically close
           equivalent if found, or '?' if not representable.

   __extern int caca_utf32_is_fullwidth (uint32_t ch)
       Check whether the given UTF-32 character should be printed at twice
       the normal width (fullwidth characters). If the character is unknown or
       if its status cannot be decided, it is treated as a standard-width
       character.

       This function never fails.

       Parameters:
           ch  The UTF-32 character.

       Returns:
           1 if the character is fullwidth, 0 otherwise.

       Referenced by caca_put_char(), and caca_put_str().

Author
       Generated automatically by Doxygen for libcaca from the source code.

Version 0.99.beta18		  Fri Apr 6 2012	  libcaca character set conversions(3caca)
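As a rough illustration of what the description of caca_utf8_to_utf32 amounts to, here is a small Python sketch that decodes one UTF-8 character from a byte string and returns its code point together with the number of bytes read. It is not libcaca's code and does not call the library; the function names `utf8_to_utf32` and `utf32_to_utf8` are hypothetical stand-ins. It assumes that running out of input plays the role of the null byte in the C version, and, unlike the C function, it raises an error on an invalid lead byte. Continuation bytes are not validated.

```python
from typing import Tuple


def utf8_to_utf32(s: bytes) -> Tuple[int, int]:
    """Decode the first UTF-8 character in s.

    Returns (code_point, bytes_read), or (0, 0) if the sequence is
    cut short, mirroring the "incomplete character" case in the man page.
    """
    if not s:
        return 0, 0
    lead = s[0]
    # The lead byte determines the sequence length and its payload bits.
    if lead < 0x80:
        length, cp = 1, lead
    elif (lead & 0xE0) == 0xC0:
        length, cp = 2, lead & 0x1F
    elif (lead & 0xF0) == 0xE0:
        length, cp = 3, lead & 0x0F
    elif (lead & 0xF8) == 0xF0:
        length, cp = 4, lead & 0x07
    else:
        # Assumption of this sketch: reject instead of undefined behaviour.
        raise ValueError("invalid UTF-8 lead byte: 0x%02x" % lead)
    if len(s) < length:
        return 0, 0
    # Each continuation byte contributes its low six bits.
    for byte in s[1:length]:
        cp = (cp << 6) | (byte & 0x3F)
    return cp, length


def utf32_to_utf8(cp: int) -> bytes:
    """Encode one code point as UTF-8 using Python's built-in codec."""
    return chr(cp).encode("utf-8")


cp, n = utf8_to_utf32(b"\xc3\xbc")   # the two bytes of "ü"
print(hex(cp), n)                    # 0xfc 2
print(utf32_to_utf8(cp))             # b'\xc3\xbc'
```

Returning the byte count alongside the code point is what lets a caller step through a longer string one character at a time, which is presumably how caca_put_str() uses the C function.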