UNIX for Advanced & Expert Users

Convert UTF-8 encoded hex value to a character

Post 302252172 by sumirmehta on Tuesday 28th of October 2008 10:07:15 PM
Hi there,
I am using Perl to retrieve messages from a mailbox. The Perl module for encoding/decoding (MIME::Base64, from MIME-Base64-3.07) is the one to be used, but the decoded result comes out as ASCII/ISO-8859-1, even though the mail header correctly shows the encoding as UTF-8.

In this case, if I want to convert this data back to UTF-8 (as detailed above), is there a command or other way to do it in Unix?
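
One possible reading of the problem, sketched below: decode_base64() returns raw octets, so a UTF-8 payload only looks like ISO-8859-1 until those octets are decoded into characters with the Encode module. The payload string and the helper name base64_to_text are made up for illustration and do not come from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Hypothetical helper: turn a base64 payload into a Perl character string.
# $charset is the charset named in the mail header (assumed UTF-8 here).
sub base64_to_text {
    my ($b64, $charset) = @_;
    $charset = 'UTF-8' unless defined $charset;
    my $octets = decode_base64($b64);    # raw bytes, no charset applied yet
    return decode($charset, $octets);    # bytes -> characters
}

# Illustrative payload: base64 of the UTF-8 bytes for "Grüße".
my $payload = 'R3LDvMOfZQ==';

my $octets = decode_base64($payload);
my $text   = base64_to_text($payload);

# Encode characters as UTF-8 again when printing.
binmode(STDOUT, ':encoding(UTF-8)');
printf "octets: %d, characters: %d\n", length($octets), length($text);
print "$text\n";
```

With this payload, the octet string has length 7 and the character string has length 5. If the bytes really were ISO-8859-1, passing 'ISO-8859-1' as the second argument would be the analogous step, and on the command line `iconv -f ISO-8859-1 -t UTF-8` converts a file the same way.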
 

10 More Discussions You Might Find Interesting

1. Programming

Howto convert Ascii -> UTF-8 & back C++

While working with Russian text under FreeBSD and MySQL, I need to convert a string from MySQL to the Unicode format. I've just started out with C++ under FreeBSD, so please explain how I can get the ASCII code of a char variable, and also how I can get a character into a variable with the specified ascii... (3 Replies)
Discussion started by: macron
3 Replies

2. Red Hat

Can't convert 7bit ASCII to UTF-8

Hello, I am trying to convert a 7bit ASCII file to UTF-8. I have used iconv before though it can't recognize it for some reason and says unknown file encoding. When I used ascii2uni package with different package, ./ascii2uni -a K -a I -a J -a X test_file > new_test_file It still... (2 Replies)
Discussion started by: rockf1bull
2 Replies

3. Shell Programming and Scripting

How to modify character to UTF-8 in shell script?

I have a shell script that loads data from a text file into a database. The text file contains some non-ASCII characters such as ü. How can I convert these characters to UTF-8 codes before loading them into the DB? (5 Replies)
Discussion started by: vel4ever
5 Replies

4. Shell Programming and Scripting

Convert hex to decimal

Can someone help me convert hex streams to decimal values using a Perl script? Hex value: $my_hex_stream="0c07ac14001676"; Every hex value in the above stream should be converted to decimal and separated by commas. The output should be: 12,07,172,20,00,22,118 (2 Replies)
Discussion started by: Arun_Linux
2 Replies

5. UNIX for Dummies Questions & Answers

Issue with UTF-8 BOM character in text file

Sometimes we receive Excel files containing French/Japanese characters by mail, and these files are manually transferred to the server using SFTP (security is not a huge concern here). The data is changed to text format using Notepad before it is transferred. Problem is: When saving... (4 Replies)
Discussion started by: jawsnnn
4 Replies

6. Linux

Help to Convert file from UNIX UTF-8 to Windows UTF-16

Hi, I have tried to convert a UTF-8 file to a Windows UTF-16 file from a Unix machine, as below: unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt and I am getting some Chinese characters, as below, when I open the converted file on a Windows machine. LANG=en_US.UTF-8... (3 Replies)
Discussion started by: phanidhar6039
3 Replies

7. Shell Programming and Scripting

Trying to convert utf-8 to WINDOWS-1251

Hello all, I have a UTF-8 file that I am trying to convert to WINDOWS-1251 on Linux, without any success. The file name is utf-8. When I run: file -bi test.txt it gives me: text/plain; charset=utf-8 When I try to convert the file, I run: /usr/bin/iconv -f UTF-8 -t WINDOWS-1251 test.txt >... (1 Reply)
Discussion started by: umen
1 Reply

8. UNIX for Advanced & Expert Users

UTF-8,16,32 character lengths using awk

Hi All, I am trying to obtain a count of characters using awk, but the "length" function returns a value of 1 for 2-byte or 3-byte characters as well, unlike the wc -c command. I have tried to use the commands below within an awk function, but they do not seem to work { cmd="wc -c "stringtocheck ( cmd )... (6 Replies)
Discussion started by: tostay2003
6 Replies

9. Shell Programming and Scripting

Convert UTF-8 file to ASCII/ISO8859-1 OR replace characters

I am trying to develop a script which will work on a source UTF-8 file and perform one or more of the following. It will accept the target encoding as an argument, e.g. US-ASCII or ISO-8859-1, etc. 1. It should replace all occurrences of characters outside the target character set by " " (space) or... (3 Replies)
Discussion started by: hemkiran.s
3 Replies

10. UNIX for Beginners Questions & Answers

Convert files to UTF-8 on AIX 7.1

Dears, I have a shell script - working perfectly on Oracle Linux - that detects the encoding (the charset to be exact) of the files in a specified directory using the "file" command (The file command outputs the charset in Linux, but doesn't do that in AIX), then if the file isn't a UTF-8 text... (4 Replies)
Discussion started by: JeanM-1
4 Replies
MIME::Base64(3pm)					 Perl Programmers Reference Guide					 MIME::Base64(3pm)

NAME
    MIME::Base64 - Encoding and decoding of base64 strings

SYNOPSIS
    use MIME::Base64;

    $encoded = encode_base64('Aladdin:open sesame');
    $decoded = decode_base64($encoded);

DESCRIPTION
    This module provides functions to encode and decode strings into and from the base64 encoding specified in RFC 2045 - MIME (Multipurpose Internet Mail Extensions). The base64 encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. A 65-character subset ([A-Za-z0-9+/=]) of US-ASCII is used, enabling 6 bits to be represented per printable character.

    The following primary functions are provided:

    encode_base64( $bytes )
    encode_base64( $bytes, $eol );
        Encode data by calling the encode_base64() function. The first argument is the byte string to encode. The second argument is the line-ending sequence to use. It is optional and defaults to "\n". The returned encoded string is broken into lines of no more than 76 characters each and it will end with $eol unless it is empty. Pass an empty string as second argument if you do not want the encoded string to be broken into lines.

        The function will croak with "Wide character in subroutine entry" if $bytes contains characters with code above 255. The base64 encoding is only defined for single-byte characters. Use the Encode module to select the byte encoding you want.

    decode_base64( $str )
        Decode a base64 string by calling the decode_base64() function. This function takes a single argument which is the string to decode and returns the decoded data.

        Any character not part of the 65-character base64 subset is silently ignored. Characters occurring after a '=' padding character are never decoded.

    If you prefer not to import these routines into your namespace, you can call them as:

        use MIME::Base64 ();
        $encoded = MIME::Base64::encode($decoded);
        $decoded = MIME::Base64::decode($encoded);

    Additional functions not exported by default:

    encode_base64url( $bytes )
    decode_base64url( $str )
        Encode and decode according to the base64 scheme for "URL applications" [1]. This is a variant of the base64 encoding which does not use padding, does not break the string into multiple lines and use the characters "-" and "_" instead of "+" and "/" to avoid using reserved URL characters.

    encoded_base64_length( $bytes )
    encoded_base64_length( $bytes, $eol )
        Returns the length that the encoded string would have without actually encoding it. This will return the same value as "length(encode_base64($bytes))", but should be more efficient.

    decoded_base64_length( $str )
        Returns the length that the decoded string would have without actually decoding it. This will return the same value as "length(decode_base64($str))", but should be more efficient.

EXAMPLES
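    As a minimal sketch of the functions described above (not part of the original manual page), the following exercises the $eol argument, the URL-safe variant, the length helpers and the wide-character check. It assumes a MIME::Base64 version that provides the non-default functions, such as the one shipped with perl v5.18.2; the sample strings are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64
                    encode_base64url decode_base64url
                    encoded_base64_length decoded_base64_length);
use Encode qw(encode);

my $bytes = 'Aladdin:open sesame';

# Default $eol is "\n", so the encoded string ends with a newline.
my $with_eol = encode_base64($bytes);
# An empty $eol gives a single unbroken string.
my $one_line = encode_base64($bytes, '');
print $with_eol;        # QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# URL-safe variant: "-" and "_" replace "+" and "/", and padding is dropped.
my $std = encode_base64("\xFB\xFF", '');    # "+/8="
my $url = encode_base64url("\xFB\xFF");     # "-_8"
my $back = decode_base64url($url);          # the original two bytes

# Length helpers report sizes without doing the conversion.
my $enc_len = encoded_base64_length('x' x 100);   # two lines plus two newlines
my $dec_len = decoded_base64_length($one_line);   # length of $bytes

# Characters above 255 must be turned into bytes first.
my $wide = "\x{263A}";
my $ok   = eval { encode_base64($wide); 1 };
my $err  = $ok ? '' : $@;                   # "Wide character in subroutine entry ..."
my $safe = encode_base64(encode('UTF-8', $wide), '');

print "$std $url $enc_len $dec_len $safe\n";
```

    The last line shows the approach the manual recommends: pick a byte encoding with Encode first, then base64-encode the resulting bytes.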
    If you want to encode a large file, you should encode it in chunks that are a multiple of 57 bytes. This ensures that the base64 lines line up and that you do not end up with padding in the middle. 57 bytes of data fills one complete base64 line (76 == 57*4/3):

        use MIME::Base64 qw(encode_base64);

        open(FILE, "/var/log/wtmp") or die "$!";
        while (read(FILE, $buf, 60*57)) {
            print encode_base64($buf);
        }

    or if you know you have enough memory

        use MIME::Base64 qw(encode_base64);
        local($/) = undef;  # slurp
        print encode_base64(<STDIN>);

    The same approach as a command line:

        perl -MMIME::Base64 -0777 -ne 'print encode_base64($_)' <file

    Decoding does not need slurp mode if every line contains a multiple of four base64 chars:

        perl -MMIME::Base64 -ne 'print decode_base64($_)' <file

    Perl v5.8 and better allow extended Unicode characters in strings. Such strings cannot be encoded directly, as the base64 encoding is only defined for single-byte characters. The solution is to use the Encode module to select the byte encoding you want. For example:

        use MIME::Base64 qw(encode_base64);
        use Encode qw(encode);

        $encoded = encode_base64(encode("UTF-8", "\x{FFFF}\n"));
        print $encoded;

COPYRIGHT
    Copyright 1995-1999, 2001-2004, 2010 Gisle Aas.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Distantly based on LWP::Base64 written by Martijn Koster <m.koster@nexor.co.uk> and Joerg Reichelt <j.reichelt@nexor.co.uk> and code posted to comp.lang.perl <3pd2lp$6gf@wsinti07.win.tue.nl> by Hans Mulder <hansm@wsinti07.win.tue.nl>

    The XS implementation uses code from metamail. Copyright 1991 Bell Communications Research, Inc. (Bellcore)

SEE ALSO
    MIME::QuotedPrint

    [1] <http://en.wikipedia.org/wiki/Base64#URL_applications>

perl v5.18.2					 2014-01-06					 MIME::Base64(3pm)