Post 303013712 by JeanM-1 on Monday 26th of February 2018 09:43:42 AM (UNIX for Beginners Questions & Answers)
Convert files to UTF-8 on AIX 7.1

Hello all,

I have a shell script, working perfectly on Oracle Linux, that detects the encoding (the charset, to be exact) of each file in a specified directory using the "file" command. If a file isn't UTF-8 text, the script converts it to UTF-8 with the "iconv" command. On Linux the "file" command reports the charset, but on AIX it does not.
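For reference, the Linux-side logic described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual script: the function names `needs_conversion` and `convert_dir` are hypothetical, and it relies on the Linux `file -b --mime-encoding` option, which is exactly the part that has no AIX equivalent.

```shell
#!/bin/sh
# Sketch (assumed, not the original script): detect each file's charset with
# the Linux file(1) command and convert anything that is not UTF-8 via iconv.

# Hypothetical helper: return 0 if the detected charset should be converted.
# ASCII is a subset of UTF-8, and "binary"/"unknown-8bit" cannot be handed
# to iconv as a source codeset, so those are skipped.
needs_conversion() {
    case "$1" in
        utf-8|us-ascii|binary|unknown-8bit) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: convert every regular file in a directory in place.
convert_dir() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Linux-only: prints e.g. "iso-8859-1" or "utf-8"
        charset=$(file -b --mime-encoding "$f")
        if needs_conversion "$charset"; then
            # Write to a temporary file first so a failed conversion
            # does not destroy the original.
            if iconv -f "$charset" -t UTF-8 "$f" > "$f.tmp.$$"; then
                mv "$f.tmp.$$" "$f"
            else
                rm -f "$f.tmp.$$"
            fi
        fi
    done
}
```

It would be called as `convert_dir /path/to/files`. The charset detection is isolated in a single line so that only that step would need replacing on AIX.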

I searched lots of forums and threads, but this seems extremely hard to do on AIX, since the "file" command doesn't output the charset there.

I also read this useful thread on this forum: Converting Unicode file to UTF8 format.

My problem: to convert my files to UTF-8 with the "iconv" command, I need to know the charset of the original file. How can I determine it?
Code:
iconv -f FromCode -t ToCode

(The ToCode can be replaced by UTF-8, but I need to guess the FromCode).
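One possible heuristic that does not depend on "file" reporting a charset is to let iconv itself act as the validator: if a file decodes cleanly as UTF-8, leave it alone; otherwise convert it from an assumed fallback codeset. The sketch below rests on several assumptions: the function names `is_utf8` and `to_utf8` are hypothetical, the fallback `ISO8859-1` is a guess that has to match the real source of the files, and the local iconv must accept a UTF-8 to UTF-8 conversion (not verified on AIX 7.1). It cannot truly identify an unknown 8-bit charset; it only separates "valid UTF-8" from "everything else".

```shell
#!/bin/sh
# Sketch (assumed heuristic): use iconv as a UTF-8 validity check.

# Hypothetical helper: return 0 if the file decodes cleanly as UTF-8.
# iconv exits non-zero when it meets an invalid input sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical helper: convert a file to UTF-8 in place unless it already is.
# $2 is the assumed source codeset; ISO8859-1 is only a default guess.
to_utf8() {
    file=$1
    from=${2:-ISO8859-1}
    if is_utf8 "$file"; then
        return 0            # already valid UTF-8 (or plain ASCII)
    fi
    # Convert to a temporary file first, replace the original on success.
    if iconv -f "$from" -t UTF-8 "$file" > "$file.tmp.$$"; then
        mv "$file.tmp.$$" "$file"
    else
        rm -f "$file.tmp.$$"
        return 1
    fi
}
```

A call such as `to_utf8 report.txt ISO8859-1` would then convert only files that fail the UTF-8 check. Note that almost any byte sequence is valid ISO8859-1, so a wrong fallback guess produces wrong characters rather than an error.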

Is there any way to do that?

Does anyone have a working script on AIX that does what I want to do?

Thank you,
Regards.

Moderator's Comments:
Please use CODE tags as required by forum rules!

Last edited by RudiC; 02-26-2018 at 11:29 AM.. Reason: Added CODE tags.
 
