Post 303013712 by JeanM-1 on Monday 26th of February 2018 09:43:42 AM (UNIX for Beginners Questions & Answers)
Convert files to UTF-8 on AIX 7.1

Hello all,

I have a shell script, working perfectly on Oracle Linux, that detects the encoding (the charset, to be exact) of each file in a specified directory using the "file" command. If a file isn't a UTF-8 text file, the script converts it to UTF-8 with the "iconv" command. The "file" command outputs the charset on Linux, but not on AIX.

I searched lots of forums and threads, but this seems extremely hard to do on AIX, since the "file" command there doesn't output the charset.

I also read this useful thread on this forum: Converting Unicode file to UTF8 format.

My problem: if I want to use the "iconv" command to convert my files to UTF-8, how can I determine the charset of the original file?
Code:
iconv -f FromCode -t ToCode

(The ToCode can be replaced by UTF-8, but I need to guess the FromCode).

Is there any way to do that?

Does anyone have a working script on AIX that does what I want to do?

Thank you,
Regards.
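
One possible workaround when "file" gives no charset is to let "iconv" itself act as the detector: try to read the file as UTF-8, and only convert if that fails. The following is a minimal sketch, not a tested AIX script. The function name to_utf8 and the fallback charset ISO8859-1 are assumptions for illustration, and it assumes the installed iconv has a UTF-8 to UTF-16 converter and returns a non-zero exit status on invalid input. It cannot tell one single-byte charset from another, so the fallback has to be known for the data in question.

```shell
#!/bin/sh
# to_utf8 SRC DST [FALLBACK]
# Hypothetical helper: write a UTF-8 version of SRC to DST (DST must differ from SRC).
# FALLBACK is the charset assumed when SRC is not valid UTF-8
# (ISO8859-1 here is only an assumed default, adjust it to your data).
to_utf8() {
    src=$1
    dst=$2
    fallback=${3:-ISO8859-1}

    # If iconv can decode SRC as UTF-8, it is already UTF-8 (or plain ASCII).
    if iconv -f UTF-8 -t UTF-16 "$src" >/dev/null 2>&1; then
        cp "$src" "$dst"
    else
        # Otherwise convert from the assumed fallback charset.
        iconv -f "$fallback" -t UTF-8 "$src" > "$dst"
    fi
}

# Example use over a directory (paths are placeholders):
#   for f in /some/dir/*.txt; do
#       to_utf8 "$f" "$f.utf8"
#   done
```

Checking for valid UTF-8 first matters because converting an already-UTF-8 file from a single-byte charset would double-encode every non-ASCII character.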

Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!

Last edited by RudiC; 02-26-2018 at 11:29 AM.. Reason: Added CODE tags.
 
