Full Discussion: Explain iconv command
Post 303032481 by RudiC on Tuesday 19th of March 2019 06:10:28 AM
Not sure I understand correctly: you want to iconv multi-byte UTF-8 records to ASCII but retain the record length? So if a two-byte representation (like Ñ) is converted to N, one space should be added, and for three bytes, two spaces, to keep the record length, PROVIDED the target representation is a one-byte char. This doesn't always hold, e.g. € -> EUR in ASCII.
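To make the byte arithmetic concrete, here is a small sketch. The characters are written as octal escapes so it doesn't depend on the terminal's encoding, and bytes_of is just a name made up for this example:

```shell
# bytes_of: hypothetical helper that prints the number of bytes in its argument.
bytes_of() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

n_tilde=$(printf '\303\221')      # U+00D1 (Ñ): two bytes in UTF-8
euro=$(printf '\342\202\254')     # U+20AC (€): three bytes in UTF-8

bytes_of "$n_tilde"   # 2: "N" plus one pad space keeps the length
bytes_of "$euro"      # 3: "EUR" is already three bytes, so padding would overshoot
bytes_of "N"          # 1
```

So padding only works out when the transliteration is a single ASCII character.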
If the above assumption is true, some conditioning ahead of the iconv might help, like
Code:
LC_ALL=C sed 's/[\xC0-\xDF]./& /g; s/[\xE0-\xEF]../&  /g' non-ascii.txt | iconv -f utf8 -t ASCII//TRANSLIT//IGNORE

This adds one space after each two-byte representation and two spaces after each three-byte representation. For longer / more exotic codes (four-byte sequences, for instance), it must be expanded equivalently.
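A sketch of how that expansion might look, not tested against every sed out there. The lead-byte classes are built with printf octal escapes so it doesn't rely on sed understanding \xHH, and a third expression covers four-byte sequences. pad_multibyte is a name made up for this example, and the sample input is invented:

```shell
# Lead-byte classes, built from octal escapes (POSIX printf).
lead2=$(printf '[\300-\337]')   # 0xC0-0xDF: first byte of a 2-byte sequence
lead3=$(printf '[\340-\357]')   # 0xE0-0xEF: first byte of a 3-byte sequence
lead4=$(printf '[\360-\367]')   # 0xF0-0xF7: first byte of a 4-byte sequence

# pad_multibyte: hypothetical filter, stdin to stdout. Appends (n - 1) spaces
# after every n-byte sequence so that a ONE-character transliteration keeps
# the record's byte length. Assumes the input is valid UTF-8.
pad_multibyte() {
    # LC_ALL=C makes sed work on single bytes instead of characters.
    LC_ALL=C sed \
        -e "s/${lead2}./& /g" \
        -e "s/${lead3}../&  /g" \
        -e "s/${lead4}.../&   /g"
}

# "A", U+00D1 (two bytes), "B", newline: five bytes in, one pad space added.
printf 'A\303\221B\n' | pad_multibyte | wc -c   # 6
```

It would then sit in front of the conversion the same way as the one-liner, e.g. pad_multibyte < non-ascii.txt | iconv -f utf8 -t ASCII//TRANSLIT. The caveat from above still applies: anything that transliterates to more than one character (€ -> EUR) ends up too long.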

ICONV(1)                       Linux User Manual                      ICONV(1)

NAME
       iconv - convert text from one character encoding to another

SYNOPSIS
       iconv [options] [-f from-encoding] [-t to-encoding] [inputfile]...

DESCRIPTION
       The iconv program reads in text in one encoding and outputs the text in another encoding. If no input files are given, or if it is given as a dash (-), iconv reads from standard input. If no output file is given, iconv writes to standard output.

       If no from-encoding is given, the default is derived from the current locale's character encoding. If no to-encoding is given, the default is derived from the current locale's character encoding.

OPTIONS
       -f from-encoding, --from-code=from-encoding
              Use from-encoding for input characters.

       -t to-encoding, --to-code=to-encoding
              Use to-encoding for output characters.

              If the string //IGNORE is appended to to-encoding, characters that cannot be converted are discarded and an error is printed after conversion.

              If the string //TRANSLIT is appended to to-encoding, characters being converted are transliterated when needed and possible. This means that when a character cannot be represented in the target character set, it can be approximated through one or several similar looking characters. Characters that are outside of the target character set and cannot be transliterated are replaced with a question mark (?) in the output.

       -l, --list
              List all known character set encodings.

       -c     Silently discard characters that cannot be converted instead of terminating when encountering such characters.

       -o outputfile, --output=outputfile
              Use outputfile for output.

       -s, --silent
              This option is ignored; it is provided only for compatibility.

       --verbose
              Print progress information on standard error when processing multiple files.

       -?, --help
              Print a usage summary and exit.

       --usage
              Print a short usage summary and exit.

       -V, --version
              Print the version number, license, and disclaimer of warranty for iconv.

EXIT STATUS
       Zero on success, nonzero on errors.

ENVIRONMENT
       Internally, the iconv program uses the iconv(3) function which in turn uses gconv modules (dynamically loaded shared libraries) to convert to and from a character set. Before calling iconv(3), the iconv program must first allocate a conversion descriptor using iconv_open(3). The operation of the latter function is influenced by the setting of the GCONV_PATH environment variable:

       *  If GCONV_PATH is not set, iconv_open(3) loads the system gconv module configuration cache file created by iconvconfig(8) and then, based on the configuration, loads the gconv modules needed to perform the conversion. If the system gconv module configuration cache file is not available then the system gconv module configuration file is used.

       *  If GCONV_PATH is defined (as a colon-separated list of pathnames), the system gconv module configuration cache is not used. Instead, iconv_open(3) first tries to load the configuration files by searching the directories in GCONV_PATH in order, followed by the system default gconv module configuration file. If a directory does not contain a gconv module configuration file, any gconv modules that it may contain are ignored. If a directory contains a gconv module configuration file and it is determined that a module needed for this conversion is available in the directory, then the needed module is loaded from that directory, the order being such that the first suitable module found in GCONV_PATH is used. This allows users to use custom modules and even replace system-provided modules by providing such modules in GCONV_PATH directories.

FILES
       /usr/lib/gconv
              Usual default gconv module path.

       /usr/lib/gconv/gconv-modules
              Usual system default gconv module configuration file.

       /usr/lib/gconv/gconv-modules.cache
              Usual system gconv module configuration cache.

CONFORMING TO
       POSIX.1-2001.

EXAMPLE
       Convert text from the ISO 8859-15 character encoding to UTF-8:

           $ iconv -f ISO-8859-15 -t UTF-8 < input.txt > output.txt

       The next example converts from UTF-8 to ASCII, transliterating when possible:

           $ echo abc ß α € àḃç | iconv -f UTF-8 -t ASCII//TRANSLIT
           abc ss ? EUR abc

SEE ALSO
       locale(1), iconv(3), nl_langinfo(3), charsets(7), iconvconfig(8)

COLOPHON
       This page is part of release 4.15 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest version of this page, can be found at https://www.kernel.org/doc/man-pages/.

GNU                               2018-02-02                          ICONV(1)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.