Full Discussion: Explain iconv command
Post 303032481 by RudiC on Tuesday 19th of March 2019 06:10:28 AM
Not sure I understand correctly: you want to iconv multi-byte UTF-8 records to ASCII but retain the record length? So if a two-byte representation (like Ñ) is converted to N, one space should be added, and for three bytes, two spaces, to keep the record length, PROVIDED the target representation is a one-byte char. This doesn't always hold, e.g. € -> EUR in ASCII.
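The byte counts behind this reasoning can be checked directly. A small sketch, assuming a POSIX shell with printf and wc; utf8_bytes is a hypothetical helper name, and the octal escapes spell out the UTF-8 bytes of Ñ and € so the result does not depend on the terminal's encoding:

```shell
# utf8_bytes (hypothetical helper): print the number of bytes produced by a
# printf format string, which may contain octal escapes such as \303\221.
utf8_bytes() {
    printf "$1" | wc -c | tr -d ' '
}

utf8_bytes 'N'              # plain ASCII letter
utf8_bytes '\303\221'       # Ñ encoded in UTF-8
utf8_bytes '\342\202\254'   # € encoded in UTF-8
```

The three calls print 1, 2 and 3, which is why one or two padding spaces are needed once the character collapses to a single ASCII byte.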
If the above assumption is true, some conditioning ahead of the iconv might help, like
Code:
LC_ALL=C sed 's/[\xC0-\xDF]./& /g; s/[\xE0-\xEF]../&  /g' non-ascii.txt | iconv -f utf8 -t ASCII//TRANSLIT//IGNORE

This adds one space after each two-byte representation and two spaces after each three-byte representation. For longer / more exotic codes, it must be expanded equivalently.
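Expanding the expression to four-byte sequences follows the same pattern. A minimal sketch, assuming GNU sed (for the \xHH escapes) and well-formed UTF-8 input; pad_utf8 is a hypothetical helper name, not something from the post above:

```shell
# pad_utf8 (hypothetical helper): after every n-byte UTF-8 sequence append
# n-1 spaces, so the record keeps its byte length once the sequence is
# replaced by a single ASCII character.
# LC_ALL=C makes sed treat the input as raw bytes.
pad_utf8() {
    LC_ALL=C sed -e 's/[\xC0-\xDF]./& /g' \
                 -e 's/[\xE0-\xEF]../&  /g' \
                 -e 's/[\xF0-\xF7].../&   /g'
}
```

Used as a filter, for example `pad_utf8 < non-ascii.txt | iconv -f utf8 -t ASCII//TRANSLIT`, it only helps where the transliteration yields exactly one character; cases like € -> EUR still change the length.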
iconv(1)						      General Commands Manual							  iconv(1)

NAME
iconv - codeset conversion

SYNOPSIS
iconv [-cs] -f frommap -t tomap [file]...
iconv -f fromcode [-cs] [-t tocode] [file]...
iconv -t tocode [-cs] [-f fromcode] [file]...
iconv -l

DESCRIPTION
The iconv command converts the encoding of characters in the input files from one codeset to another, and writes the results to standard output. If no input files are given, iconv reads from standard input. If - appears as an input file name, the iconv command reads standard input at that point. -- can be used to delimit the end of options (see getopt(3C)).

Options
The iconv command recognizes the following options:

-c            Omit any characters that are invalid in the codeset of the input file from the output. This is the default.

-f fromcode   Identify the codeset corresponding to option argument fromcode as the codeset that the input will be converted "from". The fromcode option argument must not contain a slash character. If this option is omitted, the codeset of the current locale will be used.

-f frommap    The frommap option argument must contain a slash character. It is interpreted as the pathname of a charmap file which is parsed to determine the codeset name defined in the charmap file.

-l            Write all the supported fromcode and tocode values to standard output as defined in the iconv configuration files. See iconv(3C) for details.

-s            Suppress any error messages written to standard error concerning invalid characters. The presence or absence of -s does not affect the exit status of iconv.

-t tocode     Identify the codeset corresponding to option argument tocode as the codeset that the input will be converted "to". The tocode option argument must not contain a slash character. If this option is omitted, the codeset of the current locale will be used.

-t tomap      The tomap option argument must contain a slash character. It is interpreted as the pathname of a charmap file which is parsed to determine the codeset name defined in the charmap file.

The fromcode and tocode names can be any of the base and alias names listed in the iconv configuration files. See iconv(3C) for details and the configuration files for a list of supported codeset names.

EXTERNAL INFLUENCES
Environment Variables
LANG provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the default value of "C" (see lang(5)) is used. If any of the internationalization variables contains an invalid setting, the iconv command will behave as if all internationalization variables are set to "C". See environ(5).

Setting LC_ALL to a non-empty string value overrides the values of all the other internationalization variables.

LC_CTYPE determines the interpretation of text as single and/or multi-byte characters, the classification of characters as printable, and the characters matched by character class expressions in regular expressions. During translation of the file, this variable is superseded by the use of the fromcode option argument.

LC_MESSAGES determines the locale that should be used to affect the format and contents of diagnostic messages written to standard error and informative messages written to standard output.

NLSPATH determines the location of message catalogs for the processing of LC_MESSAGES.

International Code Set Support
Single and multi-byte character codesets are supported.

RETURN VALUE
iconv returns the following values:

0     Successful completion.
>0    An error occurred.

EXAMPLES
Convert the contents of a file from codeset Roman8 to ISO 8859/1 and store the results in another file.

WARNINGS
If an input character does not have a valid equivalent in the codeset selected by the -t option (the "to" codeset), it is mapped to the "galley character", if it has been defined for that conversion (see genxlt(1) and iconv(3C)). If an input character does not belong to the codeset selected by the -f option (the "from" codeset), the command terminates.

FILES
User customizable configuration file
System configuration file, containing codenames supported by the operating system

AUTHOR
The iconv command was developed by HP.

SEE ALSO
getopt(3C), iconv(3C).

STANDARDS CONFORMANCE
iconv(1)
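
The EXAMPLES entry of the manual page describes a conversion between two codesets with the result stored in a file. A minimal sketch of that kind of call, assuming an iconv that accepts the names ISO-8859-1 and UTF-8 (codeset names differ between systems, so check the output of iconv -l first); to_utf8 and the file names in the comment are hypothetical:

```shell
# to_utf8 (hypothetical helper): convert ISO 8859-1 text on standard input
# to UTF-8 on standard output using the -f (from) and -t (to) options.
# The codeset names are assumptions; list the supported ones with: iconv -l
to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Typical use with files (placeholder names):
#   to_utf8 < latin1.txt > utf8.txt
```

Reading from standard input and writing to standard output keeps the helper usable in a pipeline, which matches the manual page's statement that iconv reads standard input when no input files are given.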