Explain iconv command


 
# 8  
Old 03-18-2019
dd can pad newline-terminated records to a fixed length with spaces, but only if the records are newline-terminated to begin with. Here is a utility that inserts a newline after every fixed-size block:

Code:
// build with: cc block.c -o block
#include <stdio.h>
#include <limits.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
        long int BLOCK=0;
        char *buf=NULL;

        if(argc < 2)
        {
                fprintf(stderr, "usage:  %s blocksize [inputfile]\n", argv[0]);
                exit(1);
        }

        if(argc >= 2) BLOCK=strtol(argv[1], NULL, 10);
        if(argc >= 3) {
                if(freopen(argv[2], "rb", stdin)==NULL)
                {
                        perror("Couldn't open");
                        return(1);
                }
        }

        if(BLOCK<=0)
        {
                fprintf(stderr, "Invalid block size %ld\n", BLOCK);
                return(1);
        }

        buf=malloc(BLOCK+1);
        if(buf == NULL)
        {
                perror("Couldn't allocate");
                return(1);
        }

//      fprintf(stderr, "block size %ld\n", BLOCK);

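        // The byte after each block is always a newline, so BLOCK+1 bytes are written.
        buf[BLOCK]='\n';
        // fread returns 1 only for a complete block; a short trailing block
        // (e.g. the file's final newline) is silently dropped.
        while(fread(buf,BLOCK,1,stdin))
                fwrite(buf,BLOCK+1,1,stdout);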

        free(buf);
        return(0);
}

Having done that, you can convert it with iconv, pad to the required length with dd, then remove the newlines with tr.

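In my example, tr stands in for iconv: it deletes the capital letters from aAabBbcccdDdeeEfffggghhhiIijjjkkklLlmmmnnn, and the rest of the pipeline pads each shortened record back to 3 bytes so the record length does not change.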

Code:
#!/bin/sh

BLOCKSIZE=3

# Add newlines using block utility
./block $BLOCKSIZE |
        # Remove characters, i.e. you'd put iconv here
        tr -d 'A-Z' |
        # Pad shrunken records with spaces
        dd ibs=$BLOCKSIZE cbs=$BLOCKSIZE conv=sync,block 2>/dev/null |
        # Remove newlines
        tr -d '\n'

Code:
$ cat fixed.txt
aAabBbcccdDdeeEfffggghhhiIijjjkkklLlmmmnnn

$ ./dump.sh < fixed.txt
aa bb cccdd ee fffggghhhii jjjkkkll mmmnnn   

$

You end up with one blank record at the end, full of spaces, which I haven't figured out how to avoid yet.
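
One possible way around the trailing blank record, offered as a sketch rather than a fix for the dd behaviour itself: do the padding with awk instead of dd. awk pads each line as it reads it and adds nothing at end of input. The function names split_records and pad_records are made up for illustration. split_records uses fold -b in place of the block utility, which assumes the records contain no newline bytes of their own.

```shell
#!/bin/sh

# Hypothetical helper: break the input into lines of $1 bytes each.
# Stands in for ./block; only safe when records contain no newlines.
split_records() {
    fold -b -w "$1"
}

# Hypothetical helper: right-pad every line with spaces to $1 bytes
# and print the lines back to back, without newlines.
# LC_ALL=C makes awk count bytes rather than characters.
pad_records() {
    LC_ALL=C awk -v n="$1" '
        BEGIN { fmt = "%-" n "s" }   # e.g. "%-3s" for n=3
        { printf fmt, $0 }
    '
}
```

Usage would look like: split_records 3 < fixed.txt | tr -d 'A-Z' | pad_records 3, with iconv in place of tr. Note that a record longer than the block size is left as is here, whereas dd conv=block truncates it.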
# 9  
Old 03-19-2019
Not sure I understand correctly: you want to iconv multi-byte UTF-8 records to ASCII but retain the record length? So if a two-byte representation (like Ñ) is converted to N, one space should be added, and for three bytes, two spaces, to keep the record length, PROVIDED the target representation is a one-byte char. This doesn't always hold, e.g. € -> EUR in ASCII.
If the above assumption is true, some conditioning ahead of the iconv might help, like
Code:
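# GNU sed (\xHH escapes): append one space after each two-byte sequence,
# two spaces after each three-byte sequence
LC_ALL=C sed 's/[\xC0-\xDF]./& /g; s/[\xE0-\xEF]../&  /g' non-ascii.txt | iconv -futf8 -tASCII//TRANSLIT//IGNORE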

This adds one space for each two-byte sequence and two for each three-byte sequence. For longer or more exotic codes (e.g. four-byte sequences, lead bytes \xF0-\xF4), it must be extended in the same way.
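
To check whether the record length really survived the conversion, something like the following sketch could be run on the newline-terminated output. check_record_lengths is a hypothetical helper name, and the expected length is whatever byte length the original records had. It would flag cases such as the € -> EUR expansion mentioned above.

```shell
#!/bin/sh

# Hypothetical helper: report every input line whose byte length
# differs from $1; exit status is 1 if any such line was found.
check_record_lengths() {
    LC_ALL=C awk -v n="$1" '
        BEGIN { bad = 0 }
        length($0) != n {
            printf "line %d: %d bytes\n", NR, length($0)
            bad = 1
        }
        END { exit bad }
    '
}
```

For example, piping the output of the sed and iconv command above into check_record_lengths 80 would list any record that is no longer 80 bytes long (80 is only a placeholder here).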