Fixing corrupted vcard files. Post 302247006 by Annihilannic on Tuesday 14th of October 2008 06:35:59 PM
As far as I can see my solution does what you describe:

Code:
$ cat testfile.vcard
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
 8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=
 A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
 8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=
 A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
 8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=
 A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
$ perl -pi.bak -e 'BEGIN { $/=""; } s/\n //gm' *.vcard
$ cat testfile.vcard
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n

There aren't any hidden/funny characters in the input files, are there? Check with cat -vet, which shows tabs as ^I, carriage returns as ^M and the end of each line as $.

man perlrun is the page you really need to look at for the command-line options. -p and -i are separate options that I combined for the sake of brevity.

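-p makes perl behave like sed or awk: it wraps the code in a loop that reads each input record, runs the code on it and prints the result. As in awk, a BEGIN block runs once before any input is processed. In that block I've redefined the input record separator ($/) to be an empty string. Strictly speaking that is "paragraph mode" rather than a true slurp: perl reads everything up to the next blank line as one record instead of reading line-by-line, so a file with no blank lines, like the test file above, arrives in one go. (undef $/, or the -0777 switch, would slurp the whole file regardless of blank lines.) Either way each record spans multiple lines, which allows us to do regex matches across line breaks; that is enough here as long as no fold directly follows a blank line. The BEGIN block is separate from the actual s/// command that does the search and replace.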

s/// is documented on the man perlop page.

I'm glad to see you don't just want spoonfeeding (all too common around here!).
 
