| Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here. |
Thanks. I am trying to see what happens here:
perl: this is obvious
-pi.bak: simply copy the current file to its name + .bak?
-e: there is no mention of this in man perl.
'BEGIN { $/=""; } s/\n //gm': the actual regex. I don't quite get it.
*.vcard: go through all these files?

I actually need to change the regex so that it not only removes the space at the beginning of a line, but removes the newline character before it as well. The only newline characters that should remain are those not followed by a space. In PHP that would be str_replace("\n ", "", $string); however, I cannot figure out how to modify the perl regex to do that. And regexes are hard to google for!

I do appreciate the code example, but I am also trying to learn a bit (unusual, I know). I very much appreciate your assistance and patience.
|
||||
|
As far as I can see my solution does what you describe: Code:
$ cat testfile.vcard
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=
A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=
A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=
A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
$ perl -pi.bak -e 'BEGIN { $/=""; } s/\n //gm' *.vcard
$ cat testfile.vcard
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=94=D7=A9=D7=95=D7=A8=D7=94 =D7=94=D7=A9=D7=A0=D7=99=D7=94 =D7=9B=D7=\n
There aren't any hidden/funny characters in the input files, are there? Check with cat -vet.

man perlrun is the page you really need to look at for the command-line options (including -e, which supplies the program text on the command line). -p and -i are separate options that I combined for the sake of brevity. -i.bak edits each file in place and keeps the original under the same name plus .bak. -p wraps the code in a loop that reads each record and prints it afterwards, much like awk or sed, and a BEGIN block runs once before any input is processed. In that block I've redefined the input record separator $/ to be an empty string. Strictly speaking this selects paragraph mode: perl reads one blank-line-separated block at a time rather than line-by-line (to slurp the entire file in one go you would use undef $/ or -0777). Either way, each record spans several lines, which allows us to do regex matches against multiple lines. The BEGIN block is separate from the actual s/// command that does the search and replace; s/// is documented on the man perlop page.

I'm glad to see you don't just want spoonfeeding (all too common around here!).
|
|||||
|
Thanks. I'm going through the docs as we speak. Perl is _complicated_! That does not seem to be just my own opinion, either: Googling some examples leads me to lots of frustrated people! In any case, I probably should have posted the entire vCard file. Here it is, along with the results of the code: Code:
hardy2@hardy2-laptop:~/test$ cat test.vcf
BEGIN:VCARD
FN:First Last
N:Last;First;;;
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:First Line.\nThe Second Line i
s long so that it will wrap. Long\, long\, and wrapping!=\n\nThird Line.\n
UID:frh74xvYZ9
VERSION:2.1
END:VCARD
BEGIN:VCARD
FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=90=D7=90=D7=A4=D7=A8=D7=98=D
7=99 =D7=9E=D7=A9=D7=A4=D7=97=D7=94
N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=9E=D7=A9=D7=A4=D7=97=D7=94;=D
7=90=D7=90=D7=A4=D7=A8=D7=98=D7=99;;;
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A
8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=A9=D7=95=D7=A8=D7=94 =D7=A9=D7=A0=D7=
99=D7=94 =D7=94=D7=99=D7=90 =D7=\n=90=D7=A8=D7=95=D7=9B=D7=94\, =D7=9B=D7=9
3=D7=99 =D7=A9=D7=A0=D7=A8=D7=90=\n =D7=90=D7=95=D7=AA=D7=94 =D7=92=D7=95=D
7=9C=D7=A9=D7=AA. =D7=90=D7=A8=D7=\n=95=D7=9B=D7=94\, =D7=90=D7=A8=D7=95=D7
=9B=D7=94\, =D7=95=D7=92=D7=95=D7=9C=\n=D7=A9=D7=AA!\n=D7=A9=D7=95=D7=A8=D7
=94 =D7=A9=D7=9C=D7=99=D7=A9=D7=99=D7=AA.\n
UID:KqbQKbfBaF
VERSION:2.1
END:VCARD
hardy2@hardy2-laptop:~/test$ perl -pi.bak -e 'BEGIN { $/=""; } s/\n //gm' *.vcf
hardy2@hardy2-laptop:~/test$ cat test.vcf
BEGIN:VCARD
FN:First Last
N:Last;First;;;
s long so that it will wrap. Long\, long\, and wrapping!=\n\nThird Line.\ni
UID:frh74xvYZ9
VERSION:2.1
END:VCARD
BEGIN:VCARD
7=99 =D7=9E=D7=A9=D7=A4=D7=97=D7=94INTABLE:=D7=90=D7=90=D7=A4=D7=A8=D7=98=D
7=90=D7=90=D7=A4=D7=A8=D7=98=D7=99;;;ABLE:=D7=9E=D7=A9=D7=A4=D7=97=D7=94;=D
=94 =D7=A9=D7=9C=D7=99=D7=A9=D7=99=D7=AA.\nA9=D7=AA!\n=D7=A9=D7=95=D7=A8=D7
UID:KqbQKbfBaF
VERSION:2.1
END:VCARD
hardy2@hardy2-laptop:~/test$
As can be easily seen, the lines still wrap, and worse, critical parts of the file are destroyed. I have been playing around with the line of code, but it is slow going and I could really use a hand with this. I do appreciate your patience and willingness to teach a noob.
|
||||
|
I suspect there are some funny line terminators in this file. Can you post the output of cat -vet test.vcf?
I agree about perl: it looks pretty horrible and I was a very slow adopter, but its brevity, power and ubiquity make it difficult to live without. I generally use awk when I can, but perl is ideal for this problem due to its convenient handling of multi-line regex.
|
|||||
|
Code:
hardy2@hardy2-laptop:~$ cat -vet test.vcf
BEGIN:VCARD^M$
FN:First Last^M$
N:Last;First;;;^M$
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:First Line.\nThe Second Line i^M$
s long so that it will wrap. Long\, long\, and wrapping!=\n\nThird Line.\n^M$
UID:frh74xvYZ9^M$
VERSION:2.1^M$
END:VCARD^M$
^M$
BEGIN:VCARD^M$
FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=90=D7=90=D7=A4=D7=A8=D7=98=D^M$
7=99 =D7=9E=D7=A9=D7=A4=D7=97=D7=94^M$
N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=9E=D7=A9=D7=A4=D7=97=D7=94;=D^M$
7=90=D7=90=D7=A4=D7=A8=D7=98=D7=99;;;^M$
NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=D7=A9=D7=95=D7=A8=D7=94 =D7=A^M$
8=D7=90=D7=A9=D7=95=D7=A0=D7=94.\n=D7=A9=D7=95=D7=A8=D7=94 =D7=A9=D7=A0=D7=^M$
99=D7=94 =D7=94=D7=99=D7=90 =D7=\n=90=D7=A8=D7=95=D7=9B=D7=94\, =D7=9B=D7=9^M$
3=D7=99 =D7=A9=D7=A0=D7=A8=D7=90=\n =D7=90=D7=95=D7=AA=D7=94 =D7=92=D7=95=D^M$
7=9C=D7=A9=D7=AA. =D7=90=D7=A8=D7=\n=95=D7=9B=D7=94\, =D7=90=D7=A8=D7=95=D7^M$
=9B=D7=94\, =D7=95=D7=92=D7=95=D7=9C=\n=D7=A9=D7=AA!\n=D7=A9=D7=95=D7=A8=D7^M$
=94 =D7=A9=D7=9C=D7=99=D7=A9=D7=99=D7=AA.\n^M$
UID:KqbQKbfBaF^M$
VERSION:2.1^M$
END:VCARD^M$
^M$
hardy2@hardy2-laptop:~$