Grab from file with sed


 
# 8  
Old 01-19-2013
Try:
Code:
sed -n '/^=001/{s/.* //;h;}; /^=003/{s/.* //;H;}; /^=025/{s/.*\$//;H;}; /^=590.*\$aEn Curs/{g;s/\n/;/g;p;}' infile
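For readers new to sed, here is the same script spread over several lines with comments, as a sketch of how the hold-space approach works (comment lines inside the script assume GNU sed; the one-liner above is the version to use):
Code:
sed -n '
  # =001: keep only the text after the last space, overwrite the hold space (start of a new record)
  /^=001/ { s/.* //; h; }
  # =003: keep only the text after the last space, append it to the hold space
  /^=003/ { s/.* //; H; }
  # =025: keep only the text after the last "$", append it to the hold space
  /^=025/ { s/.*\$//; H; }
  # =590 containing $aEn Curs: copy the hold space back, join the collected fields with ";", print
  /^=590.*\$aEn Curs/ { g; s/\n/;/g; p; }
' infile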

# 9  
Old 01-19-2013
OK, great!
Thanks a lot, now I see the structure better.
I'll look for a tutorial on YouTube to understand it better.

Have a nice weekend!

PS: If you need some help with an Oracle database, just ask!
Cheers

---------- Post updated at 09:34 AM ---------- Previous update was at 06:00 AM ----------

Hello

Looking at the result more carefully, I see that field 310 is never present in the output file. Look at the example:

Script executed:
Code:
sed -n '/^=001/{s/.* //;h;}; /^=310/{s/.* //;H;}; /^=037/{s/.* //;H;}; /^=050/{s/.* //;H;}; /^=099/{s/.*\$//; H;}; /^=590.*\$aEn Curs/{g;s/\n/;/g;p;}' bib.mrk >seriadas.csv

result:
Code:
vtls000000101;\\$a1570$i090$j090$k03;\\$a1570$i090$j090$k03;\\$a951$i090$j090$k03

source file:
Code:
=001  vtls000000101
=037  \\$a1570$i090$j090$k03
=037  \\$a1570$i090$j090$k03
=037  \\$a951$i090$j090$k03
=590  \\$aEn Curs
=027  \\$aAnuari
=050  \\$aA-0189
=099  \\$aSALA-17.01(I)
=310  \\$aAnual

As I see it, in the result only field 037 has been passed through; the others have not...
The result should look like this:
Code:
vtls000000101;\\$aAnual;\\$a1570$i090$j090$k03-\\$a1570$i090$j090$k03-\\$a951$i090$j090$k03;\\$aA-0189;\\$aSALA-17.01(I)

Maybe just one little detail to adapt, no?
Thanks !
# 10  
Old 01-19-2013
That has to do with the changed order of the sample fields: in this record =310, =050 and =099 come after the =590 line that triggers the print, so they have not been collected yet when the record is written out. This gets a bit too complicated for sed; perhaps awk is a better choice:
Code:
awk '
  {
    i=$1                            # current field tag, e.g. "=037"
    sub(i " *",x)                   # strip the tag and the spaces after it (x is an empty variable)
    A[i]=A[i](A[i]?"-":x) $0        # collect the value; repeated tags are joined with "-"
  }
  i=="=310"{                        # when the current line is =310 (last field in the sample record)
    if (A["=590"]~/\$aEn Curs/) print A["=001"],A["=310"],A["=037"],A["=050"],A["=099"]
    for(i in A) delete A[i]         # reset the array for the next record
  }
' OFS=\; file

# 11  
Old 01-19-2013
OK,
I tried it and this is the result:

good
Code:
vtls000000013;\\$aAnual;\\$a1327$i090$j090$k03;\\$aG-0644;\\$aSALA-14.04(E)
vtls000000017;\\$aAnual;\\$a1465$i090$j090$k03-\\$a46$i090$j090$k03;\\$aG-0022;\\$aSALA-11.00(E)
vtls000000021;\\$aAnual;\\$a1196$i090$j090$k03-\\$a1196$i090$j090$k03;\\$aG-0541;\\$aDIPĂSIT
vtls000000028;\\$aAnual;\\$a1156$i090$j090$k03;\\$aG-0949;\\$aDIPĂSIT

and later you have this
not good
Code:
vtls000000167-vtls000000169-vtls000000171-vtls000000174-vtls000026748;\\$aAnual;\\$a1898$i090$j090$k03;\\$aB-0258-\\$aG-0781-\\$aGI-0144-\\$aGI-0160-\\$aA-0791;\\$aDIPĂSIT-\\$aDIPĂSIT-\\$aDIPĂSIT-\\$aDIPĂSIT-\\$aDIPĂSIT
vtls000000176-vtls000000179-vtls000021436-vtls000000202;\\$aAnual;\\$a832$i090$j090$k03-\\$a832$i090$j090$k03-\\$a832$i090$j090$k03;\\$aGI-0180-\\$aGI-0246-\\$aG-0180;\\$aDIPĂSIT-\\$aDIPĂSIT-\\$aSALA-17.01(E)

and then the file becomes good again.
Strange...
Maybe it's because the source file has some strange characters.

Thanks if you can take a look.
I'll post the complete source file, compressed, if you want.
# 12  
Old 01-19-2013
Your file does not appear to be standard UTF-8. There is a byte order mark and some extra strange characters, for example in some of the =099 lines.
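If you want to check for and remove the byte order mark from the shell, something like this should work (a sketch, assuming GNU sed for the hex escapes; bib_nobom.mrk is just a hypothetical output name):
Code:
# a UTF-8 byte order mark shows up as the three bytes "ef bb bf" at the start of the file
head -c 3 bib.mrk | od -A n -t x1

# strip a leading BOM, if present, before feeding the file to sed/awk
sed '1s/^\xef\xbb\xbf//' bib.mrk > bib_nobom.mrk

This only removes the BOM; the other strange characters would still need a separate conversion.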
# 13  
Old 01-20-2013
Mmmm,
so if I open the file with Notepad++ and do "conversion => UTF-8", will that be enough?
I'll try!

---------- Post updated at 12:22 PM ---------- Previous update was at 12:16 PM ----------

I tried this:

I removed the last ,A["=099"] from your awk script so that it is not evaluated,
but still the same result.

Code:
vtls000000094;\\$aAnual;\\$a951$i090$j090$k03;\\$aA-0245
vtls000000167-vtls000000169-vtls000000171-vtls000000174-vtls000026748;\\$aAnual;\\$a1898$i090$j090$k03;\\$aB-0258-\\$aG-0781-\\$aGI-0144-\\$aGI-0160-\\$aA-0791

Is the test valid?

---------- Post updated 01-20-13 at 05:02 AM ---------- Previous update was 01-19-13 at 12:22 PM ----------

Hello

OK, I found the problem, but I don't know how to fix it in the awk script.
The problem happens when one of the fields in the list does NOT exist in the bib.mrk file.
Example: =310
I can see that by adding =310 manually to the bib.mrk file, the record is created properly in the csv output file.
Can you modify the awk script to handle this possibility?
Some of the fields in the list may NOT be present.

Thanks
# 14  
Old 01-20-2013
Something like this?
Code:
awk '
  function pr(){                                                                         # define the print array elements function
    if (A["=590"]~/\$aEn Curs/) print A["=001"],A["=310"],A["=037"],A["=050"],A["=099"]  # Print the array elements
    for(i in A) delete A[i]                                                              # Delete the array elements
  }
  {
    i=$1                                                                                 # i becomes the index in field $1
    sub(i " *",x)                                                                        # delete the index and spaces following it from the line
    A[i]=A[i](A[i]?"-":x) $0                                                             # add  the line to the array element with index "i" and insert a "-" when there is already an entry present
  } 
  !NF{                                                                                   # if there is an empty line then
    pr()                                                                                 # print array elements
  }
  END{                                                                                   # if there are no more records
    pr()                                                                                 # print array elements  
  }
' OFS=\; bib.mrk                                                                         # set the Output Field Separator to ";"
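A small usage sketch, assuming the program between the single quotes above is saved in a file (hypothetical name seriadas.awk) and that records in bib.mrk are separated by blank lines, which is what the !NF trigger relies on. Because printing now happens on the blank line and at END, it no longer matters which of the optional fields are present in a record:
Code:
# run the program from a file; OFS can still be set on the command line
awk -f seriadas.awk OFS=';' bib.mrk > seriadas.csv

# quick check: should print 0 if records are no longer merged together (no "...-vtls..." ids)
grep -c -- '-vtls' seriadas.csv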

