sed to remove

05-17-2010

Yes, but then you would have to test for space or tab again:

Code:

sed '/^[0-9]\{9\}[\t ]650/s/[034]$//'
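A portability note on the command above, offered as a sketch: `\t` inside a bracket expression is understood by GNU sed, but other sed implementations may read it as a literal backslash and `t`. The POSIX class `[[:blank:]]` covers both space and tab. The function name and the sample lines below are hypothetical.

```shell
# Portable variant: [[:blank:]] matches a space or a tab in any POSIX sed.
strip_trailing_digit() {
    sed '/^[0-9]\{9\}[[:blank:]]650/s/[034]$//'
}

# Hypothetical sample lines: the 090 line is printed unchanged,
# the 650 line loses its trailing 0 and ends in ";400".
printf '%s\n' \
    '000000001 090   L $$aINFORMES JEN' \
    '000000001 650   L $$a59000;4000' | strip_trailing_digit
```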

Scrutinizer

05-17-2010

I guess there are many LDR/BK sections; try this rule in your code:

Code:

awk '$2=="LDR" {a=1} ($2=="650" && a==1) {sub(/[034]$/,"")} $4=="BK" {a=0} 1' urfile
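The same flag idea spelled out, as a minimal sketch: the flag is tested (not assigned) in the 650 rule, so a 650 line is only edited between an LDR line and a BK line. The function name and the variable name `in_record` are illustrative, not from the thread.

```shell
strip_in_record() {
    awk '
        $2 == "LDR"              { in_record = 1 }       # a record starts
        $2 == "650" && in_record { sub(/[034]$/, "") }   # edit only inside a record
        $4 == "BK"               { in_record = 0 }       # the record ends
        { print }                                        # print every line
    '
}
```

Usage would be `strip_in_record < urfile > newfile`.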

rdcwayx

05-18-2010

Hi,
Very good. Thanks a lot, this time I have the solution!

Cheers

---------- Post updated at 05:17 AM ---------- Previous update was at 12:04 AM ----------

Hello again,

Another question about this file.
In some cases I have this:

Code:

000000008 650   L $$aGeneral y miscelánea;4001
000000008 650   L $$aGeneral and miscellaneus;4001

Is it possible to replace the last number with a string? The objective is to get this:

Code:

000000008 650   L $$aGeneral y miscelánea;Quimica
000000008 650   L $$aGeneral and miscellaneus;Chemistry

Thanks if you have any ideas.
Cheers

ldiaz2106

05-18-2010

I am not sure how to distinguish between the first and second line in your example - they differ only in the content of the penultimate field ("General y miscelánea" vs. "General and miscellaneus").

Based on Scrutinizer's solution, and assuming these two are just examples and you do not have to differentiate between them:

Code:

sed '/^[0-9]\{9\}[\t ]650/s/;[0-9]*$/;Quimica/'

This will change the "4001" in your example to "Quimica" in all lines having "650" in the second field.

I hope this helps.

bakunin

05-18-2010

Hi,

The only difference between the two records is that the first is in Spanish and the second is in English.
This is a library catalogue with references to books, so some books exist in several languages.

So if we take my example:

Code:

000000008 650   L $$aGeneral y miscelánea;4001
000000008 650   L $$aGeneral and miscellaneus;4001

Another example (a different record):

Code:

000000038 650   L $$aQuí*mica1; 4007
000000038 650   L $$aChemistry1; 4007

So I need to replace every code after the semicolon.
I have the corresponding string for each code.

---------- Post updated at 06:29 AM ---------- Previous update was at 06:14 AM ----------

Hi,
sorry, to be clearer, maybe I should explain the issue from the beginning...

The file is a list of bibliographic records like this:

Code:

000000001 LDR   L ^^^^^nam^^2200325Iia^45e0
000000001 022   L $$a0081-3397
000000001 041   L $$aspa
000000001 088   L $$aJ.E.N. 551
000000001 090   L $$aINFORMES JEN
000000001 650   L $$a59000;4001
000000001 FMT   L BK

As you can see, LDR and FMT mark the start and end of each record, and there is a field 650 with some values.
The objective is to replace each value in each record with its corresponding string (such as Quimica, etc.) in Spanish and in English.
Some of these records have a field 650 with only one value, but others have two values.

The result for the example would be:

Code:

000000001 LDR   L ^^^^^nam^^2200325Iia^45e0
000000001 022   L $$a0081-3397
000000001 041   L $$aspa
000000001 088   L $$aJ.E.N. 551
000000001 090   L $$aINFORMES JEN
000000001 650   L $$aFraseUno;FraseDos
000000001 650   L $$aStringOne;StringTwo
000000001 FMT   L BK

This code is ok if you have only one value:

Code:

sed -i '/[0-9]\{9\} 650   L $$a55[0-9]\{4\}/{G;s/^\(.* $$a\)[0-9]\{5\}\(.*\)\(\n\)$/\1Ciencias biomédicas, Estudios básicos\2\3\1Biomedical sciences, Basic studies\2/g}' file.dat

But if there are two values, it does not work.
Thanks in advance

ldiaz2106

05-18-2010

Understood. I noticed the difference, but wasn't sure whether it was the distinguishing criterion. In your case, I presume, there is nothing left but to replace the relevant parts of each line ad hoc:

Code:

sed '/^[0-9]\{9\}[\t ]650/ {
                   s/\(General y miscelánea;\)[0-9]*$/\1Quimica/
                   s/\(General and miscellaneus;\)[0-9]*$/\1Chemistry/
                   s/\(Quí\*mica1;\)[0-9]*$/\1<whatever-your-replacement-is>/
                   ...
                   }' /path/to/file > /path/to/outputfile

Some caveats: the asterisk in "Quí\*mica1" probably comes from a code-page mismatch, but generally speaking some characters have a special meaning to sed (the asterisk is among them). To match such a character literally you have to precede it with a backslash as an escape character: to match an asterisk, search for "\*", not "*", as I did above.

In your second example there is whitespace between the last semicolon and the number following it ("; 4007" instead of ";4007"). If this isn't a typo you will have to change all the matching patterns accordingly. Notice the added space in the bracket expression ("[ 0-9]" instead of "[0-9]"):

Code:

s/\(General and miscellaneus;\)[ 0-9]*$/\1Chemistry/
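Put together, the grouped command with the space-tolerant pattern might look like the sketch below. The function name `relabel` is hypothetical, only two of the replacements are shown, and `[[:blank:]]` is used in the address as a portable stand-in for "space or tab".

```shell
relabel() {
    sed '/^[0-9]\{9\}[[:blank:]]650/ {
        s/\(General y miscelánea;\)[ 0-9]*$/\1Quimica/
        s/\(General and miscellaneus;\)[ 0-9]*$/\1Chemistry/
    }'
}
```

Called as `relabel < /path/to/file > /path/to/outputfile`, it handles both ";4001" and "; 4001".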

I hope this helps.

bakunin

05-19-2010

Hi,
OK, so imagine I have a big list of correspondences. In my example it was:
000000001 650 L $$a59000;4001

But there are many more codes...
000000001 650 L $$a01000;30000
etc.

In reality, only the first two digits are relevant. Look at this list:
01 => stringOne/filaUno
02 => stringTwo/filaDos
30 => ...
40 => ...
So if I have 59000 in field 650, I must take the correspondence for 59.

So in the sed command I must say that, depending on the first two digits, one correspondence or another applies...
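A lookup keyed on the first two digits may be easier to express in awk than in sed. The following is only a sketch under assumptions: the mapping lives in a separate tab-separated file (prefix, Spanish string, English string), which is a hypothetical format, and the function name `translate_650` is made up. Codes with no entry in the mapping are left as they are.

```shell
translate_650() {
    # $1: mapping file, hypothetical format: prefix<TAB>Spanish<TAB>English
    # $2: catalogue file
    awk '
        # First file: load the mapping into two arrays keyed by the prefix.
        NR == FNR { split($0, m, "\t"); es[m[1]] = m[2]; en[m[1]] = m[3]; next }

        # Second file: pass every non-650 line through unchanged.
        $2 != "650" { print; next }

        {
            start = index($0, "$$a")
            if (start == 0) { print; next }
            head = substr($0, 1, start + 2)          # up to and including $$a
            n = split(substr($0, start + 3), code, /; */)
            es_line = en_line = head
            for (i = 1; i <= n; i++) {
                key = substr(code[i], 1, 2)          # only two digits matter
                sep = (i > 1) ? ";" : ""
                es_line = es_line sep ((key in es) ? es[key] : code[i])
                en_line = en_line sep ((key in en) ? en[key] : code[i])
            }
            print es_line                            # Spanish 650 line
            print en_line                            # English 650 line
        }
    ' "$1" "$2"
}
```

With a mapping file containing entries for 59 and 40, `translate_650 map.tsv file.dat` would turn the single line `$$a59000;4001` into one Spanish and one English 650 line.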

---------- Post updated at 06:45 AM ---------- Previous update was at 06:44 AM ----------

Hi,
and the * is a copy-paste issue, I think, because the real word is:
Química

not Quí\*mica1

---------- Post updated 05-19-10 at 05:27 AM ---------- Previous update was 05-18-10 at 06:45 AM ----------

Hello,
forget the last post, I solved this another way.
Now I just need to remove lines like this:

Code:

000000283 008   L 100211s9999\\\\xx\\\\\\\\\\\000\0\spa\d
000000284 008   L 100211s9999\\\\xx\\\\\\m\\\\\000\0\spa\d
000000285 008   L 100211s9999\\\\xx\\\\\\\\\\\000\0\eng\d
000000286 008   L 100211s9999\\\\xx\\\\\\m\\\\\000\0\fre\d

The common point is this:

Code:

'/[0-9]\{9\} 008   L'

The rest can differ.
Is there any way to tell sed to remove all lines starting with this pattern?

Thanks

ldiaz2106
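For the question above, sed's `d` command deletes every line that matches an address, so one possible sketch is the function below. The name `drop_008` is hypothetical, the pattern is the poster's own with a leading `^` added to anchor it at the start of the line, and it assumes exactly three spaces after `008`, as in the samples.

```shell
drop_008() {
    # Delete every line that starts with nine digits, a space, and "008   L".
    sed '/^[0-9]\{9\} 008   L/d'
}
```

It could be used as `drop_008 < file.dat > newfile.dat`; all other lines pass through unchanged.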
