The UNIX and Linux Forums  

04-27-2007
selkirk
Registered User
  
 

Join Date: Apr 2007
Posts: 4
Delete multiple lines containing a variable string using sed

Good morning,
Novice Unix scripter here, and I've run into a sed task I can't quite wrap my head around. I'm pulling my hair out fast enough as it is, so I thought I would go to the knowledge bank.

I have a sorted file that I'm trying to trim down by deleting any line whose first few characters repeat those of a previous line.
For example:

1 ABCD
1 CDEF
1 EFGH
2 ACDE
2 GLKGI
2 KLIGH
.
.
.
10 ABSD
10 OIHIHN
10 OHOIN
.
.
.
XX LIHIN
XX OIHNM
XX OHINK

I need to delete any line whose header (the leading number and the space after it, at most three characters) repeats that of an earlier line. For the lines above, that means keeping the first line that begins with "1 ", the first line that begins with "2 ", and so on. The end result would be:

1 ABCD
2 ACDE
.
10 ABSD
.
XX LIHIN

The header is always a whole number of no more than 2 digits, followed by a space, but the maximum number changes (it could be anywhere between 11 and 40).
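
Read as a sed script, those constraints might look something like the sketch below. It is only one possible approach, not a definitive answer. Because the file is sorted, repeated headers are adjacent, so each kept line only has to be compared with the line that follows it. The function name keep_first_sed is made up for illustration, and the pattern assumes the header is a run of digits followed by a single space.

```shell
# keep_first_sed (hypothetical name): keep the first line of each run of
# adjacent lines that share the same leading "<digits><space>" header.
# Assumes sorted input, so lines with equal headers sit next to each other.
keep_first_sed() {
    sed -e ':a' \
        -e '$!N' \
        -e 's/^\([0-9]* \)\(.*\)\n\1.*/\1\2/' \
        -e 'ta' \
        -e 'P' \
        -e 'D' "$@"
}
# :a      label marking the top of the loop
# $!N     unless on the last line, append the next input line
# s/.../  if the appended line starts with the same header, drop it
# ta      a line was dropped, so loop and compare against the next one
# P       print the kept line (everything up to the first newline)
# D       delete the printed line and restart with whatever remains
```

The function reads from the named files, or from standard input when no file is given.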

I suppose the other option would be to print the first line that begins with "1 ", the first line that begins with "2 ", etc., and drop them into a new file.
I'm more familiar with sed, but awk or something else would be fine too.
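
For the awk route, a minimal sketch of the "print the first line for each header" idea, assuming the header is the first whitespace-separated field of each line. The names keep_first_awk and seen are illustrative, not taken from any existing script.

```shell
# keep_first_awk (hypothetical name): print a line only the first time
# its first field appears. seen[$1]++ evaluates to 0 (false) on the first
# occurrence of a header, so the negated expression is true exactly once
# per header and awk prints that line by default.
keep_first_awk() {
    awk '!seen[$1]++' "$@"
}
```

Unlike an approach that compares neighbouring lines, this one does not depend on the input being sorted. Redirecting the output, for example `keep_first_awk sorted.txt > trimmed.txt` with placeholder file names, would drop the kept lines into a new file.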

Thanks in advance!