Good morning,
Novice scripter in Unix here, and I've run into a sed task I can't quite wrap my head around. I'm pulling my hair out fast enough as it is, so I thought I would go to the knowledge bank.
I have a sorted file that I'm trying to trim down by deleting any line whose first few characters are repeats of a previous line.
e.g.
1 ABCD
1 CDEF
1 EFGH
2 ACDE
2 GLKGI
2 KLIGH
.
.
.
10 ABSD
10 OIHIHN
10 OHOIN
.
.
.
XX LIHIN
XX OIHNM
XX OHINK
I need to delete any line whose header (the first three characters) is a repeat of a previous line's header. So for the lines above it would keep the first line that begins with "1 ", the first line that begins with "2 ", etc. The end result would be:
1 ABCD
2 ACDE
.
10 ABSD
.
XX LIHIN
The first three characters are always a whole number of no more than 2 digits followed by a space, but the maximum number changes (it could be anywhere between 11 and 40).
I suppose the other option would be to print out the first line that contains "1 ", the first line that contains "2 ", etc. and drop them into a new file.
I'm more familiar with sed, but using awk or something else would be fine too.
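To make the "print the first line for each header" idea concrete, here is a minimal sketch of how I imagine it might look in awk. It assumes the header is simply the first whitespace-separated field, and the function name first_per_header is made up for illustration, so treat it as a starting point rather than a finished answer.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not an existing tool).
# Keeps only the first line seen for each distinct first field.
# Assumes the header is the first whitespace-separated field, e.g. "1" or "10".
first_per_header() {
    # seen[$1]++ is 0 (false) the first time a header appears, so !seen[$1]++
    # is true only for that first line, and awk's default action prints it.
    awk '!seen[$1]++' "$@"
}

# Toy input mirroring the sample data in this post.
printf '%s\n' '1 ABCD' '1 CDEF' '1 EFGH' '2 ACDE' '2 GLKGI' '10 ABSD' '10 OIHIHN' |
    first_per_header
# prints:
# 1 ABCD
# 2 ACDE
# 10 ABSD
```

On a real file it would presumably be called as first_per_header infile > outfile (both file names are placeholders). Because it remembers every header it has seen, it should not matter whether the file is sorted.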
Thanks in advance!