Removing duplicated first field rows


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Removing duplicated first field rows
# 1  
Old 06-17-2014
Removing duplicated first field rows

Hello,
I am trying to eliminate rows where the first field is duplicated, leaving the row where the last field is "NET".


Data file:
Code:
345234|22.34|LST
546543|55.33|LST
793929|98.23|LST
793929|64.69|NET
149593|49.22|LST


Desired output:
Code:
345234|22.34|LST
546543|55.33|LST
793929|64.69|NET
149593|49.22|LST

Thanks very much!
# 2  
Old 06-18-2014
Code:
akshay@Aix:/tmp$ cat file
345234|22.34|LST
546543|55.33|LST
793929|98.23|LST
793929|64.69|NET
149593|49.22|LST

akshay@Aix:/tmp$ awk -F'|' 'FNR==NR{A[$1]=$0;next}($1 in A){print A[$1];delete A[$1]}' file file
345234|22.34|LST
546543|55.33|LST
793929|64.69|NET
149593|49.22|LST

# If order is not important
akshay@Aix:/tmp$ awk -F'|' '{A[$1]=$0}END{for(i in A)print A[i]}' file 
546543|55.33|LST
149593|49.22|LST
793929|64.69|NET
345234|22.34|LST

---------- Post updated at 09:34 AM ---------- Previous update was at 09:26 AM ----------

I didn't understand this "leaving the row where the last field is "NET". " can you explain ?
This User Gave Thanks to Akshay Hegde For This Post:
# 3  
Old 06-18-2014
Adapting Akshay Hegde's second proposal:
Code:
awk -F'|' '{if (substr (A[$1],length(A[$1])-2) != "NET") A[$1]=$0}END{for(i in A)print A[i]}' file

This User Gave Thanks to RudiC For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Removing top few and bottom few rows in a file

Hi, I have a requirement where I need to delete given number of top and bottom rows in a flat file which has new line as its delimiter. For ex: if top_rows=2 & bottom_rows=1 Then in a given file which looks like: New York DC LA London Tokyo Prague Paris Bombay Sydney... (7 Replies)
Discussion started by: calredd
7 Replies

2. UNIX for Dummies Questions & Answers

Removing rows that contain non-unique column entry

Background: I have a file of thousands of potential SSR primers from Batch Primer 3. I can't use primers that will contain the same sequence ID or sequence as another primer. I have some basic shell scripting skills, but not enough to handle this. What you need to know: I need to remove the... (1 Reply)
Discussion started by: msatseqs
1 Replies

3. Shell Programming and Scripting

Merge files and remove duplicated rows

In a folder I'll several times daily receive new files that I want to combine into one big file, without any duplicate rows. The file name in the folder will look like e.q: MissingData_2014-08-25_09-30-18.txt MissingData_2014-08-25_09-30-14.txt MissingData_2014-08-26_09-30-12.txt The content... (9 Replies)
Discussion started by: Bergans
9 Replies

4. Shell Programming and Scripting

Removing Duplicate Rows in a file

Hello I have a file with contents like this... Part1 Field2 Field3 Field4 (line1) Part2 Field2 Field3 Field4 (line2) Part3 Field2 Field3 Field4 (line3) Part1 Field2 Field3 Field4 (line4) Part4 Field2 Field3 Field4 (line5) Part5 Field2 Field3 Field4 (line6) Part2 Field2 Field3 Field4... (7 Replies)
Discussion started by: ekbaazigar
7 Replies

5. Shell Programming and Scripting

Removing rows and chars from text file

Dear community, maybe I'm asking the moon :rolleyes:, but I'm scratching my head to find a solution for it. :wall: I have a file called query.out (coming from Oracle query), the file is like this: ADDR TOTAL -------------------- ---------- TGROUPAGGR... (16 Replies)
Discussion started by: Lord Spectre
16 Replies

6. UNIX for Dummies Questions & Answers

Removing duplicated lines??

Hi Guys.. I have a problem for some reason my database has copied everything 4 times. My Database looks like this: >BAC233456 rhjieaheiohjteo tjtjrj6jkk6k6 j54ju54jh54jh >ANI124365 afrhtjykulilil htrjykuk rtkjryky ukrykyrk >BAC233456 rhjieaheiohjteo tjtjrj6jkk6k6 j54ju54jh54jh... (6 Replies)
Discussion started by: Iifa
6 Replies

7. Shell Programming and Scripting

Remove rows with first 4 fields duplicated in awk

Hi, I am trying to use awk to remove all rows where the first 4 fields are duplicates. e.g. in the following data lines 6-9 would be removed, leaving one copy of the duplicated row (row 5) Borgarhraun FH9822 ol24 FH9822_ol24_m20 ol Deformed c Borgarhraun FH9822 ol24 ... (3 Replies)
Discussion started by: tomahawk
3 Replies

8. Shell Programming and Scripting

Removing rows based on a another file

Hi, I am not sure if this has already been asked (I tried the search but the search was too broad). Basically I want to remove rows based on another file. So file1 looks like this (tab seperated): HHN 3 5 5 HUJ 2 2 1 JJJ 3 1 1 JUN 2 1 3 I have another file (file2)... (2 Replies)
Discussion started by: kylle345
2 Replies

9. Shell Programming and Scripting

Removing rows from a file

I have a file like below and want to use awk to solve this problem. The record separator is ">". I want to look at each record section enclosed within ">". Find the row with the 2nd and 3rd columns being 0, such as 10 0 0 I need to take the first number which in this case is 10. Then... (15 Replies)
Discussion started by: kristinu
15 Replies

10. Shell Programming and Scripting

Help removing lines with duplicated columns

Hi Guys... Please Could you help me with the following ? aaaa bbbb cccc sdsd aaaa bbbb cccc qwer as you can see, the 2 lines are matched in three fields... how can I delete this pupicate ? I mean to delete the second one if 3 fields were duplicated ? Thanks (14 Replies)
Discussion started by: yahyaaa
14 Replies
Login or Register to Ask a Question