Filter first column duplicates


 
# 1  
Old 11-05-2015

Dear All,
I would really appreciate your help or suggestions in resolving an issue.
Briefly, I have a file like this:

Code:
a b c
a d e 
f g h
k g h 
x y z

If the same ID appears more than once in the first column, for example
Code:
a

, just remove it.
The output should be this:

Code:
f g h 
k g h 
x y z

I was thinking of doing it with
Code:
uniq

on column 1 and then running
Code:
grep

on the results.
It basically works, but if you have another solution I'll be grateful.

Best

Giuliano
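
For reference, the uniq-then-grep idea described above could look roughly like the sketch below. It is only one reading of that description, and it assumes space-separated columns, IDs that contain no regex metacharacters, and at least two fields on every line. The temporary file names are illustrative.

```shell
# Sample data from the post above; file names are illustrative.
input=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'a b c' 'a d e' 'f g h' 'k g h' 'x y z' > "$input"

# Collect IDs that occur more than once in column 1 and turn each
# into an anchored pattern such as "^a " (ID followed by a space).
cut -d' ' -f1 "$input" | sort | uniq -d | sed 's/.*/^& /' > "$patterns"

# Drop every line that matches one of those patterns.
result=$(grep -v -f "$patterns" "$input")

printf '%s\n' "$result"
rm -f "$input" "$patterns"
```

The anchoring step matters: without the leading `^` and the trailing space, an ID such as `a` would also knock out lines that merely contain the letter somewhere else.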
# 2  
Old 11-05-2015
Hello giuliangiuseppe,

Could you please try the following and let me know if it helps.
1st: If you don't care about the order of the lines in your Input_file:
Code:
awk '{A[$1]=$0} END{for(i in A){print A[i]}}'  Input_file

2nd: If you need the output in the same order as Input_file, then the following may help.
Code:
awk 'FNR==NR{A[$1]=$0;next} ($1 in A){print A[$1];delete A[$1]}'  Input_file  Input_file

Thanks,
R. Singh
# 3  
Old 11-05-2015
Hi there,
thank you for the reply and suggestion.
Well, maybe my explanation was awful (probably!), but I'd like to completely remove the duplicate IDs in column 1 along with their rows. If an ID in column 1 appears more than once, remove all the lines with that ID.

best

G
# 4  
Old 11-05-2015
Hello giuliangiuseppe,

Could you please try this and let me know if this helps you.
Code:
awk 'FNR==NR{A[$1]=$0;B[$1]++;next} {if(B[$1]==1){print $0}}'  Input_file  Input_file

Output will be as follows.
Code:
f g h
k g h
x y z

Thanks,
R. Singh
# 5  
Old 11-05-2015
perfect!
# 6  
Old 11-05-2015
This could be shortened to:

Code:
awk 'NR==FNR{A[$1]++; next} A[$1]==1' file file

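
For anyone learning awk, here is the same one-liner spelled out with comments and run against the sample data from the first post. The array name `count` and the temporary file are illustrative. The logic is unchanged: the file is given twice, the first read counts each ID, and the second read prints only the lines whose ID was counted once.

```shell
# Sample data from the first post (temporary file name is illustrative).
input=$(mktemp)
printf '%s\n' 'a b c' 'a d e' 'f g h' 'k g h' 'x y z' > "$input"

result=$(awk '
    # Pass 1: NR == FNR is true only while reading the first file argument.
    # Count how often each column-1 value occurs, then skip to the next line.
    NR == FNR { count[$1]++; next }

    # Pass 2: a bare condition prints the line when it is true,
    # so only lines whose ID occurred exactly once are printed.
    count[$1] == 1
' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With the sample data this prints the three lines starting with `f`, `k` and `x`.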
# 7  
Old 11-05-2015
Hi Scrutinizer, just a question because I am learning awk.
Does your command assume that the input file is sorted by column 1?
Or not?

thank you
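
A quick way to check this is to run the command on input where the duplicate ID is not on adjacent lines. The sketch below does that with a shuffled copy of the sample data (the file name and variable names are illustrative). Because the first pass counts every ID before anything is printed, sorting should not be needed. The second command is a single-pass variant, not from the thread, that stores the lines in memory so it can also read from a pipe, which the two-pass form cannot do.

```shell
# Shuffled sample: the two "a" lines are no longer adjacent.
input=$(mktemp)
printf '%s\n' 'a b c' 'f g h' 'x y z' 'a d e' 'k g h' > "$input"

# Two-pass form from post #6: needs a real file, since it is read twice.
two_pass=$(awk 'NR==FNR{A[$1]++; next} A[$1]==1' "$input" "$input")

# Single-pass alternative (an assumption, not the thread method):
# remember every line and its key, then print the unique ones at the end.
one_pass=$(awk '
    { count[$1]++; key[NR] = $1; line[NR] = $0 }
    END {
        for (i = 1; i <= NR; i++)
            if (count[key[i]] == 1) print line[i]
    }
' < "$input")

printf '%s\n' "$two_pass"
rm -f "$input"
```

Both variants keep the surviving lines in their original input order. The single-pass form trades memory for convenience, so the two-pass form is likely the better fit for very large files.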