Remove duplicate line detail based on column one data


 
# 1  
Old 01-06-2010

My input file:
Code:
AVI.out     <detail>named as the RRM .</detail>
AVI.out     <detail>Contains 1 RRM .</detail>

AR0.out     <detail>named as the tellurite-resistance.</detail>

AWG.out     <detail>Contains 2 HTH .</detail>

ADV.out     <detail>named as the DENR family.</detail>
ADV.out     <detail>Contains 1 SUI1 domain.</detail>

AH7.out     <detail>Contains 1 box .</detail>
AH7.out     <detail>Contains 11 rich.</detail>

AZM.out     <detail>named as the family.</detail>

My desired output file:
Code:
AVI.out     <detail>named as the RRM .</detail>

AR0.out     <detail>named as the tellurite-resistance.</detail>

AWG.out     <detail>Contains 2 HTH .</detail>

ADV.out     <detail>named as the DENR family.</detail>

AH7.out     <detail>Contains 1 box .</detail>

AZM.out     <detail>named as the family.</detail>

For each value in column one, I would like to keep only the first line and remove any later lines that have the same column-one value.
Thanks for any advice.
# 2  
Old 01-06-2010
Try this:
Code:
awk 'n!=$1{print;n=$1}' infile
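A rough sketch of how the one-liner works: `n` holds the column-one value of the last printed line, and a line is printed only when its first field differs from `n`. This relies on duplicates being adjacent, as they are in the sample input. The file name `sample.txt` and the variable names `prev` and `out` are made up for illustration.

```shell
# Build a small sample (hypothetical file name) with adjacent duplicates.
cat > sample.txt <<'EOF'
AVI.out     <detail>named as the RRM .</detail>
AVI.out     <detail>Contains 1 RRM .</detail>

AR0.out     <detail>named as the tellurite-resistance.</detail>
EOF

# Same logic as the one-liner, with a longer variable name:
# print a line only when column one differs from the previous printed key.
out=$(awk '$1 != prev { print; prev = $1 }' sample.txt)
printf '%s\n' "$out"
```

The blank line is printed too, because its (empty) first field differs from the previous key.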

regards
# 3  
Old 01-06-2010
Thanks, gaurav1086.
Your awk code works perfectly for my problem ^^
Thanks again.
# 4  
Old 01-06-2010
Hello,

You are most welcome.

Regards.
# 5  
Old 01-06-2010
Try:

Code:
awk '!NF || ++A[$1]==1'  < file
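As a sketch of what this does: `!NF` is true for empty lines, so they are always printed; for any other line, `++A[$1]` counts how often the column-one value has been seen, and the line is printed only on the first sighting. Unlike a previous-line comparison, this should also catch duplicates that are not adjacent. The sample file `scattered.txt` and the names `seen` and `out_seen` are assumptions for the example.

```shell
# Hypothetical sample in which the duplicate key is NOT adjacent.
cat > scattered.txt <<'EOF'
AVI.out     first

AR0.out     first
AVI.out     second
EOF

# Empty lines (NF == 0) always pass; other lines pass only the first
# time their column-one value is seen.
out_seen=$(awk '!NF || ++seen[$1] == 1' scattered.txt)
printf '%s\n' "$out_seen"
```

The trade-off is memory: the array keeps one entry per distinct column-one value, which matters only for very large inputs.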

# 6  
Old 01-06-2010
Thanks, dennis.
Your awk code is another way to solve my problem as well ^^
Thanks a lot.
# 7  
Old 01-06-2010
Quote:
Originally Posted by gaurav1086
try this,
Code:
awk 'n!=$1{print;n=$1}' infile

regards
This one is better, because it also collapses consecutive empty lines into one.
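A quick way to see the difference between the two commands in this thread is to feed both an input with two consecutive empty lines and count the output lines. The file name `blanks.txt` and the variable names are made up for this check.

```shell
# Hypothetical input: a duplicate key followed by two empty lines.
printf 'A 1\nA 2\n\n\nB 1\n' > blanks.txt

# Previous-line comparison: the second empty line equals the first, so it is dropped.
first_count=$(awk 'n!=$1{print;n=$1}' blanks.txt | wc -l | tr -d ' ')

# Seen-array version: every empty line passes through the !NF test.
second_count=$(awk '!NF || ++A[$1]==1' blanks.txt | wc -l | tr -d ' ')

echo "$first_count $second_count"   # prints: 3 4
```

Note that the previous-line comparison only removes duplicates that sit next to each other, so which command is preferable depends on whether the input is grouped by column one.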