Deleting lines containing duplicated strings


 
# 1  
Old 03-29-2016

Dear all,

I always appreciate your help.

I would like to delete every line whose second-column string has already appeared on an earlier line, so that only the first line for each string is kept.

test.txt
Code:
658	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
659	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]	0.825692
660	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[63]	0.825692
661	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
665	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[62]	0.825692
666	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
668	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
670	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
673	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
675	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
677	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
678	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.825692
679	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.8120
.
.
.

output.txt
Code:
658	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
659	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]	0.825692
660	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[63]	0.825692
661	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
665	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[62]	0.825692
678	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.825692
.
.
.

I know sed can delete lines that match a predefined string, but in my case I cannot predict which strings will be duplicated. There will also be more than 1000 duplicated strings.


I tried “uniq” for this job, but it does not work:
Code:
uniq -u -f 4 test.txt
(As I understand it, -u prints unique lines and -f 4 skips the first 4 letters.)

Is there any way to do this with sed/awk/perl? Or could you please correct my understanding of the uniq options?

Best,

Jaeyoung
# 2  
Old 03-29-2016
Try:
Code:
awk '!H[$2]++' infile
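
In case the one-liner looks cryptic, here is a minimal sketch of how it works, run on a few lines copied from test.txt. The array name seen, the temporary file, and the result variable are only for illustration; H in the command above plays the same role as seen. It assumes the columns are separated by tabs or spaces.

```shell
# Illustration only: a temporary file holding five lines taken from test.txt.
tmp=$(mktemp)
printf '%s\t%s\t%s\n' \
  658 'invert_d2e_q_reg_0_/Qalu_ecl_zlow_e' 0.825692 \
  659 'invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]' 0.825692 \
  666 'invert_d2e_q_reg_0_/Qalu_ecl_zlow_e' 0.825692 \
  678 'invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]' 0.825692 \
  679 'invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]' 0.8120 > "$tmp"

# seen[$2]++ yields the count *before* the increment: 0 (false) the first
# time a column-2 string shows up, non-zero on every later occurrence.
# Negating it makes the pattern true only for the first occurrence, and a
# true pattern without an action prints the line.
result=$(awk '!seen[$2]++' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Lines 666 and 679 are dropped because their second column was already seen. The input order is preserved and no sorting is needed, at the cost of keeping one array entry per distinct string in memory.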

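Or, using sort, you could sort on the 2nd field and keep one line from each run of equal keys, then re-sort the result into numeric order on field 1. GNU sort -u keeps the first line of each run; POSIX does not specify which one is kept.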

Code:
sort -u -k 2,2 infile | sort -n
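
A hedged sketch of the same pipeline on a few lines from test.txt. The explicit tab separator (-t) and LC_ALL=C are my own assumptions (tab-separated columns, plain byte-wise comparison) and are not part of the command above; the temporary file and variable names are only for illustration.

```shell
# Illustration only: a temporary file holding five lines taken from test.txt.
tmp=$(mktemp)
printf '%s\t%s\t%s\n' \
  658 'invert_d2e_q_reg_0_/Qalu_ecl_zlow_e' 0.825692 \
  659 'invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]' 0.825692 \
  666 'invert_d2e_q_reg_0_/Qalu_ecl_zlow_e' 0.825692 \
  678 'invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]' 0.825692 \
  679 'invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]' 0.8120 > "$tmp"

tab=$(printf '\t')   # a literal tab character to use as the field separator

# First sort: compare on field 2 only and keep one line per distinct key.
# Second sort: put the survivors back into numeric order on field 1.
result=$(LC_ALL=C sort -t "$tab" -u -k 2,2 "$tmp" | LC_ALL=C sort -n)
printf '%s\n' "$result"

rm -f "$tmp"
```

One line per distinct second-column string survives. If it matters that the surviving line is always the earliest one, the awk version is the safer choice, since which duplicate sort -u keeps is not guaranteed everywhere.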


Last edited by Chubler_XL; 03-29-2016 at 12:51 AM..
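
On the uniq question from post #1: as far as I know, -f N skips N blank-separated fields (skipping characters is -s N), -u prints only the lines that are not repeated at all, and uniq only ever compares adjacent lines. A small sketch with four lines from test.txt, assuming tab-separated columns, shows why it cannot give the wanted output here. The temporary file and variable names are only for illustration.

```shell
# Illustration only: four lines taken from test.txt.
tmp=$(mktemp)
printf '%s\t%s\t%s\n' \
  668 'invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e' 0.825692 \
  670 'invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e' 0.825692 \
  673 'invert_d2e_q_reg_0_/Qalu_ecl_zlow_e' 0.825692 \
  675 'invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e' 0.825692 > "$tmp"

# -f 1 skips the first field, so the comparison covers columns 2 and 3.
# 668 and 670 are adjacent and equal after field 1, so -u drops BOTH.
# 675 carries the same string but is not adjacent to them, so it is
# treated as unique and printed.
result=$(uniq -u -f 1 "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

This prints lines 673 and 675, whereas the wanted result would be 668 and 673. Because the comparison also includes the third column, 678 and 679 in test.txt would not count as duplicates for uniq either.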