Deleting lines containing duplicated strings Post: 302969823

Post 302969823 by jypark22 on Monday 28th of March 2016 11:29:43 PM

03-29-2016

Deleting lines containing duplicated strings

Dear all,

I always appreciate your help.

I would like to delete lines containing duplicated strings in the second column.

test.txt

Code:

658	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
659	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]	0.825692
660	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[63]	0.825692
661	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
665	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[62]	0.825692
666	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
668	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
670	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
673	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
675	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
677	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
678	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.825692
679	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.8120
.
.
.

output.txt

Code:

658	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
659	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]	0.825692
660	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[63]	0.825692
661	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
665	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[62]	0.825692
678	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.825692
.
.
.

I know sed can delete lines that match specific, predefined strings, but in my case I cannot know in advance which strings are duplicated. Also, there will be more than 1000 duplicated strings.

I used "uniq" to do this job, but it does not work:
uniq -u -f 4 test.txt
(As I understand it, -u prints unique lines and -f skips the first 4 letters.)

Is there a way to do this with sed, awk, or perl? Alternatively, please correct my use of uniq.

Best,

Jaeyoung
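
One possible approach, offered as a sketch rather than a definitive answer: the sample output.txt appears to keep the first line for each distinct value in the second column and drop later lines that repeat it. Assuming that is the goal, and that columns are separated by whitespace (tabs here), awk can track the values it has already seen. The function name keep_first_by_field2 is hypothetical and only used for illustration; the sample lines are copied from test.txt.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not from the post):
# print a line only the first time its second field is encountered.
# seen[$2]++ evaluates to 0 on the first encounter, so !seen[$2]++ is true.
keep_first_by_field2() {
    awk '!seen[$2]++' "$@"
}

# Four lines taken from test.txt; 666 repeats the second column of 658,
# and 679 repeats the second column of 678 with a different third column.
printf '658\tinvert_d2e_q_reg_0_/Qalu_ecl_zlow_e\t0.825692\n666\tinvert_d2e_q_reg_0_/Qalu_ecl_zlow_e\t0.825692\n678\tinvert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]\t0.825692\n679\tinvert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]\t0.8120\n' |
    keep_first_by_field2
```

On a real file this would be called as keep_first_by_field2 test.txt > output.txt. The input does not need to be sorted, and the original line order is preserved. awk keeps every distinct second-column value in memory, which should be fine for a few thousand strings.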

UNIQ(1) 							   User Commands							   UNIQ(1)

NAME

       uniq - report or omit repeated lines

SYNOPSIS

       uniq [OPTION]... [INPUT [OUTPUT]]

DESCRIPTION

       Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).

       With no options, matching lines are merged to the first occurrence.

       Mandatory arguments to long options are mandatory for short options too.

       -c, --count
	      prefix lines by the number of occurrences

       -d, --repeated
	      only print duplicate lines, one for each group

       -D     print all duplicate lines

       --all-repeated[=METHOD]
	      like -D, but allow separating groups with an empty line; METHOD={none(default),prepend,separate}

       -f, --skip-fields=N
	      avoid comparing the first N fields

       --group[=METHOD]
	      show all items, separating groups with an empty line; METHOD={separate(default),prepend,append,both}

       -i, --ignore-case
	      ignore differences in case when comparing

       -s, --skip-chars=N
	      avoid comparing the first N characters

       -u, --unique
	      only print unique lines

       -z, --zero-terminated
	      line delimiter is NUL, not newline

       -w, --check-chars=N
	      compare no more than N characters in lines

       --help display this help and exit

       --version
	      output version information and exit

       A field is a run of blanks (usually spaces and/or TABs), then non-blank characters.  Fields are skipped before chars.
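
       As a small illustration of the difference between -f (fields) and -s (characters), here is a sketch using two lines from the
       post's test.txt that differ only in their first field. The variable names are made up for the example.

```shell
#!/bin/sh
# Two lines from test.txt: same second and third columns, different first column.
sample=$(printf '658\tinvert_d2e_q_reg_0_/Qalu_ecl_zlow_e\t0.825692\n666\tinvert_d2e_q_reg_0_/Qalu_ecl_zlow_e\t0.825692\n')

# -f 1 skips the whole first field, so the rest of each line compares equal
# and the two adjacent lines are merged into the first one.
skip_field=$(printf '%s\n' "$sample" | uniq -f 1)

# -s 1 skips only the first character ("6"), so "58..." and "66..." still differ
# and both lines are kept.
skip_char=$(printf '%s\n' "$sample" | uniq -s 1)

printf 'uniq -f 1:\n%s\n' "$skip_field"
printf 'uniq -s 1:\n%s\n' "$skip_char"
```

       Note that after the skipped fields, uniq compares everything that remains on the line, so with -f 1 a differing third column
       (as in lines 678 and 679 of test.txt) would still make two lines count as different.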

       Note: 'uniq' does not detect repeated lines unless they are adjacent.  You may want to sort the input first, or use 'sort -u' without
       'uniq'.  Also, comparisons honor the rules specified by 'LC_COLLATE'.
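
       The adjacency rule can be seen in a short sketch. It uses the second-column values of lines 658, 659 and 666 from the post's
       test.txt, where the repeated value is not adjacent to its first occurrence. The variable names are illustrative only.

```shell
#!/bin/sh
# Second-column values of lines 658, 659 and 666: the repeat is separated
# from its first occurrence by another line.
col2='invert_d2e_q_reg_0_/Qalu_ecl_zlow_e
invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]
invert_d2e_q_reg_0_/Qalu_ecl_zlow_e'

# Without sorting, no two adjacent lines match, so all three lines survive.
unsorted=$(printf '%s\n' "$col2" | uniq)

# After sorting, the repeats are adjacent and are merged to one line each.
sorted=$(printf '%s\n' "$col2" | sort | uniq)

# -u drops every line that has a repeat, including its first occurrence.
only_unique=$(printf '%s\n' "$col2" | sort | uniq -u)

printf 'uniq alone:\n%s\n' "$unsorted"
printf 'sort | uniq:\n%s\n' "$sorted"
printf 'sort | uniq -u:\n%s\n' "$only_unique"
```

       Because -u removes all copies of a repeated line, it is probably not what is wanted when the first occurrence should be kept,
       as in the post's output.txt.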

AUTHOR

       Written by Richard M. Stallman and David MacKenzie.

REPORTING BUGS

       GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
       Report uniq translation bugs to <http://translationproject.org/team/>

COPYRIGHT

       Copyright (C) 2017 Free Software Foundation, Inc.  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
       This is free software: you are free to change and redistribute it.  There is NO WARRANTY, to the extent permitted by law.

SEE ALSO

       comm(1), join(1), sort(1)

       Full documentation at: <http://www.gnu.org/software/coreutils/uniq>
       or available locally via: info '(coreutils) uniq invocation'

GNU coreutils 8.28						   January 2018 							   UNIQ(1)