Sponsored Content
Top Forums Shell Programming and Scripting Deleting lines containing duplicated strings Post 302969823 by jypark22 on Monday 28th of March 2016 11:29:43 PM
Old 03-29-2016
Deleting lines containing duplicated strings

Dear all,

I always appreciate your help.

I would like to delete lines containing duplicated strings in the second column.

test.txt
Code:
658	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
659	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]	0.825692
660	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[63]	0.825692
661	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
665	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[62]	0.825692
666	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
668	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
670	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
673	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
675	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
677	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
678	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.825692
679	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.8120
.
.
.

output.txt
Code:
658	invert_d2e_q_reg_0_/Qalu_ecl_zlow_e	0.825692
659	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[31]	0.825692
660	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[63]	0.825692
661	invert_d2e_q_reg_0_/Qalu_ecl_zhigh_e	0.825692
665	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[62]	0.825692
678	invert_d2e_q_reg_0_/Qalu_byp_rd_data_e[27]	0.825692
.
.
.

I know sed can delete lines with predefined specific strings, but in my cases, I could not expect the strings are duplicated. Also, duplicated strings will be more than 1000.


I used “uniq” to do this job, but this does not work.
uniq -u -f 4 test.txt
(-u prints unique lines. -f skips the first 4 letters. )

Is there any way to do this with sed/awk/perl? Or please correct my uniq semantics.

Best,

Jaeyoung
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

remove duplicated lines without sort

Hi Just wondering whether or not I can remove duplicated lines without sort For example, I use the command who, which shows users who are logging on. In some cases, it shows duplicated lines of users who are logging on more than one terminal. Normally, I would do who | cut -d" " -f1 |... (6 Replies)
Discussion started by: lalelle
6 Replies

2. Shell Programming and Scripting

Help removing lines with duplicated columns

Hi Guys... Please Could you help me with the following ? aaaa bbbb cccc sdsd aaaa bbbb cccc qwer as you can see, the 2 lines are matched in three fields... how can I delete this pupicate ? I mean to delete the second one if 3 fields were duplicated ? Thanks (14 Replies)
Discussion started by: yahyaaa
14 Replies

3. UNIX for Dummies Questions & Answers

duplicated lines not recognized by sort and uniq

Hello all, I've got a strange behaviour of sort and uniq commands: they do not recognise apparently duplicated lines in a file (already sorted). The lines are identical by eye, but they must differ in smth, because when they are put in two files, those have slightly different size. What can make... (8 Replies)
Discussion started by: roussine
8 Replies

4. Shell Programming and Scripting

awk to count duplicated lines

We have an input file as follows: 2010-09-15-12.41.15 2010-09-15-12.41.15 2010-09-15-12.41.24 2010-09-15-12.41.24 2010-09-15-12.41.24 2010-09-15-12.41.24 2010-09-15-12.41.25 2010-09-15-12.41.26 2010-09-15-12.41.26 2010-09-15-12.41.26 2010-09-15-12.41.26 2010-09-15-12.41.26... (3 Replies)
Discussion started by: ux4me
3 Replies

5. Shell Programming and Scripting

Delete lines in file containing duplicate strings, keeping longer strings

The question is not as simple as the title... I have a file, it looks like this <string name="string1">RZ-LED</string> <string name="string2">2.0</string> <string name="string2">Version 2.0</string> <string name="string3">BP</string> I would like to check for duplicate entries of... (11 Replies)
Discussion started by: raidzero
11 Replies

6. UNIX for Dummies Questions & Answers

Removing duplicated lines??

Hi Guys.. I have a problem for some reason my database has copied everything 4 times. My Database looks like this: >BAC233456 rhjieaheiohjteo tjtjrj6jkk6k6 j54ju54jh54jh >ANI124365 afrhtjykulilil htrjykuk rtkjryky ukrykyrk >BAC233456 rhjieaheiohjteo tjtjrj6jkk6k6 j54ju54jh54jh... (6 Replies)
Discussion started by: Iifa
6 Replies

7. Shell Programming and Scripting

awk to insert duplicated lines

Dear All, Suppose I have a file: 1 1 1 1 2 2 2 2 3 3 3 3I want to insert new line under each old line so that the file would become: 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3How can this be accomplished using awk (or sed)? (5 Replies)
Discussion started by: littlewenwen
5 Replies

8. Shell Programming and Scripting

How to remove duplicated lines?

Hi, if i have a file like this: Query=1 a a b c c c d Query=2 b b b c c e . . . (7 Replies)
Discussion started by: the_simpsons
7 Replies

9. Shell Programming and Scripting

Deleting duplicated chunks in a file using awk/sed

Hi all, I'd always appreciate all helps from this site. I would like to delete duplicated chunks of strings on the same row(?). One chunk is comprised of four lines such as: path name starting point ending point voltage number I would like to delete duplicated chunks on the same... (5 Replies)
Discussion started by: jypark22
5 Replies
UNIQ(1) 							   User Commands							   UNIQ(1)

NAME
uniq - report or omit repeated lines SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]] DESCRIPTION
Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output). With no options, matching lines are merged to the first occurrence. Mandatory arguments to long options are mandatory for short options too. -c, --count prefix lines by the number of occurrences -d, --repeated only print duplicate lines, one for each group -D print all duplicate lines --all-repeated[=METHOD] like -D, but allow separating groups with an empty line; METHOD={none(default),prepend,separate} -f, --skip-fields=N avoid comparing the first N fields --group[=METHOD] show all items, separating groups with an empty line; METHOD={separate(default),prepend,append,both} -i, --ignore-case ignore differences in case when comparing -s, --skip-chars=N avoid comparing the first N characters -u, --unique only print unique lines -z, --zero-terminated line delimiter is NUL, not newline -w, --check-chars=N compare no more than N characters in lines --help display this help and exit --version output version information and exit A field is a run of blanks (usually spaces and/or TABs), then non-blank characters. Fields are skipped before chars. Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'. Also, comparisons honor the rules specified by 'LC_COLLATE'. AUTHOR
Written by Richard M. Stallman and David MacKenzie. REPORTING BUGS
GNU coreutils online help: <http://www.gnu.org/software/coreutils/> Report uniq translation bugs to <http://translationproject.org/team/> COPYRIGHT
Copyright (C) 2017 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. SEE ALSO
comm(1), join(1), sort(1) Full documentation at: <http://www.gnu.org/software/coreutils/uniq> or available locally via: info '(coreutils) uniq invocation' GNU coreutils 8.28 January 2018 UNIQ(1)
All times are GMT -4. The time now is 09:09 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy