How to delete a duplicate line and original with sed.


 
# 1  
Old 06-14-2011
How to delete a duplicate line and original with sed.

I am completely new to shell scripting but have been assigned the task of creating several batch files to manipulate data. My final task requires me to find lines that have duplicates and then delete not only the duplicate but the original as well. The script will be used in a Windows environment, so I am using GNU sed. Below is a sample of the data:
Code:
180222,1,7.3,1Z0E947E0353634,9.49,UPAC
180223,1,7.3,1Z0E947E0373254,9.49,UPAC
180224,1,7.3,1Z0E947E0371556,8.33,UPAC
180222,1,7.3,1Z0E947E0353634,9.49,UPAC

In this example the first and last lines are duplicates, and I would like to delete them both. I have been searching for several days and have not been able to figure out how to achieve this. Unfortunately I am short on time and would greatly appreciate any help. Thanks.

Last edited by Franklin52; 06-20-2011 at 03:32 AM. Reason: Please use code tags
# 2  
Old 06-15-2011
You need awk for this:
Code:
awk 'NR==FNR{a[$0]++;next} a[$0]<2' infile infile
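
For illustration, here is a minimal sketch of that one-liner run against the sample data from the first post. The temporary file and the names `infile`, `count`, and `result` are assumptions made for this example, not part of the original answer. The same file is passed twice: the first pass counts every line, the second pass prints only the lines that occurred once.

```shell
# Build a sample input file from the data in the first post
# (the temporary file is hypothetical, used only for this demo).
infile=$(mktemp)
printf '%s\n' \
  '180222,1,7.3,1Z0E947E0353634,9.49,UPAC' \
  '180223,1,7.3,1Z0E947E0373254,9.49,UPAC' \
  '180224,1,7.3,1Z0E947E0371556,8.33,UPAC' \
  '180222,1,7.3,1Z0E947E0353634,9.49,UPAC' > "$infile"

# Pass 1 (NR==FNR is true only while reading the first file argument):
#   count how often each whole line occurs, then skip to the next line.
# Pass 2 (same file again): print lines whose count is below 2.
result=$(awk 'NR==FNR { count[$0]++; next } count[$0] < 2' "$infile" "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

With this input, only the 180223 and 180224 lines remain, in their original order. To keep the result, redirect the output to a different file (for example `> outfile`) rather than naming the destination as the second argument.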

# 3  
Old 06-15-2011
If you can sort the data, then to show the duplicated lines:

Code:
sort | uniq -d < YOURFILE

To remove every line that has a duplicate (the duplicate and the original):

Code:
sort | uniq -u < YOURFILE

sort and uniq are available on Windows through MSYS, Cygwin, or the GNU utilities for Windows.
# 4  
Old 06-15-2011
Thanks for responding so quickly. rdcwayx, I downloaded and installed awk; however, my output file comes up blank. I used the target file in place of the first "infile" and the destination file in place of the second "infile". Did I misunderstand that part?
# 5  
Old 06-15-2011
No, the same file is read twice: both "infile" arguments are the input file, and the result goes to standard output.

I tested the code on Solaris and still get the right output.

Did you download the latest gawk version?

Quote:
Originally Posted by yazu

Code:
sort | uniq -u < YOURFILE

There are sort and uniq in MSYS or Cygwin or GNU utils for Windows.
This looks simpler.

But I had to change the command, because in the original the redirection feeds the file to uniq while sort waits for input:

Code:
sort YOURFILE |uniq -u
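
As a rough sketch of the corrected pipeline, the following runs both `uniq -d` and `uniq -u` on the sample data from the first post. The temporary file and the names `datafile`, `duplicated_lines`, and `unique_lines` are assumptions for this example only. Note that `uniq` only compares adjacent lines, which is why the input has to be sorted first, and that the output therefore comes out in sorted order rather than the original order.

```shell
# Build a sample input file from the data in the first post
# (the temporary file is hypothetical, used only for this demo).
datafile=$(mktemp)
printf '%s\n' \
  '180222,1,7.3,1Z0E947E0353634,9.49,UPAC' \
  '180223,1,7.3,1Z0E947E0373254,9.49,UPAC' \
  '180224,1,7.3,1Z0E947E0371556,8.33,UPAC' \
  '180222,1,7.3,1Z0E947E0353634,9.49,UPAC' > "$datafile"

# uniq -d prints one copy of each line that is repeated.
duplicated_lines=$(sort "$datafile" | uniq -d)

# uniq -u prints only lines that are not repeated, so both the
# duplicate and its original are dropped.
unique_lines=$(sort "$datafile" | uniq -u)

printf 'Duplicated:\n%s\n' "$duplicated_lines"
printf 'Kept:\n%s\n' "$unique_lines"
rm -f "$datafile"
```

If the original line order matters, the awk approach from post #2 preserves it; this pipeline does not.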

# 6  
Old 06-15-2011
Thank you both. I will try again in the morning and let you know how it goes.
# 7  
Old 06-15-2011
Quote:
but I have to change the command to:


Code:
sort YOURFILE |uniq -u
Yes. I'm stupid. I tested it like this:

Code:
cat YOURFILE | sort | uniq -u

but then changed it when posting.