Remove the duplicate content in a file


 
# 1  
Old 08-12-2013

Here are the contents of test.txt:

Code:
Dependencies Resolved
Changes in packages about to be updated:

ChangeLog for: 1:perl-Archive-Extract-0.38-131.el6_4.x86_64,

- Resolves: #915692 - CVE-2013-1667 (DoS in rehashing code)

Dependencies Resolved
Changes in packages about to be updated:

ChangeLog for: 1:perl-Archive-Extract-0.38-131.el6_4.x86_64,

- Resolves: #915692 - CVE-2013-1667 (DoS in rehashing code)

ChangeLog for: openldap-2.4.23-32.el6_4.1.x86_64
* Mon Apr 22 12:00:00 2013 Jan Synáček <jsynacek@redhat.com> 2.4.23-32.1
- fix: NSS related resource leak (#954299)

Expected result:

Code:
Dependencies Resolved
Changes in packages about to be updated:

ChangeLog for: 1:perl-Archive-Extract-0.38-131.el6_4.x86_64,

- Resolves: #915692 - CVE-2013-1667 (DoS in rehashing code)

ChangeLog for: openldap-2.4.23-32.el6_4.1.x86_64
* Mon Apr 22 12:00:00 2013 Jan Synáček <jsynacek@redhat.com> 2.4.23-32.1
- fix: NSS related resource leak (#954299)

Thanks

Last edited by ashokvpp; 08-12-2013 at 11:37 AM..
# 2  
Old 08-12-2013
What have you tried and where are you stuck?
# 3  
Old 08-12-2013
Hi Scott,

So far I have collected a list of unique package names.

This is what I tried:

Code:
$ sed -n '/ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64/,/Dependencies Resolved/p' centos6.txt
ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64,
             : libuuid-2.17.2-12.9.el6_4.3.x86_64,
             : util-linux-ng-2.17.2-12.9.el6_4.3.x86_64
* Tue Apr 23 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.3
- fix #917678 - mount in RHEL 6.4 ignores user option

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.2
- make patch for #911756 more robust

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.1
- fix patch for #911756 to make it usable on big-endian machines

* Wed Apr 10 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4
- fix #911756 - Make silicon medley signature recognition more robust


Dependencies Resolved
ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64,
             : libuuid-2.17.2-12.9.el6_4.3.x86_64,
             : util-linux-ng-2.17.2-12.9.el6_4.3.x86_64
* Tue Apr 23 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.3
- fix #917678 - mount in RHEL 6.4 ignores user option

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.2
- make patch for #911756 more robust

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.1
- fix patch for #911756 to make it usable on big-endian machines

* Wed Apr 10 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4
- fix #911756 - Make silicon medley signature recognition more robust


Dependencies Resolved
ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64,
             : libuuid-2.17.2-12.9.el6_4.3.x86_64,
             : util-linux-ng-2.17.2-12.9.el6_4.3.x86_64
* Tue Apr 23 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.3
- fix #917678 - mount in RHEL 6.4 ignores user option

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.2
- make patch for #911756 more robust

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.1
- fix patch for #911756 to make it usable on big-endian machines

* Wed Apr 10 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4
- fix #911756 - Make silicon medley signature recognition more robust

I'm looking to print only one occurrence of it.

Thanks
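
A possible reason the attempt above prints the block several times: a sed address range re-arms after it ends, so every later line matching the start pattern opens a new range. Below is a minimal sketch of one workaround, quitting as soon as the first range has been printed. The shortened sample data and the variable names (sample, first_block) are made up for illustration.

```shell
# Hypothetical, shortened sample: the same changelog block repeated twice.
sample=$(mktemp)
printf '%s\n' \
  'ChangeLog for: libblkid' \
  '- fix #917678' \
  '' \
  'Dependencies Resolved' \
  'ChangeLog for: libblkid' \
  '- fix #917678' \
  '' \
  'Dependencies Resolved' > "$sample"

# Print each line of the range; quit right after the end pattern has been printed,
# so the range never gets a chance to start a second time.
first_block=$(sed -n '/ChangeLog for: libblkid/,/Dependencies Resolved/{
  p
  /Dependencies Resolved/q
}' "$sample")

printf '%s\n' "$first_block"
rm -f "$sample"
```

This only helps when a single, known block is wanted; it does not deduplicate the whole file.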
# 4  
Old 08-12-2013
Code:
awk '!A[$0]++' test.txt
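
For anyone reading along, here is a small sketch of what this one-liner does. The array name A comes from the post; the shortened sample file and the variable line_dedup are hypothetical. It works on single lines, so repeated blank lines count as duplicates too.

```shell
# Hypothetical, shortened version of test.txt.
sample=$(mktemp)
printf '%s\n' \
  'Dependencies Resolved' \
  '' \
  'ChangeLog for: perl-Archive-Extract' \
  '' \
  'Dependencies Resolved' \
  '' \
  'ChangeLog for: perl-Archive-Extract' \
  '' \
  'ChangeLog for: openldap' > "$sample"

# A[$0]++ yields the number of times this line was seen before (0 the first time),
# so !A[$0]++ is true only on a line's first appearance, and awk prints it.
line_dedup=$(awk '!A[$0]++' "$sample")

printf '%s\n' "$line_dedup"
rm -f "$sample"
```

Only the first blank line survives here, so the spacing of the expected result in post #1 is not reproduced exactly.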

# 5  
Old 08-12-2013
@Yoda: that might be dangerous, as it could remove single lines that happen to duplicate lines in other records. I admit this is a remote possibility. Still, you can make it safer by adding something like RS="\n\n\n" to your proposal.
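
To make that risk concrete, here is a sketch on made-up data (pkg-a and pkg-b are hypothetical) in which two different records share an identical line. It assumes an awk that accepts a multi-character RS, such as gawk or mawk; POSIX leaves that case unspecified.

```shell
# Hypothetical data: records separated by two blank lines; pkg-a appears twice,
# and pkg-b happens to contain the same "- rebuild" line as pkg-a.
sample=$(mktemp)
printf '%s\n' \
  'ChangeLog for: pkg-a' '- rebuild' '' '' \
  'ChangeLog for: pkg-a' '- rebuild' '' '' \
  'ChangeLog for: pkg-b' '- rebuild' > "$sample"

# Line-by-line: every line is compared against all earlier lines.
by_line=$(awk '!A[$0]++' "$sample")

# Record-by-record: whole blocks are compared (needs multi-character RS support).
by_record=$(awk '!A[$0]++' RS='\n\n\n' "$sample")

printf 'by line:\n%s\n\nby record:\n%s\n' "$by_line" "$by_record"
rm -f "$sample"
```

Line by line, pkg-b loses its "- rebuild" entry because the same text already appeared under pkg-a; with whole records as the unit, it should be kept.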
# 6  
Old 08-12-2013
The blocks repeat a random number of times. All I need is to print just the first occurrence of the start pattern match.

---------- Post updated at 11:51 AM ---------- Previous update was at 11:42 AM ----------

Code:
awk '!A[$0]++' RS="\n\n\n" test.txt   # works great

Have a good day :)
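
A possibly more portable variant, sketched under the assumption that records are separated by blank lines: an empty RS switches awk to POSIX paragraph mode, and setting ORS puts a blank line back between the surviving records (with the default ORS they are printed with no blank line between them). The array name seen, the variable para_dedup and the shortened sample are illustrative only.

```shell
# Hypothetical, shortened version of test.txt from post #1.
sample=$(mktemp)
printf '%s\n' \
  'Dependencies Resolved' \
  'Changes in packages about to be updated:' \
  '' \
  'ChangeLog for: perl-Archive-Extract' \
  '' \
  'Dependencies Resolved' \
  'Changes in packages about to be updated:' \
  '' \
  'ChangeLog for: perl-Archive-Extract' \
  '' \
  'ChangeLog for: openldap' \
  '- fix: NSS related resource leak (#954299)' > "$sample"

# RS= (empty) selects paragraph mode: each blank-line-separated block is one record.
# ORS='\n\n' writes a blank line after every record that is printed.
para_dedup=$(awk '!seen[$0]++' RS= ORS='\n\n' "$sample")

printf '%s\n' "$para_dedup"
rm -f "$sample"
```

Two caveats. Paragraph mode splits on every blank line, so a changelog whose entries are themselves separated by blank lines is deduplicated paragraph by paragraph rather than per package. Also, as far as I know, gawk with RS="\n\n\n" keeps the file's final newline inside the last record, so a duplicate block at the very end of the file may not be recognised; paragraph mode strips those newlines.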

Last edited by Scott; 08-12-2013 at 01:52 PM.. Reason: Code tags
