Remove Duplicate lines from File


 
# 1  
Old 08-30-2007

I have a log file "logreport" that contains several lines as seen below:

04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping


I would like to edit the report to remove entries that report duplicate events. I am trying to produce an output close to what is seen below:

04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
This Error was reproduced 2 times
# 2  
Old 08-30-2007
You can use uniq.

It could look like this:
for i in `more logreport | sort | uniq`; do echo "no. of times $i: `grep -c $i logreport`"; done

-ilan

Last edited by ilan; 08-30-2007 at 03:49 PM.
# 3  
Old 08-30-2007
Code:
awk '{
  x[substr($0,9)]++            # count each message (text after the timestamp)
  y[substr($0,9)]=$1           # remember the last timestamp seen for it
}
END {
  for (i in x)
    printf "%s\nThis Error was reproduced %d times\n", y[i] i, x[i]
}' logfile

As always, use nawk or /usr/xpg4/bin/awk on Solaris.
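
Note that the script above prints the last timestamp seen for each message and walks the array in an unspecified order, while the requested output shows the first timestamp. Below is a minimal sketch of a variant that keeps the first timestamp and the input order. It assumes "reproduced N times" means repeats after the first occurrence, which is only one reading of the original question. The function name summarize_log and the array names are made up for illustration.

```shell
# Hypothetical helper: summarize the log file passed as $1.
summarize_log() {
  awk '{
    key = substr($0, 10)          # message text after the "HH:MM:SS " prefix
    if (!(key in count)) {        # first time this message is seen
      first[key] = $1             # keep its first timestamp
      order[++n] = key            # keep input order for the report
    }
    count[key]++
  }
  END {
    for (j = 1; j <= n; j++) {
      key = order[j]
      printf "%s %s\n", first[key], key
      if (count[key] > 1)         # assumption: report repeats, not total occurrences
        printf "This Error was reproduced %d times\n", count[key] - 1
    }
  }' "$1"
}
```

Call it as `summarize_log logreport`. As with the original, nawk or /usr/xpg4/bin/awk would be needed on Solaris.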
# 4  
Old 08-30-2007
Quote:
Originally Posted by ilan
You can use uniq.

It could look like this:
for i in `more logreport | sort | uniq`; do echo "no. of times $i: `grep -c $i logreport`"; done

-ilan
I stand corrected!
My code works only for unique words.

-ilan

PS: I can't think of a solution to your question off the top of my head.
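
The loop fails because the shell splits the backquoted output on whitespace, so each word becomes a separate $i, and because the timestamps make every full line unique anyway. A rough sketch of a pipeline that still uses uniq, assuming every line starts with a 9-character "HH:MM:SS " prefix; the function name count_messages is hypothetical:

```shell
# Hypothetical helper: count identical messages in the log file passed as $1.
count_messages() {
  # Drop the timestamp prefix, group identical messages, count them,
  # then list the most frequent message first.
  cut -c10- "$1" | sort | uniq -c | sort -rn
}
```

Running `count_messages logreport` prints one line per distinct message, prefixed by its count. This drops the timestamps, so it does not produce the exact report format asked for.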
# 5  
Old 08-31-2007
What exactly is the output from this? I can't get it to run correctly.

Code:
awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{
	for(i in x)
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]
}' logfile

# 6  
Old 08-31-2007
Quote:
Originally Posted by Nysif Steve
What exactly is the output from this? I can't get it to run correctly.
What is the output you get?

Code:
zsh 4.3.2% cat file
04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to peng
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to pong
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to pong
zsh 4.3.2% awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{
for(i in x)
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]
}' file
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to pong
This Error was reproduced 2 times
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to peng
This Error was reproduced 1 times
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
This Error was reproduced 3 times

# 7  
Old 08-31-2007
I tried radoulov's script and it works for me. I had a sample log file on my desktop called "logfile.log" with these entries:

Code:
04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
06:38:08 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
09:20:00 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING
10:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
11:11:05 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING

I changed to my desktop directory and ran radoulov's script, with the slight modification of adding "\" to the end of each line so I could copy and paste it at the bash command line.

Code:
awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{\
for(i in x) \
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]\
}' logfile.log

Press Return and voilà:

Code:
10:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
This Error was reproduced 3 times
11:11:05 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING
This Error was reproduced 3 times

Nice job radoulov!
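
For pasting at a prompt, the same program can also be written on a single line, which sidesteps the continuation backslashes. (bash keeps reading until the closing single quote, so they should not be strictly required in any case.) A sketch with the same logic as radoulov's script; the wrapper name summarize_oneline and the variable k are only for illustration:

```shell
# Hypothetical wrapper around the one-line form; $1 is the log file.
summarize_oneline() {
  awk '{k=substr($0,9); x[k]++; y[k]=$1} END{for(i in x) printf "%s%s\nThis Error was reproduced %d times\n", y[i], i, x[i]}' "$1"
}
```

For example: `summarize_oneline logfile.log`. As in the original, the order of the reported messages is whatever order awk walks the array in.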
 