Remove Duplicate lines from File


 
Top Forums UNIX for Dummies Questions & Answers Remove Duplicate lines from File
# 1  
Remove Duplicate lines from File

I have a log file "logreport" that contains several lines as seen below:

04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping


I would like to edit the report to remove entries that report duplicate events. I am trying to produce an output close to what is seen below:

04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
This Error was reproduced 2 times
# 2  
You can use uniq.

Something like this:
for i in `cat logreport | sort | uniq`; do echo "no. of times $i: `grep -c "$i" logreport`"; done

-ilan

Last edited by ilan; 08-30-2007 at 04:49 PM..
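For reference, a line-safe variant of the same sort/uniq idea, counting whole messages rather than individual words. It assumes the fixed `HH:MM:SS ` timestamp prefix shown in the question; the sample data below is a stand-in for the real logreport:

```shell
# Stand-in sample data for logreport:
printf '%s\n' \
  '04:20:00 Agent snmpd appeared dead but responded to ping' \
  '06:38:08 Agent snmpd appeared dead but responded to ping' \
  '07:11:05 Agent snmpd appeared dead but responded to ping' > logreport

# Strip the 9-character "HH:MM:SS " prefix, then count identical messages:
cut -c10- logreport | sort | uniq -c | sort -rn
```

Because `uniq -c` compares whole lines, this avoids the word-splitting problem of looping over `sort | uniq` output with `for`.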
# 3  
Code:
awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{
	for(i in x)
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]
}' logfile

As always, use nawk or /usr/xpg4/bin/awk on Solaris.
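For readers less familiar with awk, here is the same logic annotated (a restatement, not a change in behavior), run against a small stand-in sample; note that `y[msg]` ends up holding the timestamp of the *last* occurrence of each message:

```shell
# Build a small sample log (stand-in for the real logfile):
printf '%s\n' \
  '04:20:00 msg one' \
  '06:38:08 msg one' \
  '07:11:05 msg two' > logfile

awk '{
  msg = substr($0, 9)    # drop the 8-char HH:MM:SS timestamp (keeps the leading space)
  x[msg]++               # occurrence count per message
  y[msg] = $1            # timestamp of the most recent occurrence
}
END {
  for (i in x)           # note: iteration order is unspecified
    printf "%s\nThis Error was reproduced %d times\n", y[i] i, x[i]
}' logfile
```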
# 4  
Quote:
Originally Posted by ilan
You can use uniq.

Something like this:
for i in `cat logreport | sort | uniq`; do echo "no. of times $i: `grep -c "$i" logreport`"; done

-ilan
I stand corrected!
My code works only for unique words.

-ilan

PS: no other idea springs to mind for your question!
# 5  
What exactly is the output from this? I can't get it to run correctly.

Code:
awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{
	for(i in x)
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]
}' logfile

# 6  
Quote:
Originally Posted by Nysif Steve
What exactly is the output from this? I can't get it to run correctly.
What is the output you get?

Code:
zsh 4.3.2% cat file
04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to peng
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to pong
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to pong
zsh 4.3.2% awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{
for(i in x)
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]
}' file
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to pong
This Error was reproduced 2 times
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to peng
This Error was reproduced 1 times
06:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
This Error was reproduced 3 times

# 7  
I tried radoulov's script and it works for me. I had a sample log file on my desktop called "logfile.log" with these entries:

Code:
04:20:00 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
06:38:08 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING
07:11:05 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
09:20:00 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING
10:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
11:11:05 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING

I cd'd to my desktop, and then I used radoulov's script with the slight modification of adding "\" to the end of each line so I could copy and paste it to the bash command line.

Code:
awk '{x[substr($0,9)]++;y[substr($0,9)]=$1}END{\
for(i in x) \
printf "%s\nThis Error was reproduced %d times\n",y[i]i,x[i]\
}' logfile.log

Press return and voila!:

Code:
10:38:08 /usr/lib/snmp/snmpdx: [ID 702911 daemon.error] Agent snmpd appeared dead but responded to ping
This Error was reproduced 3 times
11:11:05 /usr/lib/snmp/snmpdx: [ID 999999 daemon.error] Agent snmpd appeared dead but responded to DING
This Error was reproduced 3 times

Nice job radoulov!
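One caveat with the script above: awk's `for (i in x)` iterates in unspecified order, which is why the reports come out shuffled in the transcripts. A sketch of a variant that emits one line per event so the result can be piped through `sort` by timestamp (same sample format; the summary is reworded onto one line, and the data below is a hypothetical stand-in):

```shell
# Hypothetical sample log:
printf '%s\n' \
  '04:20:00 msg one' \
  '06:38:08 msg one' \
  '07:11:05 msg two' > logfile

# One line per distinct message, then sort chronologically:
awk '{msg = substr($0, 10); n[msg]++; t[msg] = $1}
END {for (m in n) printf "%s %s (reproduced %d times)\n", t[m], m, n[m]}' logfile | sort
```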
 
