07-07-2009
Remove duplicate files based on text string?
Hi
I have been struggling with a script to remove duplicate messages from a shared mailbox.
I would like to find duplicate messages based on the “Message-ID” string within the message files.
I have managed to find the duplicate “Message-ID” strings and (if I wanted to) delete the files in which they were found.
My problem is how to preserve one copy of each file.
My script so far:
--------------------
#!/bin/tcsh
set dir=/my/maildir
foreach file (`grep -h "Message-ID: <" $dir/* | uniq -d |xargs -i \grep -l "{}" $dir/*`)
rm -f "$file"
end
--------------------
Any ideas?
Thanks // Tomas
---------- Post updated at 06:02 PM ---------- Previous update was at 10:18 AM ----------
Fyi, solved
-------------------
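#!/bin/tcsh
set maildir=/my/maildir
# Loop over every Message-ID line that occurs in more than one file
# (-m 1 takes only the first match per file; uniq -d needs sorted input).
foreach dupstring ("`grep -m 1 -h -R "^Message-ID:" $maildir/ | sort | uniq -d`")
# List the files containing that ID, skip the first (sed 1d) and remove the rest.
# -F matches the ID as a fixed string rather than a regular expression.
grep -l -R -F "$dupstring" $maildir/ |sed 1d |xargs -i \rm -f "{}"
end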
-------------------
// Tomas
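
For anyone who would rather not run grep once per duplicate ID, the same idea (keep the first file seen for each Message-ID, treat later ones as duplicates) can be sketched in a single pass. This is a minimal sketch in Python rather than tcsh, not the script used above. The names `read_message_id` and `find_duplicates` and the `--delete` switch are made up for illustration. It assumes one message per file and that the Message-ID value sits on the same line as the header name (no folded headers).

```python
import os
import re
import sys

# Matches a header line such as "Message-ID: <abc@host>"
MESSAGE_ID = re.compile(rb"^Message-ID:\s*(<[^>]+>)", re.IGNORECASE)

def read_message_id(path):
    """Return the first Message-ID in the header block of a file, or None."""
    with open(path, "rb") as fh:
        for line in fh:
            if line in (b"\n", b"\r\n"):  # a blank line ends the headers
                return None
            match = MESSAGE_ID.match(line)
            if match:
                return match.group(1)
    return None

def find_duplicates(maildir):
    """Return paths whose Message-ID already appeared in an earlier file."""
    seen = set()
    duplicates = []
    for root, dirs, files in os.walk(maildir):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(files):
            path = os.path.join(root, name)
            message_id = read_message_id(path)
            if message_id is None:
                continue  # no Message-ID header: leave the file alone
            if message_id in seen:
                duplicates.append(path)
            else:
                seen.add(message_id)
    return duplicates

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: script.py /my/maildir [--delete]
    args = [a for a in sys.argv[1:] if a != "--delete"]
    for path in find_duplicates(args[0]):
        print(path)
        if "--delete" in sys.argv[1:]:
            os.remove(path)
```

Without `--delete` it only prints the files it would remove, so the list can be checked first. Which copy counts as "first" is simply sorted path order, the same kind of arbitrary choice that `sed 1d` makes in the script above.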