Do Not Output Duplicates


 
# 1  
Old 04-23-2014

Mac OS 10.9

Let me preface this by saying this is not for marketing or spamming purposes.

I have a script that scans all the email messages in a directory (~/Library/Mail/Mailboxes) and outputs a single-column list of email addresses. It will run multiple times a day and append new entries to the output file.

If an email is duplicated in the email folder, its address is duplicated in the output file. How do I remove these duplicates from the output file? It's just a single column of data, one address per line. I'm not sure whether I should have the script check for and exclude duplicates as it writes, or simply scan for duplicates after the output file has been appended.

This list is being used as input for LDAP queries.

For reference, the scanning/output portion of my script is below:

Code:
find "$SRC" -type f -name '*.emlx' |
	while IFS= read -r FILE
	do
	   # On each From: line, keep only the address between < and >
	   awk '/^From:/ && gsub(/.*<|>.*/, "")' "$FILE"
	done > ~/Desktop/output.txt
echo "complete"

# 2  
Old 04-23-2014
Try:
Code:
find "$SRC" -type f -name '*.emlx' |
  while IFS= read -r FILE
  do
    awk '/^From:/ && gsub(/.*<|>.*/, "")' "$FILE"
  done | sort | uniq > ~/Desktop/output.txt   # sort groups identical lines so uniq can drop them
echo "complete"
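
If the order in which addresses were first seen matters, a common awk idiom removes repeats without sorting. The following is only a minimal sketch; the function name `dedupe_keep_order` is made up for illustration and does not come from the thread.

```shell
# Hypothetical helper: print each input line once, in first-seen order.
dedupe_keep_order() {
    # seen[$0]++ evaluates to 0 the first time a line appears,
    # so !seen[$0]++ is true only for the first occurrence.
    awk '!seen[$0]++'
}
```

It could replace `sort | uniq` at the end of the pipeline above, for example `done | dedupe_keep_order > ~/Desktop/output.txt`.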

# 3  
Old 04-23-2014
Well, that was easy. Thanks!
# 4  
Old 04-24-2014
You could also try:
Code:
find "$SRC" -type f -name '*.emlx' -exec awk '/^From:/ && gsub(/.*<|>.*/, "")' {} + | sort -u > ~/Desktop/output.txt
echo "complete"
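
Both answers rebuild output.txt from scratch on every run. If the list should instead accumulate across runs, as the original question describes, one possible approach is to merge each new batch into the existing file and let `sort -u` drop the repeats. This is a sketch under assumptions: the function name `merge_unique` and its two file arguments are hypothetical, and it relies on POSIX `sort -o` allowing the output file to be one of the input files.

```shell
# Hypothetical helper: merge new addresses into an existing list, no duplicates.
#   $1 = existing list (created if missing)
#   $2 = file holding the newly extracted addresses
merge_unique() {
    touch "$1"                  # make sure the existing list is there on the first run
    sort -u -o "$1" "$1" "$2"   # sort reads all input before writing, so -o may name an input
}
```

A run could then write its addresses to a temporary file and call `merge_unique ~/Desktop/output.txt "$tmpfile"`, where `$tmpfile` is whatever scratch file the script chooses.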
