To Filter out duplicates..


 
# 1  
06-30-2010

I have a text file of this format...

55 55-45345.xml 20070615
55 55-87655.xml 20070613
34 34-56753.xml 20070614

Each line of the text file has three values: a number, an XML file name, and a date. The first column can contain any number of duplicates, and no two dates are equal. I have already sorted the file by the first column, but it is not sorted by date.

My actual problem is this:

I need to write a Perl script to find the duplicates in the first column. If n duplicates are found, then the n corresponding XML files are duplicates too. For each such group I need to sort by date, take the earliest date with its corresponding XML file, and write that line to a text file (so that the output is sorted by date). Then I have to move the earliest-dated XML file (the topmost one) to a particular directory if the wc (word count) of the n files is equal. If the wc is not equal, I have to move the bottommost one (the latest-dated XML file) to a folder instead.

Please help me solve this problem.
# 2  
07-01-2010
With this code you can get a list of the latest XML file for each unique ID in the first column:

Code:
sort -k1,1n -k3,3n urfile | awk '{b[$1]=$0} END{for (i in b) print b[i] | "sort -n"}'

For the rest of your requirements, such as moving the files and the wc check, you need to explain more.
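
Note that the one-liner above keeps the last line seen for each ID after sorting, which is the latest date, while the original question asks for the least date. Below is a minimal sketch of the earliest-date variant. It assumes whitespace-separated columns and dates in YYYYMMDD form as in the sample data, so that numeric order matches chronological order. The function name earliest_per_id is hypothetical and not part of the thread.

```shell
#!/bin/sh
# earliest_per_id FILE  (hypothetical helper name, for illustration only)
#
# For each ID in column 1, print the line with the smallest date in column 3.
# Assumption: dates are YYYYMMDD, so numeric order is chronological order.
#
# Steps:
#   1. sort by ID, then by date ascending, so the first line of each ID
#      group is the earliest one;
#   2. awk keeps only the first line seen for each ID;
#   3. the final sort orders the surviving lines by date.
earliest_per_id() {
    sort -k1,1n -k3,3n "$1" |
        awk '!seen[$1]++' |
        sort -k3,3n
}
```

With the three sample lines from the first post saved in urfile, `earliest_per_id urfile > earliest.txt` would write `55 55-87655.xml 20070613` followed by `34 34-56753.xml 20070614`. The file moves and the wc comparison are left out here because the requirement is still unclear.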