Finding the duplicate in a file....


 
# 1  
Old 11-09-2009

Hi Unix Gurus,

I generated a unique code for every date from 20000101 to 21990101 (200 years, almost 73000 unique codes) and redirected the output to a text file.

Now my problem is that I want to check whether any duplicate unique codes are present in the text file.

The text file unique.txt is in the following format:

YYYYMMDD|YYYYMM|UNIQUE CODE (ignore this row)

21120101|211201|F5CA1DD7746029E9C1CEF3137345D987
21120102|211201|F98804977D03F72DBC0AA0163B26F89E
21120103|211201|F01F29EC62E943978C934BCA79CD0140
21120104|211201|C943B6AB6BE9275D4D52B06BB484C59E
21120105|211201|A42873466FD7EF8FD211C82C52B2E1B2
21120106|211201|0179BB5B69E1433758E17DCA7D5A7D10
21120107|211201|30801625DDF75D0CC74E0255E994629E
21120108|211201|B758F26C1DCBC48F5BA62F38CED8B880

I also want to print all the duplicates found, in the same format as the records.

Thanks in advance.
# 2  
Old 11-09-2009
Julian date for '20000101' = 2451545
Julian date for '22000101' = 2524594

That is more than 73000 days (2524594 - 2451545 = 73049).
To check whether all the dates are there, try:
Code:
wc -l unique.txt
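
Note that wc -l counts every line, including the format row and any blank lines. A minimal sketch that counts only the data rows, assuming a data row is one whose first '|'-separated field is exactly eight digits (the function name count_data_rows is made up for illustration):

```shell
# Count only the rows whose first field looks like a YYYYMMDD date.
# Assumption: fields are separated by '|' and the date is field 1.
count_data_rows() {
    awk -F'|' 'length($1) == 8 && $1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }' "$1"
}

# Usage: count_data_rows unique.txt
```

The number printed can then be compared with the number of days expected for the date range.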

To see whether the date key field is duplicated anywhere:
Code:
awk -F'|' '{arr[$1]++} END {for (i in arr) {if (arr[i] > 1) {print i}}}' unique.txt
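
The one-liner above keys on the date in field 1. The original question asks about the unique code, so here is a sketch of the same counting idea applied to field 3, printing the complete records rather than just the key. It assumes the code is always the third '|'-separated field and reads the file twice; the function name print_dup_codes is made up for illustration.

```shell
# Print every record whose unique code (field 3) occurs more than once.
# Assumption: '|' is the delimiter and the code is field 3, as in the sample.
print_dup_codes() {
    # Lines without a third field (blank lines) are skipped.
    # Pass 1 (NR == FNR) counts each code; pass 2 prints the records
    # whose code was counted more than once, in their original format.
    awk -F'|' '
        $3 == ""  { next }
        NR == FNR { count[$3]++; next }
        count[$3] > 1
    ' "$1" "$1"
}

# Usage: print_dup_codes unique.txt
```

If nothing is printed, no code is repeated. Reading the file twice keeps the records in file order and avoids holding every line in memory.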

# 3  
Old 11-09-2009
Thanks Jim.