Find duplicates from multiple files with 2 different types of files


 
# 1  
Old 03-04-2008

I need to compare two different types of files and report any duplicate entries found between them.

The type 1 file name looks like: file1.abc
(the extension abc could be any 3 characters, but I can narrow it down or hardcode 10 to 15 combinations).

The other file is named like file1.bcd01abc (the bcd part is constant, followed by a 2-digit number (00 to 04), and the last three characters could be any 3 letters; I can narrow those last 3 characters down to 10 to 15 combinations).

Whenever the script finds a duplicate after comparing these two types of files, the output should say "duplicate found in file1.abc and file1.bcd01abc and the value is xxxx".


Both files have the same format:
172.10.10.19<TAB>rou020i00sfg<TAB>descriptions

Thanks, and this is not homework.
# 2  
Old 03-04-2008
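This finds the lines that two files named file1 and file2 have in common:
Code:
find_dup()
{
    # Count every line of file1, then print each line of file2 that was seen there.
    awk ' FILENAME=="file1" { arr[$0]++ }
          FILENAME=="file2" { if (arr[$0]) { print "value is", $0 } }
        ' file1 file2
}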

Write a loop that generates a list of the first type of filenames (the file1.abc kind) and put that list into a file called dirfile1.
Then get another list with the other type of files and call that file dirfile2.
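
One possible way to build the two lists is sketched below. The function name make_lists is made up for this example. It assumes a type 1 name ends in a dot plus exactly three letters, and a type 2 name ends in .bcd, a number from 00 to 04, then three letters, as described in the first post. Any other file ending in a dot and three letters (a .log file, say) would also land in dirfile1, so hardcode the 10 to 15 known extensions if that is a problem.

```shell
#!/bin/sh
# Hypothetical helper: write dirfile1 and dirfile2 (in the current directory)
# for the data files found in directory $1 (default: current directory).
make_lists()
{
    dir=${1:-.}
    : > dirfile1                      # start with empty list files
    : > dirfile2
    for f in "$dir"/*
    do
        [ -f "$f" ] || continue       # skip directories and an unmatched glob
        case ${f##*/} in              # test only the name, not the path
            *.bcd0[0-4][[:alpha:]][[:alpha:]][[:alpha:]])
                echo "$f" >> dirfile2 ;;   # type 2: .bcd + 00..04 + 3 letters
            *.[[:alpha:]][[:alpha:]][[:alpha:]])
                echo "$f" >> dirfile1 ;;   # type 1: dot + 3 letters
        esac
    done
}
```

Calling make_lists with the data directory as its argument, before the comparison loop starts, should leave both list files in place for the loop to read.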

Okay, now call the find_dup() function for each file combination:
Code:
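#!/bin/ksh
# find_dup() from above must be defined before this loop runs.
while read file2
do
   while read file1
   do
        # point the fixed names file1 and file2 at the current pair
        ln -sf "$file1" file1
        ln -sf "$file2" file2
        result=$( find_dup )
        if [[ -n $result ]] ; then
           echo "duplicate found in $file1 and $file2 $result"
        fi
   done  < dirfile1
done < dirfile2  > result.log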

result.log will have what you found.
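
If the symlinks get in the way, a variant that hands the real file names straight to awk might look like the sketch below. The name find_dup_pair is made up here, and it assumes that "duplicate" means the whole line is identical in both files. To compare only the IP address, replace $0 with $1 in the awk program.

```shell
#!/bin/sh
# Sketch: report lines that occur in both files, without symlinks.
# find_dup_pair is a hypothetical name; $1 is the type 1 file, $2 the type 2 file.
find_dup_pair()
{
    awk '
        # first file: remember its name and every line in it
        NR == FNR { first = FILENAME; seen[$0] = 1; next }
        # second file: report each matching line once
        ($0 in seen) && !reported[$0]++ {
            printf "duplicate found in %s and %s and the value is %s\n", first, FILENAME, $0
        }
    ' "$1" "$2"
}
```

Inside the nested loop, a single call of find_dup_pair "$file1" "$file2" would then stand in for the two ln commands, the find_dup call and the echo.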
# 3  
Old 03-04-2008
How do I differentiate between the file name extensions?
Both file names start with file1, but the extensions are different:
file1.xxx
file1.bcd00xxx

So in the while read loops, should I use file1.* for the first file
and file1.bcd* for the second file?

I am also getting this error:

[13]: dirfile2: cannot open

Thanks

Last edited by ricky007; 03-05-2008 at 12:40 PM..