Finding duplicate files in two base directories
Post 302918032 by migurus on Friday 19th of September 2014 09:09:58 PM
You can use SHA-1 checksums to identify identical files. Below is a script that finds and lists identical files in two different directories (it only looks at *.c and *.h files; adjust the -name pattern as needed):

Code:
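DIR1=${1:?usage: $0 DIR1 DIR2}
DIR2=${2:?usage: $0 DIR1 DIR2}
TMP1=$(mktemp) || exit 1
TMP2=$(mktemp) || exit 1
# Remove the temporary files on exit or interruption.
trap 'rm -f "$TMP1" "$TMP2"' EXIT HUP INT QUIT TERM

# Hash every .c and .h file; each output line is "<sha1>  <path>".
find "$DIR1" -type f -name "*.[ch]" -exec shasum {} + > "$TMP1"
find "$DIR2" -type f -name "*.[ch]" -exec shasum {} + > "$TMP2"

# Keep only the hashes that occur more than once, then list their paths.
cat "$TMP1" "$TMP2" | cut -d ' ' -f1 | sort | uniq -d |
while read -r sha
do
        # Strip the hash and the two-space separator, leaving the path.
        grep -h "^$sha " "$TMP1" "$TMP2" | sed 's/^[^ ]*  //'
        echo
done
exit 0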

I added
Code:
echo;

to separate groups of identical files with an empty line, purely for readability.

This is a quick and dirty script with little error checking, just to illustrate the idea.
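
If you want something a bit more defensive, here is a rough sketch of the same idea wrapped in a function. The names `hash_cmd` and `list_dups` are made up for illustration, and it assumes that either `sha1sum` or `shasum` is installed. It also assumes the two directories are different and do not overlap, otherwise every file would be reported as its own duplicate.

```shell
#!/bin/sh
# Sketch only: hash_cmd and list_dups are illustrative names,
# not part of the script posted above.

# Print the name of an available SHA-1 tool
# (assumption: sha1sum or shasum is installed).
hash_cmd() {
    if command -v sha1sum >/dev/null 2>&1; then
        echo sha1sum
    elif command -v shasum >/dev/null 2>&1; then
        echo shasum
    else
        return 1
    fi
}

# list_dups DIR1 DIR2: print groups of identical *.c / *.h files,
# one path per line, with an empty line after each group.
list_dups() {
    [ -d "$1" ] && [ -d "$2" ] || { echo "usage: list_dups DIR1 DIR2" >&2; return 2; }
    sum=$(hash_cmd) || { echo "no SHA-1 tool found" >&2; return 1; }
    list=$(mktemp) || return 1

    # One "<hash>  <path>" line per file, from both trees.
    find "$1" "$2" -type f -name '*.[ch]' -exec "$sum" {} + > "$list"

    # Hashes seen more than once identify groups of identical files.
    cut -d ' ' -f1 "$list" | sort | uniq -d |
    while read -r sha; do
        # Drop the hash and its separator (two spaces, or " *" in binary mode).
        grep "^$sha " "$list" | sed 's/^[^ ]* [ *]//'
        echo
    done
    rm -f "$list"
}
```

Called as `list_dups /path/one /path/two`, it prints the same grouped output as the script above. Passing the file names to the hash tool through `find -exec ... {} +` instead of a `for` loop means paths containing spaces are not split apart.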
 

MKTEMP(1)                 BSD General Commands Manual                 MKTEMP(1)

NAME
     mktemp -- make temporary file name (unique)

SYNOPSIS
     mktemp [-d] [-q] [-u] template

DESCRIPTION
     The mktemp utility takes the given file name template and overwrites a portion of it to create a file name. This file name is unique and suitable for use by the application. The template may be any file name with at least 6 of 'Xs' appended to it, for example /tmp/temp.XXXXXX. The trailing 'Xs' are replaced with the current process number and/or a unique letter combination. The number of unique file names mktemp can return depends on the number of 'Xs' provided; six 'Xs' will result in mktemp testing roughly 26 ** 6 combinations.

     If mktemp can successfully generate a unique file name, the file is created with mode 0600 (unless the -u flag is given) and the filename is printed to standard output.

OPTIONS
     The available options are as follows:

     -d      Make a directory instead of a file.

     -q      Fail silently if an error occurs. This is useful if a script does not want error output to go to standard error.

     -u      Operate in ``unsafe'' mode. The temp file will be unlinked before mktemp exits. This is slightly better than mktemp(3) but still introduces a race condition. Use of this option is not encouraged.

RETURN VALUES
     The mktemp utility exits with a value of 0 on success, and 1 on failure.

EXAMPLES
     The following sh(1) fragment illustrates a simple use of mktemp where the script should quit if it cannot get a safe temporary file.

           TMPFILE=`mktemp /tmp/$0.XXXXXX` || exit 1
           echo "program output" >> $TMPFILE

     In this case, we want the script to catch the error itself.

           TMPFILE=`mktemp -q /tmp/$0.XXXXXX`
           if [ $? -ne 0 ]; then
                   echo "$0: Can't create temp file, exiting..."
                   exit 1
           fi

     Note that one can also check to see that $TMPFILE is zero length instead of checking $?. This would allow the check to be done later on in the script (since $? would get clobbered by the next shell command).

SEE ALSO
     mkstemp(3), mktemp(3)

HISTORY
     The mktemp utility appeared in OpenBSD.

BSD                            November, 20, 1996                            BSD
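
The examples in the manual page only cover temporary files. As a rough sketch of how the -d option might be used in a script, the function below creates a private scratch directory, hands it to a command, and removes it afterwards. The function name `with_scratch_dir` and the `scratch.XXXXXX` template are made up for illustration.

```shell
#!/bin/sh
# Sketch only: with_scratch_dir and the "scratch" template are illustrative names.

# with_scratch_dir CMD [ARGS...]: run CMD with a freshly created scratch
# directory appended as its last argument, then remove the directory.
# The exit status of CMD is passed through to the caller.
with_scratch_dir() {
    dir=$(mktemp -d "${TMPDIR:-/tmp}/scratch.XXXXXX") || return 1
    "$@" "$dir"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

For example, `with_scratch_dir ls -ld` lists the directory while it exists; once the function returns, the directory is gone again.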