Finding duplicate files in two base directories
Post 302918032 by migurus on Friday 19th of September 2014 09:09:58 PM
You can use SHA-1 checksums to identify identical files. Below is a script that finds and lists identical files in two different directories (it only looks at *.c and *.h files):

Code:
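DIR1=${1};
DIR2=${2};
TMP1=$(mktemp);
TMP2=$(mktemp);
# remove the temp files on exit; make signals end the script so cleanup runs
trap 'rm -f "$TMP1" "$TMP2"' EXIT
trap 'exit 1' HUP INT QUIT TERM

# hash every *.c / *.h file; find -exec copes with spaces in file names
find "$DIR1" -type f -name "*.[ch]" -exec shasum {} + >> "$TMP1";
find "$DIR2" -type f -name "*.[ch]" -exec shasum {} + >> "$TMP2";

# a hash that occurs more than once means identical content
cat "$TMP1" "$TMP2"|cut -d' ' -f1|sort|uniq -c|
 awk '{if($1>1)print $2;}'|
while read sha;
do
        # shasum prints a 40-character hash, two separator characters, then the path
        grep -h "^$sha" "$TMP1" "$TMP2" | cut -c43-;
        echo;
done
exit 0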

I added the
Code:
echo;

only to separate groups of identical files with an empty line, for readability.

This is quick and dirty, with no error checking etc.; it is just meant to illustrate the idea.
 
