Find duplicates among 2 directories


 
# 8  
Old 02-25-2019
Quote:
Originally Posted by drew77
...
Code:
diff /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04 /media/andy/MAXTOR_SDB1/Linux_File

Code:
Only in /media/andy/MAXTOR_SDB1/Linux_Files: Briggs_Stratton_Generator.zip
Only in /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04: Brinkmann_8109415-W.zip
Only in /media/andy/MAXTOR_SDB1/Linux_Files: Brother_2240_Drivers.zip

I do not know what "Only in" means.
...
Isn't that pretty clear, and more than easily verifiable?
I'd propose the interpretation that Brinkmann_8109415-W.zip is available in /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04 AND NOT in /media/andy/MAXTOR_SDB1/Linux_Files.
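
One way to verify this reading is a small experiment with two throwaway directories. This is only a sketch: dir_a, dir_b and the file names are made up for the illustration and have nothing to do with the drive above. A file that exists in both directories with the same content is not mentioned at all, while a file found in just one of them is reported as "Only in".

```shell
# Build two scratch directories (hypothetical names) under a temporary location.
tmp=$(mktemp -d)
mkdir "$tmp/dir_a" "$tmp/dir_b"

# common.txt exists in both directories with identical content.
echo "same content" > "$tmp/dir_a/common.txt"
echo "same content" > "$tmp/dir_b/common.txt"

# These two files each exist in one directory only.
echo "a" > "$tmp/dir_a/only_a.txt"
echo "b" > "$tmp/dir_b/only_b.txt"

# diff exits with status 1 when it finds differences, hence the "|| true".
# LC_ALL=C keeps the messages in English.
report=$(LC_ALL=C diff "$tmp/dir_a" "$tmp/dir_b" || true)
printf '%s\n' "$report"

# Remove the scratch directories again.
rm -rf "$tmp"
```

The output should contain one "Only in" line for only_a.txt and one for only_b.txt, and no line for common.txt.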

Last edited by RudiC; 02-26-2019 at 09:59 AM.. Reason: typo
# 9  
Old 02-25-2019
Code:
fdupes /media/andy/MAXTOR_SDB1/Linux_Files/ /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/


/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/Send_Email_Via_Command_Line.zip
/media/andy/MAXTOR_SDB1/Linux_Files/Send_Email_Via_Command_Line.zip

/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/My_Sounds.zip
/media/andy/MAXTOR_SDB1/Linux_Files/My_Sounds.zip

/media/andy/MAXTOR_SDB1/Linux_Files/Efax-gtk_Setup_IMPT.zip
/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/Efax-gtk_Setup_IMPT.zip

/media/andy/MAXTOR_SDB1/Linux_Files/SDB1_Maxtor_Drive
/media/andy/MAXTOR_SDB1/Linux_Files/MAXTOR_SDB1
/media/andy/MAXTOR_SDB1/Linux_Files/NEVER_DELETE_THIS_DIRECTORY
/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/2019-02-23_23:42
/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/2019-02-25_03:47

/media/andy/MAXTOR_SDB1/Linux_Files/multi-timer.zip
/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/multi-timer.zip


Is this a list of what files are identical, both in file name and content?
# 10  
Old 02-25-2019
Yes. You can also add the -S (size) option, which shows the size of the duplicate files, to confirm that there is no confusion.

--- Post updated at 15:05 ---

The diff utility should be silent when given such files:
Code:
diff \
/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/My_Sounds.zip \
/media/andy/MAXTOR_SDB1/Linux_Files/My_Sounds.zip
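
For archives such as .zip files, a byte-by-byte comparison with cmp may be the more natural check. The following is a minimal sketch, assuming a POSIX shell; same_content is a hypothetical helper name, not part of any tool mentioned in this thread.

```shell
# same_content is a hypothetical helper name.
# cmp -s compares two files byte by byte and prints nothing:
# exit status 0 means identical, 1 means different, 2 means an error.
same_content() {
    cmp -s "$1" "$2"
}

if same_content \
    /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/My_Sounds.zip \
    /media/andy/MAXTOR_SDB1/Linux_Files/My_Sounds.zip
then
    echo "identical"
else
    echo "different (or one of the files could not be read)"
fi
```

Because only the exit status is used, this works the same way for text and binary files.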

# 11  
Old 02-25-2019
Quote:
Originally Posted by drew77
...
Is this a list of what files are identical, both in file name and content?

It doesn't REALLY seem so, does it? Just from looking at it, and trying to apply some logic and common sense, I'd say SDB1_Maxtor_Drive, MAXTOR_SDB1, and NEVER_DELETE_THIS_DIRECTORY are missing in /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/, and 2019-02-23_23:42 and 2019-02-25_03:47 are missing in /media/andy/MAXTOR_SDB1/Linux_Files/.
The other filenames are paired and seem to exist in both directories, but cannot be considered equal based on the info known thus far.

Last edited by RudiC; 02-25-2019 at 08:19 AM.. Reason: Typo
# 12  
Old 02-25-2019
To see which files would be deleted:
fdupes -f
This displays every set of duplicates without the first file of each set.
To actually delete the files in that list, use
fdupes -Nd
But in your case I would go the interactive way, at least until you are acquainted with the subtleties of this tool.
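
Put together, a cautious workflow could look like the sketch below. It uses only the fdupes options -S (show size), -f (omit the first file of each set), -d (delete, prompting) and -N (no prompt, together with -d); preview_duplicates is a hypothetical helper name, and the function itself never deletes anything.

```shell
# preview_duplicates is a hypothetical helper name; it only reads, never deletes.
# $1 and $2 are the two directories to compare.
preview_duplicates() {
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    # Every set of identical files, with the size of the files in each set.
    fdupes -S "$1" "$2"
    echo "--- listing without the first file of each set ---"
    # -f omits the first file of each set, which is the one -Nd preserves.
    fdupes -f "$1" "$2"
}

# Example call with the directories from this thread:
#   preview_duplicates /media/andy/MAXTOR_SDB1/Linux_Files /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04
# Interactive deletion, asking which file to keep in each set:
#   fdupes -d /media/andy/MAXTOR_SDB1/Linux_Files /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04
```

Reviewing the second listing before running any delete command should make it clear which copy in each pair would go.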
Good luck

--- Post updated at 15:28 ---

Quote:
Originally Posted by nezabudka
The diff utility should be silent when given such files:
But that is not a given when you compare binary files. The fact is that the "fdupes" utility also works with binary files, while "diff" is primarily a text tool.
# 13  
Old 02-25-2019
Quote:
Originally Posted by nezabudka
Yes. You can also add the -S (size) option, which shows the size of the duplicate files, to confirm that there is no confusion.

--- Post updated at 15:05 ---

The diff utility should be silent when given such files:
Code:
diff \
/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/My_Sounds.zip \
/media/andy/MAXTOR_SDB1/Linux_Files/My_Sounds.zip


Since my directories contain binaries, and diff is designed for text files, it would not help me much.

--- Post updated at 10:47 AM ---

Quote:
Originally Posted by RudiC
It doesn't REALLY seem so, does it? Just from looking at it, and trying to apply some logic and common sense, I'd say SDB1_Maxtor_Drive, MAXTOR_SDB1, and NEVER_DELETE_THIS_DIRECTORY are missing in /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/, and 2019-02-23_23:42 and 2019-02-25_03:47 are missing in /media/andy/MAXTOR_SDB1/Linux_Files/.
The other filenames are paired and seem to exist in both directories, but cannot be considered equal based on the info known thus far.

The reason for my post is this.

I use Clonezilla to make images of my main drive to a 2nd, older drive. I also make images of my 2nd drive to my main drive. That uses a lot more space than simply copying the files to my main drive, so I will make a copy script for that.


Code:
rsync --progress -r -u /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/* /home/andy/Ubuntu_18.04_Programs/


Last edited by drew77; 02-25-2019 at 03:10 PM..
# 14  
Old 02-26-2019
Quote:
Originally Posted by drew77

Code:
rsync --progress -r -u /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/* /home/andy/Ubuntu_18.04_Programs/

If you want to synchronise two directories/filesystems the way you have now described your goal, then rsync is the way to go. rsync was built for exactly this purpose. You don't even need to check anything beforehand, because the tool will do that itself and simply do nothing if there is nothing to do (that is, if the directories are already in sync).
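
If you would like to see what rsync intends to do before it touches anything, a dry run is one option. The following is a minimal sketch: sync_preview is a hypothetical helper name, and it keeps the -r and -u options from the command quoted above while adding --dry-run and --itemize-changes. A trailing slash on the source means "the contents of this directory", which also picks up hidden files that the /* glob would skip.

```shell
# sync_preview is a hypothetical helper name; it never copies anything.
# $1 is the source directory, $2 the destination directory.
sync_preview() {
    if ! command -v rsync >/dev/null 2>&1; then
        echo "rsync is not installed" >&2
        return 127
    fi
    # --dry-run: only report, do not transfer.
    # --itemize-changes: print one line per file that would be updated.
    # "${1%/}/" normalises the argument to exactly one trailing slash.
    rsync --dry-run --itemize-changes -r -u "${1%/}/" "${2%/}/"
}

# Example call with the directories from this thread:
#   sync_preview /media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04 /home/andy/Ubuntu_18.04_Programs
```

If the listing looks right, running the same rsync command without --dry-run performs the copy.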

I hope this helps.

bakunin