Post 302362236 by procreator on Thursday 15th of October 2009 10:33:06 AM (UNIX for Dummies Questions & Answers)
Delete duplicate files from one of two directory structures

Hello everyone,

I have been struggling to clean up a backup mess I created when I manually duplicated a directory structure and then worked in both copies.
The two structures are now significantly different and contain on the order of 15,000 files, most of which are duplicates.
I am now trying to merge those directories and have looked at FSlint, Meld, diff, and fdupes.
While all of those are good tools, none of them does quite what I need, so I am looking for a way to reduce the manual work to a minimum by deleting duplicates from the second directory structure. I will have to sort and merge the remaining files by hand.
The closest I have come is fslint's findup (Linux.com :: Tidy up your filesystem with FSlint), which returns a list of duplicate files with the groups separated by empty lines.
Since a group of duplicates may also lie entirely within one of the two directory structures, I cannot be sure that a file I delete from dir2 is also present in dir1.
I can't make myself really clear, I'm afraid, so here's an example:

dir1/path/dup1
dir2/somepath/dup1 <-- delete
dir2/path/to/dup1 <-- delete
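
To make the rule in the example concrete, here is a minimal sketch in Python that filters a findup-style listing. It assumes the listing is plain text with one path per line and duplicate groups separated by empty lines, as described above; the function names and the `keep_root`/`purge_root` parameters are made up for illustration and are not part of FSlint. A dir2 path is reported only when its group also contains at least one path under dir1.

```python
from pathlib import PurePosixPath
from typing import List


def is_under(path: str, root: str) -> bool:
    """True if path lies inside root, compared by path component, not by string prefix."""
    return PurePosixPath(root) in PurePosixPath(path).parents


def split_groups(listing: str) -> List[List[str]]:
    """Split a listing into groups of paths; empty lines separate the groups."""
    groups, current = [], []
    for line in listing.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def deletable_from_groups(listing: str, keep_root: str, purge_root: str) -> List[str]:
    """Return paths under purge_root whose duplicate group also has a copy under keep_root."""
    deletable = []
    for group in split_groups(listing):
        # Only touch a group if a copy survives in the directory we keep.
        if any(is_under(p, keep_root) for p in group):
            deletable.extend(p for p in group if is_under(p, purge_root))
    return deletable
```

The sketch only returns the list of candidates; nothing is deleted, so the result can be reviewed before it is passed to rm. Groups that live entirely inside dir2 are left alone.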

One approach might be to first delete the duplicates within dir2 and afterwards compare what remains with dir1.
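
A different route that skips the intermediate listing is to compare file contents directly. The following is a rough sketch, not a tested tool: it hashes every regular file under the directory to keep, then walks the second directory and collects each file whose content hash also occurs in the first. The function names, the choice of SHA-256, and the `dry_run` flag are assumptions made for this example. It also assumes the two trees do not overlap, and note that all empty files hash identically and would therefore be treated as duplicates of each other.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def iter_files(root):
    """Yield every regular file below root, skipping symbolic links."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def purge_duplicates(keep_root, purge_root, dry_run=True):
    """Find files under purge_root whose content also exists under keep_root.

    With dry_run=True (the default) nothing is deleted; the matching
    paths are only returned so they can be checked first.
    """
    kept_digests = {file_digest(p) for p in iter_files(keep_root)}
    matches = []
    for path in iter_files(purge_root):
        if file_digest(path) in kept_digests:
            matches.append(path)
            if not dry_run:
                os.remove(path)
    return sorted(matches)
```

Calling `purge_duplicates("dir1", "dir2")` first and reading the returned list, and only then repeating the call with `dry_run=False`, keeps the destructive step explicit. Files that are duplicated only within dir2 are not touched, because a file is removed only when a copy with the same content exists under dir1.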

Any help appreciated!
 
