Remove Duplicate Filenames in 2 very large directories
Hello Gurus,
O/S RHEL4
I have a requirement to compare two Linux-based directories for duplicate filenames and remove them. These directories are close to 2 TB each. I have tried running a:
I have tried this as well:
I wanted to capture the output of the above command in a variable and use it for the deletion. This approach does not work, and the machine's load goes too high for production. I have also thought of trying an rsync with the delete flag, but I am unsure whether this will compare both directories successfully.
Can someone please point me in the right direction as to what commands or scenarios will best accomplish my task?
I have also tried to google this on unix.com as well as the web.
Your support and assistance are greatly appreciated.
Jaysunn
Last edited by jaysunn; 09-24-2009 at 11:56 AM..
Reason: Added O/S
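One possible way to approach the question above, as a minimal sketch rather than a tested production script: build a sorted list of relative paths for each tree once, intersect the two lists with comm, and only then decide what to delete. The function names and the /path/to/DIR_A and /path/to/DIR_B placeholders are hypothetical. The sketch assumes that "duplicate" means "same relative path in both trees" (file contents are not compared) and that no filename contains a newline.

```shell
# Sketch only: names and paths here are illustrative assumptions, not from the thread.

list_files() {
    # Print every regular file under $1, relative to $1, in byte-sorted order.
    ( cd "$1" && find . -type f | LC_ALL=C sort )
}

common_files() {
    # Print the relative paths that exist in both $1 and $2.
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    list_files "$1" > "$tmp_a"
    list_files "$2" > "$tmp_b"
    # comm -12 keeps only the lines present in both sorted inputs.
    LC_ALL=C comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

remove_common() {
    # $1 = reference tree, $2 = tree to clean, $3 = "delete" to really remove.
    # Without "delete" this is a dry run that only prints what would be removed.
    common_files "$1" "$2" | while IFS= read -r rel; do
        if [ "${3:-}" = "delete" ]; then
            rm -f -- "$2/$rel"
        else
            echo "would remove: $2/$rel"
        fi
    done
}

# Example (dry run first, then the real deletion):
#   remove_common /path/to/DIR_A /path/to/DIR_B
#   remove_common /path/to/DIR_A /path/to/DIR_B delete
```

Each tree is walked only once and the comparison works on two text files, which may be gentler on a production machine than comparing file by file. Running the script under nice is another option to consider if load is still a concern.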
Hi!
I have thousands of sub-directories, and hundreds of thousands of files in them. What is the fastest way to find out which files are older than a certain date? Is the "find" command the fastest? Or is there some other way?
Right now I have a C script that traverses through and checks... (5 Replies)
I'm trying to put together a shell script that will append specific prefixes based on the content of filenames. I think I have this part down. However, I want to append before that part a process that will remove the current prefix before it renames the files with the new prefix.
For example,... (6 Replies)
I have a problem where tar is somehow creating duplicate filenames when tarring a directory. Doing an ls on the directory does not show any duplicate filenames, yet when the directory is tarred, you can see that there are duplicates:
bash-2.03# pwd
/var/log/cricket
bash-2.03# ls -1 | sort |... (2 Replies)
Hi,
I have a file that lists users, like this:
filename -> "readfile", with the following entries:
peter
john
alaska
abcd
xyz
and I have the directory /var/
I want to first cat "readfile" line by line, read peter into a variable first, and also cross-check against /var/ how many directories are available... (8 Replies)
I have a large list of filenames from an Excel sheet, which I then translate into a simple text file. I'd like to use this list, which contains various file extensions, to archive these files and then remove them recursively through multiple directories and subdirectories. So far, it looks like... (5 Replies)
Below is the script to rename filenames ending with the .pdf extension.
I want the script to enter directories and search for all PDFs, and then, if a name is in the format file_amb_2008.pdf, change it to 2008_amb_file.pdf. The script should work only for .pdf files.
Help required to make the... (12 Replies)
Hi Gurus,
Do any kind souls happen to have a script like the one mentioned here?
Find and compare filenames in different mount point and remove duplicates.
Thanks a million!!!
wanna13e (7 Replies)
I have noticed that the same folder (and contents) lives in
/u/public and /usr/public
Question: was this put here intentionally or by accident?
It's 31 GB in size, and on a 72 GB HDD that leaves little room for apps.
It is a network shared drive for all to access e.g. p: points to... (0 Replies)
Hi,
I have files like the ones below, whose names contain spaces. Before transferring those files to the FTP server, I want to remove the spaces so that I can then transfer the files to the Unix server.
e.g: filenames are
1) SHmail _profile001_20120908.txt
2) SHmail_profile001 _20120908.txt
3) sh... (3 Replies)
Is there a way via some bash script or just cmd to find duplicate directories?
I have main folders:
TEST1
TEST2
Folder TEST1 contains some of the same folders as folder TEST2.
Can this be done? I tried fdupe, but it only searches for duplicate files, not whole dirs.
thx! (8 Replies)
Discussion started by: ZerO13
LEARN ABOUT MINIX
diff
DIFF(1) General Commands Manual DIFF(1)
NAME
diff - print differences between two files
SYNOPSIS
diff [-c | -e | -C n] [-br] file1 file2
OPTIONS
-C n Produce output that contains n lines of context
-b Ignore white space when comparing
-c Produce output that contains three lines of context
-e Produce an ed-script to convert file1 into file2
-r Apply diff recursively to files and directories of the same name, when file1 and file2 are both directories
EXAMPLES
diff file1 file2 # Print differences between 2 files
diff -C 0 file1 file2 # Same as above
diff -C 3 file1 file2 # Output three lines of context with every difference encountered
diff -c file1 file2 # Same
diff /etc /dev # Compares recursively the directories /etc and /dev
diff passwd /etc # Compares ./passwd to /etc/passwd
DESCRIPTION
Diff compares two files and generates a list of lines telling how the two files differ. Lines may not be longer than 128 characters. If
the two arguments on the command line are both directories, diff recursively steps through all subdirectories comparing files of the same
name. If a file name is found only in one directory, a diagnostic message is written to stdout. A file that is of either block special,
character special or FIFO special type, cannot be compared to any other file. On the other hand, if there is one directory and one file
given on the command line, diff tries to compare the file with the same name as file in the directory directory.
SEE ALSO
cdiff(1), cmp(1), comm(1), patch(1).
DIFF(1)
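As an illustration of the directory behaviour described in the man page above, the following minimal sketch runs a recursive diff over two trees and keeps only the diagnostics for names found in just one of them. The function name only_in_one is hypothetical. The sketch assumes GNU diff, whose diagnostic has the form "Only in DIR: NAME"; the MINIX implementation documented above may word the message differently.

```shell
# Sketch only: assumes GNU diff and its "Only in DIR: NAME" diagnostic.

only_in_one() {
    # LC_ALL=C keeps the diagnostic text untranslated so the grep pattern matches.
    # diff exits with status 1 when the trees differ; that is expected here.
    LC_ALL=C diff -r "$1" "$2" | grep '^Only in '
}

# Example:
#   only_in_one /path/to/DIR_A /path/to/DIR_B
```

A recursive diff also compares the contents of every file that exists in both trees, so on very large directories it can be far more expensive than comparing lists of filenames.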