Help with selecting files from "diff" output
Post 302823095 by drl on Tuesday 18th of June 2013 07:25:58 PM
Hi.

There are standard (Linux) utilities to compare directories. Whether they will fulfill all your desires is up to you to find out.

See the thread "find same size file" for a demonstration of fdupes and rdfind.

Best wishes ... cheers, drl

Last edited by drl; 06-18-2013 at 08:31 PM..
 

rdfind(1)							      rdfind								 rdfind(1)

NAME
rdfind - finds duplicate files

SYNOPSIS
rdfind [ options ] directory1 | file1 [ directory2 | file2 ] ...

DESCRIPTION
rdfind finds duplicate files across and/or within several directories. It calculates a checksum only if necessary. rdfind runs in O(N log(N)) time with N being the number of files.

If two (or more) equal files are found, the program decides which of them is the original and the rest are considered duplicates. This is done by ranking the files against each other and deciding which has the highest rank. See section RANKING for details.

If you need better control over the ranking than given, you can use some preprocessor which sorts the file names in the desired order and then run the program using xargs. See the examples below for how to use find and xargs in conjunction with rdfind.

To include files or directories that have names starting with -, use rdfind ./- to not confuse them with options.

RANKING
Given two or more equal files, the one with the highest rank is selected to be the original and the rest are duplicates. The rules of ranking are given below, where the rules are executed from the start until an original has been found. Given two files A and B which have equal content, the ranking is as follows:

1. If A was found while scanning an input argument earlier than B, A is higher ranked.
2. If A was found at a depth lower than B, A is higher ranked (A closer to the root).
3. If A was found earlier than B, A is higher ranked.

The last rule is needed when two files are found in the same directory (obviously not given in separate arguments, otherwise the first rule applies) and gives the same order between the files as the operating system delivers the files while listing the directory. This is operating system specific behaviour.

OPTIONS
Searching options etc:

-ignoreempty true|false
    Ignore empty files. (default)
-followsymlinks true|false
    Follow symlinks. Default is false.
-removeidentinode true|false
    Removes items found which have identical inode and device ID. Default is true.
-checksum md5|sha1
    What type of checksum to be used: md5 or sha1. Default is md5.

Action options:

-makesymlinks true|false
    Replace duplicate files with symbolic links.
-makehardlinks true|false
    Replace duplicate files with hard links.
-makeresultsfile true|false
    Make a results file results.txt (default) in the current directory.
-outputname name
    Make the results file name to be "name" instead of the default results.txt.
-deleteduplicates true|false
    Delete (unlink) files.

General options:

-sleep Xms
    Sleeps X milliseconds between reading each file, to reduce load. Default is 0 (no sleep). Note that only a few values are supported at present: 0,1-5,10,25,50,100 milliseconds.
-n, -dryrun
    Displays what should have been done, don't actually delete or link anything.
-h, -help, --help
    Displays brief help message.
-v, -version, --version
    Displays version number.

EXAMPLES
Search for duplicate files in home directory and a backup directory:
    rdfind ~ /mnt/backup

Delete duplicates in a backup directory:
    rdfind -deleteduplicates true /mnt/backup

Search for duplicate files in directories called foo:
    find . -type d -name foo -print0 | xargs -0 rdfind

FILES
results.txt (the default name is results.txt and can be changed with option outputname, see above)

The results file results.txt will contain one row per duplicate file found, along with a header row explaining the columns. A text describes why the file is considered a duplicate:

DUPTYPE_UNKNOWN
    some internal error
DUPTYPE_FIRST_OCCURRENCE
    the file that is considered to be the original.
DUPTYPE_WITHIN_SAME_TREE
    files in the same tree (found when processing the directory in the same input argument as the original)
DUPTYPE_OUTSIDE_TREE
    the file is found during processing another input argument than the original.

ENVIRONMENT

DIAGNOSTICS

EXIT VALUES
0 on success, nonzero otherwise.

BUGS/FEATURES
When specifying the same directory twice, it keeps the first encountered as the most important (original), and the rest as duplicates. This might not be what you want.

Symlink creation (-makesymlinks) produces absolute links. This might not be what you want. To create relative links instead, you may use the symlinks (2) command, which is able to convert absolute links to relative links.

Older versions unfortunately contained a misspelling of the word occurrence. This is now corrected (since 1.3), which might affect user scripts parsing the output file written by rdfind.

There are lots of enhancements left to do. Please contribute!

SECURITY CONSIDERATIONS
Avoid manipulating the directories while rdfind is reading. rdfind is quite brittle in that case. Especially, when deleting or making links, rdfind can be subject to a symlink attack. Use with care!

AUTHOR
Paul Dreik 2006, reachable at rdfind@pauldreik.se

Rdfind can be found at http://rdfind.pauldreik.se/

Do you find rdfind useful? Drop me a line! It is always fun to hear from people who actually use it and what data collections they run it on.

THANKS
Several persons have helped with suggestions and improvements: Niels Moller, Carl Payne and Salvatore Ansani. Thanks also to you who tested the program and sent me feedback.

VERSION
1.3.1 (release date 2012-05-07) svn id: $Id: rdfind.1 766 2012-05-07 17:26:17Z pauls $

COPYRIGHT
This program is distributed under GPLv2 or later, at your option.

SEE ALSO
md5sum(1), find(1), symlinks(2)

May 2012                                1.3.1                                rdfind(1)
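
The three rules in the RANKING section can be read as a lexicographic sort key. The sketch below illustrates that reading in Python; it is not rdfind's actual implementation, and the FoundFile record and its field names are hypothetical, introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoundFile:
    """Hypothetical record for one file seen during a scan."""
    path: str
    arg_index: int    # position of the input argument the file was found under
    depth: int        # directory depth below that argument (smaller = closer to root)
    found_order: int  # order in which the operating system delivered the file


def rank_key(f: FoundFile):
    # Rules are applied in order: earlier input argument, then lower depth,
    # then earlier discovery. A smaller key means a higher rank.
    return (f.arg_index, f.depth, f.found_order)


def pick_original(equal_files):
    """Split files with equal content into (original, duplicates)."""
    ranked = sorted(equal_files, key=rank_key)
    return ranked[0], ranked[1:]
```

Under this reading, running rdfind ~ /mnt/backup would give every file below ~ a smaller arg_index than any file below /mnt/backup, so the copy in the home directory is treated as the original by the first rule alone.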
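
The DESCRIPTION section states that a checksum is calculated only if necessary. One common way to get that effect is to group files by size first and hash only files that share a size with at least one other file. The Python sketch below shows that idea; it is an assumption about the general technique, not a description of rdfind's internals, and the names file_md5 and find_duplicates are hypothetical. MD5 is used because it is the documented default for -checksum, and empty files and symlinks are skipped to mirror the documented defaults of -ignoreempty and -followsymlinks.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots):
    """Return lists of paths that share both size and MD5 digest."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path):
                    continue  # symlinks are not followed in this sketch
                size = os.path.getsize(path)
                if size == 0:
                    continue  # empty files are ignored in this sketch
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, so no checksum is needed
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_md5(path)].append(path)
        groups.extend(g for g in by_digest.values() if len(g) > 1)
    return groups
```

Files whose size is unique are never opened, which is where the saving comes from on collections with few same-sized files.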