Finding duplicate files in two base directories Post: 302917943

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Finding executable files in all directories

This is probably very easy but I would like to know a way to list all my files in all my directories that are readable and executable to everyone. I was told to use find or ls and I tried some stuff but couldnt get it to work. I understand that its dangerous to have files with these permissions for...

2. Shell Programming and Scripting

finding duplicate files by size and finding pattern matching and its count

Hi, I have a challenging task,in which i have to find the duplicate files by its name and size,then i need to take anyone of the file.Then i need to open the file and find for more than one pattern and count of that pattern. Note:These are the samples of two files,but i can have more...

3. Shell Programming and Scripting

duplicate directories

Hi, I have file which users like filename ->"readfile", following entries peter john alaska abcd xyz and i have directory /var/ i want to do first cat of "readfile" line by line and first read peter in variable and also cross check with /var/ how many directories are avaialble...

4. UNIX for Dummies Questions & Answers

finding largest files (not directories)?

hello all. i would like to be able to find the names of all files on a remote machine using ssh. i only want the names of files, not directories so far i'm stuck at "du -a | sort -n" also, is it possible to write them to a file on my machine? i know how to write it to a file on that...

5. Shell Programming and Scripting

Finding Duplicate files

How do you delete and and find duplicate files?

6. Shell Programming and Scripting

Script for parsing directories one level and finding directories older than n days

Hello all, Here's the deal...I have one directory with many subdirs and files. What I want to find out is who is keeping old files and directories...say files and dirs that they didn't use since a number of n days, only one level under the initial dir. Output to a file. A script for...

7. UNIX for Dummies Questions & Answers

[Solved] Finding the Files In the Same Name Directories

Hi, In the Unix Box, I have a situation, where there is folder name called "Projects" and in that i have 20 Folders S1,S2,S3...S20. In each of the Folders S1,S2,S3,...S20 , there is a same name folder named "MP". So Now, I want to get all the files in all the "MP" Folders and write all those...

8. Shell Programming and Scripting

finding matches between multiple files from different directories

Hi all... Can somebody pls help me with this... I have a directory (dir1) which has many subdirectories(vr001,vr002,vr003..) with each subdir containing similar text file(say ras.txt). I have another directory(dir2) which has again got some subdir(vr001c,vr002c,vr003c..) with each subdir...

9. Shell Programming and Scripting

Finding non-existing words in a list of files in a directory and its sub-directories

Hi All, I have a list of words (these are actually a list of database table names separated by comma). Now, I want to find only the non-existing list of words in the *.java files of current directory and/or its sub-directories. Sample list of words:...

10. Shell Programming and Scripting

Finding files deep in directories

i need to find a portable way to go through multiple directories to find a file. I've trid something like this: find /opt/oracle/diag/*/alert_HH2.log -printordinarily, i can run the ls command and it will find it: /opt/oracle/diag/*/*/*/*/alert_HH2.log The problem with this approach is...

LEARN ABOUT DEBIAN

bup-margin

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME

       bup-margin - figure out your deduplication safety margin

SYNOPSIS

       bup margin [options...]

DESCRIPTION

       bup margin  iterates  through  all  objects  in	your  bup repository, calculating the largest number of prefix bits shared between any two
       entries.  This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45.  That  means  a  46-bit
       hash  would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
       its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects.  Since SHA-1 hashes have 160 bits,
       that  leaves 115 bits of margin.  Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
       with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see	if
       you're getting dangerously close to 160 bits.

OPTIONS

       --predict
	      Guess  the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
	      from the guess.  This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
	      don't use .midx files, use only .idx files.  This is only really useful when used with --predict.

EXAMPLE

	      $ bup margin
	      Reading indexes: 100.00% (1612581/1612581), done.
	      40
	      40 matching prefix bits
	      1.94 bits per doubling
	      120 bits (61.86 doublings) remaining
	      4.19338e+18 times larger is possible

	      Everyone on earth could have 625878182 data sets
	      like yours, all in one repository, and we would
	      expect 1 object collision.

	      $ bup margin --predict
	      PackIdxList: using 1 index.
	      Reading indexes: 100.00% (1612581/1612581), done.
	      915 of 1612581 (0.057%)

SEE ALSO

       bup-midx(1), bup-save(1)

BUP

       Part of the bup(1) suite.

AUTHORS

       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-															     bup-margin(1)

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Finding executable files in all directories

Discussion started by: CSGUY

2. Shell Programming and Scripting

finding duplicate files by size and finding pattern matching and its count

Discussion started by: jerome Sukumar

3. Shell Programming and Scripting

duplicate directories

Discussion started by: learnbash

4. UNIX for Dummies Questions & Answers

finding largest files (not directories)?

Discussion started by: user19190989