I have file1 and file2:
file1:
11 xxx kksd ...
22 kkk kdsglg...
33 sss kdfjdksa...
44 kdsf dskjfkas ...
hh kdkf kdkkd..
jg dkf dfkdk ...
...
file2:
jg
22
hh
...
I need to check each line of file1. if the field one is in file2, I will keep it; if not, the whole line will be... (17 Replies)
I have searched about 30 threads, a load of Google pages and cannot find what I am looking for. I have some of the parts but not the whole. I cannot seem to get the puzzle fit together.
I have three folders, two of which contain different versions of multiple files, dist/file1.php dist/file2.php... (4 Replies)
Hi, all:
I've got two folders, say, "folder1" and "folder2".
Under each, there are thousands of files.
It's quite obvious that there are some files missing in each. I just would like to find them. I believe this can be done by "diff" command.
However, if I change the above question a... (1 Reply)
I have four files, I need to compare these files together.
As such i know "sdiff and comm" commands but these commands compare 2 files together. If I use sdiff command then i have to compare each file with other which will increase the codes.
Please suggest if you know some commands whcih can... (6 Replies)
Please help me with awk.I have two files with the below details
file1
123456789 2012
987654321 2011
a1234567892012
a1234abcde2012
b1234567892012
c1234567892012
98765a12342012
file2
a1234
01234
b1234
33333
I need to check whether the items in file2 is present in file1 .If it is... (2 Replies)
I want to compare two files, and search for items that are in both. Then override the first file with that containing only elements which were in both files. I imagine something with diff, but not sure.
File 1
One
Two
Three
Four
Five
File 2
One
Three
Four
Six
Eight (2 Replies)
I have this code
awk 'NR==FNR{a=$1;next} a' file1 file2
which does what I need it to do, but for only two files. I want to make it so that I can have multiple files (for example 30) and the code will return only the items that are in every single one of those files and ignore the ones... (7 Replies)
hi all,
Thanks to all for your great help...
I have a scenario that I have two files (file1 & file2). I need to compare two files entire row by row and share the output if any discrepancies within two files.
File1:
DB1|TB1|C1,C3
DB2|TB2|C1,C2
DB3|TB3|C1,C2,C3,C4
File2:
... (2 Replies)
Discussion started by: Selva_2507
2 Replies
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS --predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO bup-midx(1), bup-save(1)BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown-bup-margin(1)