Hello,
I want to compare two files. All records in file 2 that are not in file 1 should be output to file 3.
For example:
file 1
123
1234
123456
file 2
123
2345
23456
file 3 should have
2345
23456
I have looked at diff, bdiff, cmp, comm, diff3 without any luck! (2 Replies)
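One possible approach, sketched here as an assumption rather than taken from the replies: let grep treat every line of file 1 as a fixed, whole-line pattern and print the lines of file 2 that match none of them. The function name `only_in_second` is made up for illustration.

```shell
# Print the lines of the second file that do not appear in the first file.
# only_in_second is a hypothetical helper name, not a standard command.
only_in_second() {
    # -F: patterns are fixed strings, not regular expressions
    # -x: a pattern must match the whole line (so 123 does not match 1234)
    # -v: print the lines that do NOT match
    # -f: read the patterns, one per line, from the first file
    grep -Fxv -f "$1" "$2"
}

# Usage: only_in_second file1 file2 > file3
```

With the sample data above this should leave 2345 and 23456 in file 3. The input files do not need to be sorted.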
I need to compare two different types of files and find the duplicates after comparing each type:
Type 1 file name is like: file1.abc
(the extension abc could be any 3 characters, but I can narrow it down or hardcode it for 10/15 combinations).
The other file is file1.bcd01abc (the extension... (2 Replies)
I want to compare 2 files and generate the 3 files below:
1. new lines
2. updated lines
3. deleted lines
Are there any one-liners for each of them?
Note: duplicates are identified by field 1; values are separated by '|'.
example:
test1 (older file)
1|XXX
2|YYY... (3 Replies)
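A rough sketch of how the three outputs could be produced with awk, keyed on field 1. The function names are hypothetical, and the sketch assumes each key appears at most once per file and that the older file is not empty (the `NR == FNR` idiom misbehaves on an empty first file).

```shell
# Each helper takes the older file first and the newer file second.
# All three names are made up for illustration.

# Lines whose key (field 1) exists only in the newer file.
new_lines() {
    awk -F'|' 'NR == FNR { old[$1]; next } !($1 in old)' "$1" "$2"
}

# Lines whose key exists in both files but whose content differs;
# the newer version of the line is printed.
updated_lines() {
    awk -F'|' 'NR == FNR { old[$1] = $0; next } ($1 in old) && old[$1] != $0' "$1" "$2"
}

# Lines whose key exists only in the older file.
# The newer file is read first here so its keys can be looked up.
deleted_lines() {
    awk -F'|' 'NR == FNR { cur[$1]; next } !($1 in cur)' "$2" "$1"
}

# Usage: new_lines test1 test2 > new.txt
```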
Hi,
Can anyone let me know what the difference is between
grep .* foo.c
grep '.*' foo.c
I am not able to understand what the exact difference is.
Thanks in advance (2 Replies)
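The difference comes from the shell, not from grep: an unquoted .* is a glob that the shell expands to matching file names before grep starts, while the quoted form reaches grep unchanged as a regular expression. A small sketch that makes this visible by echoing the arguments instead of running grep (the function name `glob_demo` and the scratch files are assumptions for the demo):

```shell
# Show what arguments grep would receive in each case.
# Runs in a subshell and a scratch directory so nothing else is touched.
glob_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch .profile foo.c
    # Unquoted: the shell replaces .* with the dot-file names it matches,
    # so the first name becomes the pattern and the rest become files to search.
    echo "unquoted:" grep .* foo.c
    # Quoted: the two characters .* are passed to grep as the regular expression.
    echo "quoted:" grep '.*' foo.c
    cd / && rm -rf "$dir"
)

glob_demo
```

Whether the unquoted expansion also includes . and .. depends on the shell and its settings, so the first line of output can vary between systems.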
Hi,
I'm having trouble with a script to copy one line out of multiple files in a directory and copy to a file called test. I've tried the code below but it copies one line out of the first file multiple times not one line out of all the files. Would someone help? I'm very new to all this.
... (8 Replies)
Hi everyone
I am trying to write a script to check if file systems are mounted, and also validate the permission; then do a whole bunch of other things. I am facing a problem with grep. For example, if the mountpoints are:
/dev/XYZ_lv /abc/XYZ jfs2 Nov 25 20:36... (4 Replies)
I have a need to grep a large number of files, but only display the first result from each file. I have tried to use grep, but am not limited to it. I can use perl and awk as well. Please help! (9 Replies)
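Since awk is an option, here is one way this might look: reset a flag at the start of each file and print only the first line that matches. The helper name `first_match_per_file` is hypothetical, and the pattern is treated as an awk regular expression (backslashes in it are subject to awk's escape processing because it is passed with -v).

```shell
# Print the first line matching a pattern from each of the given files.
# Usage: first_match_per_file PATTERN FILE...
first_match_per_file() {
    pattern=$1
    shift
    awk -v pat="$pattern" '
        FNR == 1 { found = 0 }        # first line of a new file: reset the flag
        !found && $0 ~ pat {          # first matching line in the current file
            print FILENAME ": " $0
            found = 1
        }
    ' "$@"
}
```

Where grep supports the -m option (GNU and BSD grep do, but it is not in POSIX), `grep -m 1 -H PATTERN FILE...` gives a similar result and stops reading each file after its first match.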
Hi Gurus,
I have two big files and need to find the differences between them. Currently, I am using
sort file1 > file1_tmp
sort file2 > file2_tmp
diff file1_tmp file2_tmp
I could also use the command
grep -v -f file1 file2
I am just wondering which way is faster for comparing two big files.
Thanks... (4 Replies)
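A third option worth timing on the real data is comm, which walks two sorted files once. The sketch below is an assumption, not a measured recommendation: the function name is made up, and LC_ALL=C is set so that sort and comm agree on a plain byte ordering.

```shell
# Print the lines that occur only in the second file, using sort and comm.
# sorted_only_in_second is a hypothetical helper name.
sorted_only_in_second() {
    a=$(mktemp) && b=$(mktemp) || return 1
    LC_ALL=C sort "$1" > "$a"
    LC_ALL=C sort "$2" > "$b"
    # -1 suppresses lines unique to the first file, -3 suppresses common lines
    LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage: sorted_only_in_second file1 file2
```

Note that `grep -v -f file1 file2` treats every line of file1 as a regular expression and also matches substrings; adding -F and -x restricts it to exact whole-line comparisons.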
Guys, I have 3 files,
but I want to compare and diff only the 2nd column
path=/home/whois/doms
for i in `cat domain.tx`
do
whois $i| sed -n '/Registry Registrant ID:/,/Registrant Email:/p' > $path/$i.registrant
whois $i| sed -n '/Registry Admin ID:/,/Admin Email:/p' > $path/$i.admin... (10 Replies)
Discussion started by: kenshinhimura
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
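To make the idea of "the largest number of prefix bits shared between any two entries" concrete, here is a small sketch that computes it for a list of hex ids read from standard input. It is an illustration only, not how bup margin is implemented: it assumes lowercase hex ids of equal length, one per line, and the function name is made up. After sorting, the pair with the longest shared prefix is always adjacent, so only neighbours are compared.

```shell
# Read hex ids (one per line) and print the longest bit prefix shared by any two.
# max_shared_prefix_bits is a hypothetical helper, not part of bup.
max_shared_prefix_bits() {
    LC_ALL=C sort | awk '
    function hexval(c) { return index("0123456789abcdef", c) - 1 }
    function shared_bits(a, b,    i, n, x, y, bits, mask) {
        n = (length(a) < length(b)) ? length(a) : length(b)
        for (i = 1; i <= n; i++) {
            x = hexval(substr(a, i, 1))
            y = hexval(substr(b, i, 1))
            if (x != y) {
                bits = (i - 1) * 4          # every earlier hex digit matched
                # count the leading bits the two differing digits still share
                for (mask = 8; mask >= 1; mask = mask / 2) {
                    if (int(x / mask) % 2 != int(y / mask) % 2) break
                    bits++
                }
                return bits
            }
        }
        return n * 4                        # identical up to the shorter length
    }
    NR > 1 { b = shared_bits(prev, $0); if (b > best) best = b }
    { prev = $0 }
    END { print best + 0 }'
}

# Usage: printf 'a0\nf0\na1\n' | max_shared_prefix_bits
```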
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
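The figures in the first example are consistent with simple arithmetic on the two inputs, 40 matching bits and 1612581 objects: bits per doubling is the matching bits divided by log2 of the object count, and the remaining bits are 160 minus the matching bits. The sketch below reproduces those two lines under that assumption; it is not taken from the bup source, and `margin_stats` is a made-up name.

```shell
# Recompute two of the summary lines from the matching prefix bits
# and the number of objects. Hypothetical helper, not part of bup.
margin_stats() {
    awk -v bits="$1" -v objects="$2" 'BEGIN {
        doublings_so_far  = log(objects) / log(2)   # log2 of the object count
        bits_per_doubling = bits / doublings_so_far
        remaining         = 160 - bits              # SHA-1 has 160 bits
        printf "%.2f bits per doubling\n", bits_per_doubling
        printf "%d bits (%.2f doublings) remaining\n", remaining, remaining / bits_per_doubling
    }'
}

margin_stats 40 1612581
```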
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown-bup-margin(1)