I'm trying to compare the first-column values in two different files that use a numerical value as the key, and output the more meaningful value found in the second column of file1 in front of the matching line(s) in file2. My problem is that file2 has multiple records. For example, given:
FILE1... (4 Replies)
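The lookup described in this question can be sketched as a small keyed join. This is only an illustration under stated assumptions: the files are whitespace-delimited, keys are compared as strings (so leading zeros matter), and the function name `prefix_with_labels` and the sample data are hypothetical, not taken from the thread.

```python
def prefix_with_labels(file1_lines, file2_lines):
    """Prefix each file2 line with the second-column value file1 has for its key."""
    labels = {}
    for line in file1_lines:
        fields = line.split()
        if len(fields) >= 2:
            # First column is the key, second column is the meaningful value.
            labels[fields[0]] = fields[1]

    out = []
    for line in file2_lines:
        fields = line.split()
        # file2 may hold several records per key; every match is kept.
        if fields and fields[0] in labels:
            out.append(f"{labels[fields[0]]} {line.rstrip()}")
    return out


if __name__ == "__main__":
    # Hypothetical sample data, not from the original post.
    file1 = ["101 apples", "102 pears"]
    file2 = ["101 x", "101 y", "103 z"]
    for row in prefix_with_labels(file1, file2):
        print(row)
```

Building the dictionary from file1 first means file2 is read only once, however many records share a key.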
I have two files - file1 and file2. I want the records in file2 that do not exist in file1. How can I grep for this?
eg:
file1
08941
08944
08945
08946
08947
file2
08942 08944 5
08942 08945 5
08942 08946 4
08942 08947 6
08942 08952 4
08942 08963 5
08942 ... (3 Replies)
Hi All,
I have two files say file1 and file2.
I want to check the number of records in file1, and if it is at least 2, I have to check the records in file2. If file2 has at least 1 record (i.e., it is not empty), I have to set some conditions.
Could you pls... (3 Replies)
I am not an expert in awk, sed, etc., but I really hope there is a way to do this, because I don't want to have to write a program. I am using C shell.
FILE 1 FILE 2
H0000000 H0000000
MA1 MA1
CA1DDDDDD CA1AAAAAA
MA2 ... (2 Replies)
I have two zip files, each with about 20 million records. File 2 has more records than file 1. I want to compare the records in both files and capture the new records from file 2 in another file, file3. Please help me with a command/script which provides me the desired... (8 Replies)
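One way to capture the records that appear only in the second file is a set lookup. The sketch below assumes that one record is one line, that the records of the first file fit in memory as a set, and that the lines have already been read out of the archives; the name `new_records` is hypothetical.

```python
def new_records(old_lines, new_lines):
    """Yield records from new_lines that never occur in old_lines."""
    # Hold the older file's records in a set for constant-time membership tests.
    seen = {line.rstrip("\n") for line in old_lines}
    for line in new_lines:
        record = line.rstrip("\n")
        if record not in seen:
            yield record


if __name__ == "__main__":
    # Hypothetical sample data standing in for file 1 and file 2.
    file1 = ["a\n", "b\n"]
    file2 = ["a\n", "c\n", "b\n", "d\n"]
    for record in new_records(file1, file2):
        print(record)
```

Because the second file is streamed rather than loaded, only the first file's records are held in memory. If even that is too large, sorting both inputs and merging them is the usual alternative.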
Hi ,
My requirement is to compare two files that have different numbers of columns and records, and get output containing all the non-matching records from File A (with all column values). Example data below:
File A contains the following:
Aishvarya |1234... (4 Replies)
Hi, I am using a Solaris system with ksh, and I am using nawk to get the records of file1 that are not in file2 (not a line-by-line comparison). The code I am using is nawk 'NR==FNR{a++} !a {print"line:" FNR"->" $0} ' file2 file1
The same command with awk runs perfectly on the Darwin kernel (Mac), but on Solaris it does line by... (2 Replies)
Hi, I want to compare the records in one file with those in three other files and print the records of file 1 that are not present in any of the other files. For example:
file1 file2 file3 file4
1 1 5 7
2 2 6 9
3
4
5
6
7
8
9
... (3 Replies)
Hi,
I am using Sun Solaris - SunOS. I have two fixed-width files, shown below. I am trying to find the changes in the records in Newfile.txt for the records where the key column matches. The first column is the key column (example: A123).
If there are any new or deleted records in the... (4 Replies)
Hi,
We have created a script that checks the latency of IIDR subscriptions by fetching details from a config file (that contains the subscription details) and running the CHCCLP command. The output is then concatenated into a csv file. Once all subscription details are saved the script sends a mail... (7 Replies)
Discussion started by: ab095
7 Replies
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
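The quantity described above, the largest number of leading bits shared by any two object ids, can be illustrated with a short sketch. This is not bup's implementation. It relies on the fact that, once equal-length digests are sorted, the pair with the longest common prefix is always adjacent, and the function names are hypothetical.

```python
import hashlib


def shared_prefix_bits(a: bytes, b: bytes) -> int:
    """Count the leading bits that two equal-length digests have in common."""
    bits = 0
    for x, y in zip(a, b):
        diff = x ^ y
        if diff == 0:
            bits += 8  # whole byte matches
            continue
        # Leading zero bits of the XOR are the bits that still match.
        bits += 8 - diff.bit_length()
        break
    return bits


def margin(digests) -> int:
    """Largest shared prefix, in bits, between any two distinct digests."""
    ordered = sorted(set(digests))
    # After sorting, only neighbouring digests need to be compared.
    return max(
        (shared_prefix_bits(a, b) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )


if __name__ == "__main__":
    # Hypothetical object ids: SHA-1 digests of small integers.
    ids = [hashlib.sha1(str(i).encode()).digest() for i in range(10000)]
    print(margin(ids), "matching prefix bits")
```

Sorting keeps the work at O(n log n) comparisons instead of checking every pair.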
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
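The figures printed by the first example are consistent with the arithmetic below. This is a sketch reconstructed from the sample output, not taken from bup's source. The function name is hypothetical, and the population constant of 6.7e9 is an assumption inferred from the "Everyone on earth" line.

```python
import math

HASH_BITS = 160     # size of a SHA-1 digest in bits
POPULATION = 6.7e9  # assumption: inferred from the sample output


def margin_report(objects: int, matching_bits: int) -> dict:
    """Reproduce the derived figures from an object count and a margin."""
    doublings = math.log2(objects)                # doublings needed to reach this count
    bits_per_doubling = matching_bits / doublings
    remaining_bits = HASH_BITS - matching_bits
    remaining_doublings = remaining_bits / bits_per_doubling
    growth = 2 ** remaining_doublings             # how many times larger the set could get
    return {
        "bits_per_doubling": bits_per_doubling,
        "remaining_bits": remaining_bits,
        "remaining_doublings": remaining_doublings,
        "growth": growth,
        "sets_per_person": growth / POPULATION,
    }


if __name__ == "__main__":
    report = margin_report(objects=1612581, matching_bits=40)
    print("%.2f bits per doubling" % report["bits_per_doubling"])
    print("%d bits (%.2f doublings) remaining"
          % (report["remaining_bits"], report["remaining_doublings"]))
    print("%g times larger is possible" % report["growth"])
```

The same subtraction gives the figure quoted in the description: a margin of 45 leaves 160 - 45 = 115 bits.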
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown- bup-margin(1)