Hi Everyone,
I have a flat file of 1000 unique records like the following, for example:
Andy,Flower,201-987-0000,12/23/01
Andrew,Smith,101-387-3400,11/12/01
Ani,Ross,401-757-8640,10/4/01
Rich,Finny,245-308-0000,2/27/06
Craig,Ford,842-094-8740,1/3/04
...
Now I want to duplicate... (9 Replies)
I want to remove records based on duplicates: if two or more records exist with the same combination of key fields, I want to remove all of them. Those records should not appear even once.
File abc.txt:
ABC;123;XYB;HELLO;
ABC;123;HKL;HELLO;
CDE;123;LLKJ;HELLO;
ABC;123;LSDK;HELLO;
CDF;344;SLK;TEST
key fields are... (7 Replies)
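A minimal two-pass awk sketch, assuming fields 1 and 2 form the key (the post is cut off before naming the key fields): the first pass counts each key, the second prints only lines whose key occurs exactly once.

    awk -F';' 'NR==FNR { cnt[$1 FS $2]++; next } cnt[$1 FS $2] == 1' abc.txt abc.txt

On the sample this keeps only the CDE and CDF lines, since the key ABC;123 occurs three times and is dropped entirely.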
Hi,
I have a file with these records
abc
xyz
xyz
pqr
uvw
cde
cde
In my o/p file, I want all the non-duplicate rows to be shown.
o/p:
abc
pqr
uvw
Any suggestions on how to do this?
Thanks for the help.
rs (2 Replies)
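Since the duplicate rows in the sample sit next to each other, uniq -u already does this; a two-pass awk handles scattered duplicates and keeps the original order (file is a placeholder name):

    uniq -u file                                              # adjacent duplicates only
    awk 'NR==FNR { cnt[$0]++; next } cnt[$0] == 1' file file  # any ordering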
Hi
I have a table which has 2 columns - id and amount.
If there are duplicate rows, i.e. the id and amount are the same, then I have to update the table in such a way that only one row keeps the amount and all other rows for that id become zero.
e.g.
id amount
1 100
1 100
2 200
1... (5 Replies)
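Treating the table as a whitespace-separated two-column text file, a minimal sketch: the first occurrence of each (id, amount) pair keeps its amount, and every repeat is zeroed. In an actual database the same idea would be an UPDATE driven by a per-group row number.

    awk 'seen[$1, $2]++ { $2 = 0 } 1' table.txt   # table.txt is a placeholder name

On the sample this turns the second "1 100" row into "1 0" and leaves "2 200" alone.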
Consider my input is
10
10
20
then,
uniq -u will give 20 and uniq -d will return 10.
But I need the output as:
10
10
How can we achieve this?
Thanks (4 Replies)
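GNU uniq has a flag for exactly this, and a portable two-pass awk does the same while preserving input order (file is a placeholder name):

    sort file | uniq -D                                      # GNU coreutils: print every copy of repeated lines
    awk 'NR==FNR { cnt[$0]++; next } cnt[$0] > 1' file file  # portable equivalent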
I have 2 files
"File 1" is delimited by ";" and "File 2" is delimited by "|".
File 1 below (3 records shown):
Doc1;03/01/2012;New York;6 Main Street;Mr. Smith 1;Mr. Jones
Doc2;03/01/2012;Syracuse;876 Broadway;John Davis;Barbara Lull
Doc3;03/01/2012;Buffalo;779 Old Windy Road;Charles... (2 Replies)
Hi,
I am working on a script that would remove records or lines in a flat file. The only difference between the lines is the "NOT NULL" keyword. Please see the example of the input file below.
INPUT FILE:>
CREATE a
(
TRIAL_CLIENT NOT NULL VARCHAR2(60),
TRIAL_FUND NOT NULL... (3 Replies)
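The post is truncated, so it is unclear whether the goal is to strip the phrase or to drop the whole lines; a sketch of both, assuming the DDL sits in input.sql (placeholder name):

    sed 's/ NOT NULL//g' input.sql   # strip the phrase, keep the lines
    grep -v 'NOT NULL' input.sql     # drop every line containing it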
Gents,
Please, how can I get only the last 2 records from repeated values in column 2?
Input:
1 1011
1 1011
1 1012
1 1012
1 5001
1 5001
1 5002
1 5002
1 5003
1 5003
1 7001
1 7001
1 7002
1 7002 (2 Replies)
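One reading, hedged since the expected output is not shown: keep only the last two occurrences of each distinct column-2 value. Reversing the file with GNU tac turns that into "the first two per value":

    tac input | awk 'cnt[$2]++ < 2' | tac   # input is a placeholder name

On the sample, where every value appears exactly twice, the output equals the input; with more repeats per value, only the final two survive.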
Gents,
I have a file which contains duplicate records in column 1, but the values in column 2 are different.
3099753489 3
3099753489 5
3101954341 12
3101954341 14
3102153285 3
3102153285 5
3102153297 3
3102153297 5
I would like to get something like this:
output desired... (16 Replies)
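The desired output is cut off; one guess is one line per key with all of its column-2 values collected, in input order (file is a placeholder name):

    awk '{
        if (!($1 in vals)) order[++n] = $1   # remember first-seen key order
        vals[$1] = vals[$1] " " $2           # append this value to the key
    } END { for (i = 1; i <= n; i++) print order[i] vals[order[i]] }' file

which would print, e.g., "3099753489 3 5" and "3101954341 12 14".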
Gents,
Please help me with the following file:
--BAD STATUS NOT RESHOOTED--
*** VP 41255/51341 in sw 2973
*** VP 41679/51521 in sw 2973
*** VP 41687/51653 in sw 2973
*** VP 41719/51629 in sw 2976
--BAD COG NOT RESHOOTED--
*** VP 41689/51497 in sw 2974
*** VP 41699/51677 in sw 2974
*** VP... (18 Replies)
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
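The core measurement can be sketched directly: over a sorted list of hex object ids, the largest prefix shared by any two ids is the largest prefix shared by some adjacent pair. A minimal version, assuming one id per line in ids.txt (placeholder name) and gawk for its xor() builtin:

    sort ids.txt | gawk '
        BEGIN { for (i = 0; i < 16; i++) hex[sprintf("%x", i)] = i }
        NR > 1 {
            bits = 0
            for (i = 1; i <= length($0); i++) {
                x = xor(hex[substr(prev, i, 1)], hex[substr($0, i, 1)])
                if (x == 0) { bits += 4; continue }        # whole nibble matches
                bits += (x < 2 ? 3 : x < 4 ? 2 : x < 8 ? 1 : 0)
                break                                      # first differing bit reached
            }
            if (bits > max) max = bits
        }
        { prev = $0 }
        END { print max, "matching prefix bits" }'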
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.