12-22-2009
Have you searched this forum for "matching columns"? There are many threads dealing with just this type of problem. My personal suggestion would be to use a database: it gives you fault-tolerant processing, and the large number of files is an additional reason to use one. Hope this helps.
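To make the database suggestion a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: the table names (file1, file2), the two-column key/value layout, the sample rows, and the helper names load_csv and match_on_key are all made up for the example, since the actual file format isn't shown here.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, text):
    """Load CSV text with (key, value) columns into a new table.

    Hypothetical helper: the two-column layout is an assumption.
    The table name is interpolated into SQL, so pass trusted names only.
    """
    conn.execute(f'CREATE TABLE "{table}" (key TEXT, value TEXT)')
    rows = [
        (r[0].strip(), r[1].strip())
        for r in csv.reader(io.StringIO(text))
        if len(r) >= 2  # skip blank or short lines instead of failing
    ]
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)


def match_on_key(conn, left, right):
    """Return (key, left value, right value) for keys present in both tables."""
    query = (
        f'SELECT l.key, l.value, r.value FROM "{left}" AS l '
        f'JOIN "{right}" AS r ON l.key = r.key ORDER BY l.key'
    )
    return conn.execute(query).fetchall()


# Made-up sample data standing in for two of the files.
conn = sqlite3.connect(":memory:")
load_csv(conn, "file1", "key1,abc\nkey2,def\nkey3,ghi\n")
load_csv(conn, "file2", "key2,zyx\nkey3,yuu\nkey4,xxx\n")

matched = match_on_key(conn, "file1", "file2")
print(matched)
```

With real files you would read each one from disk instead of from a string, and with many files it is probably simpler to load them all into one table with an extra column for the source file name, then let the database do the matching.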