08-20-2013
Last edited by rdcwayx; 08-21-2013 at 04:10 AM.
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi,
I have a file with 3 comma-separated columns and about 5000 lines. What I want to do is find the most common value in column 3 using awk or a shell script or whatever works! I'm totally stuck on how to do this.
e.g.
value1,value2,bob
value1,value2,bob... (12 Replies)
Discussion started by: Donkey25
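One way to attack this is a single awk pass that tallies column 3. A sketch using the post's sample rows; the filename data.csv is a placeholder:

```shell
# Sample rows in the post's format; "data.csv" is a placeholder filename.
printf 'value1,value2,bob\nvalue1,value2,bob\nvalue1,value2,ann\n' > data.csv

# Tally each column-3 value, then print the one seen most often.
awk -F',' '
    { count[$3]++ }
    END {
        for (v in count)
            if (count[v] > max) { max = count[v]; best = v }
        print best                     # prints: bob
    }
' data.csv
```

For 5000 lines, a `cut -d, -f3 | sort | uniq -c | sort -rn` pipeline would work just as well.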
2. Shell Programming and Scripting
Hi, I have two files, file1 and file2. I have to merge the columns of these two files into file3 based on their common column. To keep it simple:
file1:
Row-id name1
13456 Rahul
16789 Vishal
18901 Karan
file2 :
Row-id place
18901 Mumbai
... (2 Replies)
Discussion started by: manneni prakash
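A two-pass awk sketch on the post's sample files; the "-" placeholder for rows with no match in file2 is my assumption:

```shell
# The post's sample files, including the header rows.
printf 'Row-id name1\n13456 Rahul\n16789 Vishal\n18901 Karan\n' > file1
printf 'Row-id place\n18901 Mumbai\n' > file2

# First pass remembers file2's second column keyed by Row-id;
# second pass appends it to each file1 row ("-" when there is no match).
awk 'NR == FNR { place[$1] = $2; next }
     { print $0, ($1 in place ? place[$1] : "-") }' file2 file1 > file3
```

The header line merges too ("Row-id name1 place"), which is usually what you want.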
3. Shell Programming and Scripting
Hello All,
Please help me with this file.
My input file (tab-separated) looks like:
Abc-01 pc1 -0.69
Abc-01 E2cR 0.459666666666667
Abc-01 5ez.2 1.2265625
Xyz-01 pc1 -0.153
Xyz-01 E2cR 1.7358
Xyz-01 5ez.2 2.0254
Ced-02 pc1 -0.5714
Ced-02 ... (7 Replies)
Discussion started by: mira
4. UNIX for Dummies Questions & Answers
Dear all
I have a big file with two columns
A_AA960715 GO:0006952
A_AA960715 GO:0008152
A_AA960715 GO:0016491
A_AA960715 GO:0007165
A_AA960715 GO:0005618
A_AA960716 GO:0006952
A_AA960716 GO:0005618
A_AA960716... (15 Replies)
Discussion started by: AAWT
5. Shell Programming and Scripting
Input file A.txt:
C2062 -117.6 -118.5 -117.5
C5145 0 0 0
C5696 0 0 0
Output file B.txt:
C2062 X -117.6
C2062 Y -118.5
C2062 Z -117.5... (4 Replies)
Discussion started by: asavaliya
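The input/output pair pins the transformation down: each row of A.txt becomes three labelled rows in B.txt. A minimal awk sketch:

```shell
# The post's sample input.
printf 'C2062 -117.6 -118.5 -117.5\nC5145 0 0 0\nC5696 0 0 0\n' > A.txt

# Emit one X, Y and Z line per input row.
awk '{ print $1, "X", $2
       print $1, "Y", $3
       print $1, "Z", $4 }' A.txt > B.txt
head -n 3 B.txt
# C2062 X -117.6
# C2062 Y -118.5
# C2062 Z -117.5
```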
6. Shell Programming and Scripting
I have the following records from multiple files.
415 A G
415 A G
415 A T
415 A .
415 A .
421 G A
421 G A,C
421 G A
421 G A
421 G A,C
421 G .
427 A C
427 A ... (3 Replies)
Discussion started by: empyrean
7. Shell Programming and Scripting
Hi, I have two files and I want to join them using a common column. I tried to do this using the "join" command, but that did not help.
File 1:
123 9a.vcf hy92.vcf hy90.vcf
Index Ref Alt Ref Alt Ref Alt
315 14 0 7 4 ... (6 Replies)
Discussion started by: empyrean
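The usual reason join(1) "does not help" is that it requires both files to be sorted on the join field; these files also carry two header lines that join will try to match. A sketch on simplified data; the filenames f1 and f2 are placeholders, and the real files would first need their headers stripped (e.g. with tail -n +3):

```shell
# Simplified sample data keyed on column 1; filenames are placeholders.
printf '315 14 0\n421 9 2\n' > f1
printf '315 7 4\n500 1 1\n' > f2

# join(1) requires both inputs sorted on the join field.
sort -k1,1 f1 > f1.sorted
sort -k1,1 f2 > f2.sorted
join -1 1 -2 1 f1.sorted f2.sorted   # prints: 315 14 0 7 4
```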
8. Shell Programming and Scripting
Hi, I have a 3-column tab-separated file (approx. 1 GB) in which I would like to count and output the frequency of all of the common elements in the 1st column.
For instance:
If my input was the following:
dot is-big 2
dot is-round 3
dot is-gray 4
cat is-big 3
hot in-summer 5
My... (4 Replies)
Discussion started by: owwow14
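A single awk pass handles this; memory grows with the number of distinct keys, not with the 1 GB file size. A sketch on the post's sample; input.tsv is a placeholder name:

```shell
# The post's sample, tab-separated; "input.tsv" is a placeholder name.
printf 'dot\tis-big\t2\ndot\tis-round\t3\ndot\tis-gray\t4\ncat\tis-big\t3\nhot\tin-summer\t5\n' > input.tsv

# Count occurrences of each column-1 value.
awk -F'\t' '{ count[$1]++ } END { for (k in count) print k, count[k] }' input.tsv | sort
# cat 1
# dot 3
# hot 1
```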
9. Shell Programming and Scripting
Hi,
I have a table to be imported into R as a matrix or data.frame, but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum each column (2nd-nth) for each identifier (1st column).
The input, after sorting, is for example:
K00001 1 1 4 3... (8 Replies)
Discussion started by: sargotrons
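A sketch that sums columns 2..NF per identifier before the table goes to R. The first row below is from the post; the second K00001 row is invented for illustration, since the post's sample is truncated:

```shell
# First row from the post; second row is an invented duplicate id.
printf 'K00001 1 1 4 3\nK00001 2 0 1 1\n' > table.txt

# Accumulate each numeric column per identifier, then print one row per id.
awk '{
    for (i = 2; i <= NF; i++) sum[$1 " " i] += $i
    cols[$1] = NF                      # remember column count per id
}
END {
    for (id in cols) {
        line = id
        for (i = 2; i <= cols[id]; i++) line = line " " sum[id " " i]
        print line
    }
}' table.txt
# K00001 3 1 5 4
```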
10. Programming
Hi All,
I would like to get the minimum value in a certain column with respect to another column.
For example, I have a text file like this.
ATOM 1 QSS SPH S 0 -2.790 -1.180 -2.282 2.28 2.28
ATOM 1 QSS SPH S 1 -2.915 -1.024 -2.032 2.31 2.31
ATOM 1 ... (4 Replies)
Discussion started by: bala06
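The post is truncated, so which two columns are meant is a guess; the sketch below takes the minimum of column 10 per distinct value of column 6, purely as an illustration of grouping one column by another in awk:

```shell
# The two sample rows are from the post.
printf 'ATOM 1 QSS SPH S 0 -2.790 -1.180 -2.282 2.28 2.28\nATOM 1 QSS SPH S 1 -2.915 -1.024 -2.032 2.31 2.31\n' > file.txt

# Track the smallest column-10 value seen for each column-6 group.
# (Columns 6 and 10 are assumptions, not stated in the post.)
awk '!($6 in min) || $10 < min[$6] { min[$6] = $10 }
     END { for (g in min) print g, min[g] }' file.txt | sort
# 0 2.28
# 1 2.31
```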
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.