09-10-2013
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
Hi, I have fakebook.csv as follows:
F1(current date) F2(popularity) F3(name of book) F4(release date of book)
2006-06-21,6860,"Harry Potter",2006-12-31
2006-06-22,,"Harry Potter",2006-12-31
2006-06-23,7120,"Harry Potter",2006-12-31
2006-06-24,,"Harry Potter",2006-12-31... (0 Replies)
Discussion started by: onthetopo
2. UNIX for Dummies Questions & Answers
Suppose I have 500 files in a directory and I need to use awk to calculate the average of column 3 for each of the files. How would I do that? (6 Replies)
Discussion started by: grossgermany
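One way to sketch this, assuming whitespace-separated files in the current directory (the `./*` glob and the `%g` format are choices, not part of the original question):

```shell
# Print "<filename> <average of column 3>" for every regular file
# in the current directory; empty files are skipped.
for f in ./*; do
  [ -f "$f" ] || continue
  awk -v name="$f" '
    { sum += $3; n++ }
    END { if (n) printf "%s %g\n", name, sum / n }
  ' "$f"
done
```

Running `awk` once per file keeps the per-file state trivial; GNU awk users could instead process all 500 files in one invocation with `ENDFILE`.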
3. UNIX for Dummies Questions & Answers
Hello,
Is there a quick way to compute the average of a data column in a numerical tab-delimited file?
Thanks,
Gussi (2 Replies)
Discussion started by: Gussifinknottle
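A minimal sketch; the column number (2) and filename are assumptions, since the question doesn't name them:

```shell
# Average column 2 of a tab-delimited file; NR is the line count.
awk -F'\t' '{ sum += $2 } END { if (NR) print sum / NR }' data.tsv
```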
4. UNIX for Dummies Questions & Answers
I have a file that looks like this
452 025_E3
8 025_E3
82 025_F5
135 025_F5
5 025_F5
23 025_G2
38 025_G2
71 025_G2
9 026_A12
81 026_A12
10 026_A12
some of the elements in column2 are repeated.
I want an output file that will extract the... (1 Reply)
Discussion started by: FelipeAd
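The question is cut off after "extract the...", so the exact goal is unknown. If the goal is, say, a per-key total of column 1 for each repeated key in column 2 (an assumption), a sketch would be:

```shell
# Sum column 1 for each distinct key in column 2.
# The per-key total is an assumed interpretation; the original
# question is truncated.
awk '{ sum[$2] += $1 } END { for (k in sum) print k, sum[k] }' file
```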
5. Shell Programming and Scripting
Hi, I tried to do this in excel but there is a limit to how many rows it can handle.
All I need to do is average each column in a file and get the final value.
My file looks something like this (obviously a lot larger):
Joe HHR + 1 2 3 4 5 6 7 8
Jor HHR - 1 2 3 4 5 6 7 8
the output... (1 Reply)
Discussion started by: kylle345
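A sketch, assuming (from the sample rows) that the first three fields are name/tag/sign and the numbers start at field 4:

```shell
# Average every numeric column across all rows; fields 1-3 are
# assumed non-numeric, so accumulation starts at field 4.
awk '
  { for (i = 4; i <= NF; i++) sum[i] += $i; cols = NF }
  END { for (i = 4; i <= cols; i++) printf "%g%s", sum[i] / NR, (i < cols ? " " : "\n") }
' file
```

Unlike a spreadsheet, awk streams the file, so row count is limited only by disk, not by an application ceiling.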
6. Shell Programming and Scripting
Dear All,
I have this file tab delimited
A 1 12 22
B 3 34 33
C 55 9 32
A 12 81 71
D 11 1 66
E 455 4 2
B 89 4 3
I would like to average every column for rows where the first column is the same, for example,
A 6,5 46,5 46,5
B 46,0 19,0 18,0
C 55,0 9,0 32,0
D 11,0 1,0 66,0... (8 Replies)
Discussion started by: paolo.kunder
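A sketch of the group-by-first-column averaging (output uses a dot as decimal separator rather than the comma shown above, and the filename is an assumption):

```shell
# Average columns 2..N for each distinct key in column 1
# of a tab-delimited file.
awk -F'\t' '
  { n[$1]++; for (i = 2; i <= NF; i++) sum[$1, i] += $i; cols = NF }
  END {
    for (k in n) {
      printf "%s", k
      for (i = 2; i <= cols; i++) printf "\t%.1f", sum[k, i] / n[k]
      print ""
    }
  }
' file
```

Note that `for (k in n)` yields keys in unspecified order; pipe through `sort` if the output must be alphabetical.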
7. Shell Programming and Scripting
I have a lot of input files that have the following form:
Sample Cq Sample Cq Sample Cq Sample Cq Sample Cq
1WBIN 23.45 1WBIN 23.45 1CVSIN 23.96 1CVSIN 23.14 S1 31.37
1WBIN 23.53 1WBIN 23.53 1CVSIN 23.81 1CVSIN 23.24 S1 31.49
1WBIN 24.55 1WBIN 24.55 1CVSIN 23.86 1CVSIN 23.24 S1 31.74 ... (3 Replies)
Discussion started by: isildur1234
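Since the columns alternate Sample,Cq pairs, one sketch is to walk the fields two at a time and average per sample (skipping the header row):

```shell
# Columns alternate Sample,Cq; skip the header (NR > 1), then
# accumulate each (sample, value) pair and average per sample.
awk 'NR > 1 {
  for (i = 1; i < NF; i += 2) { sum[$i] += $(i + 1); n[$i]++ }
} END {
  for (s in n) printf "%s %.3f\n", s, sum[s] / n[s]
}' file
```

Duplicated sample columns (1WBIN appears twice per row above) are simply counted twice, which leaves the average unchanged if the duplicates carry identical values.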
8. Shell Programming and Scripting
Hi,
I would like to calculate the average of column 'y' based on the value of column 'pos'.
For example, here is file1
id pos y c
11 1 220 aa
11 4333 207 f
11 5333 112 ee
11 11116 305 e
11 11117 310 r
11 22228 781 gg
11 ... (2 Replies)
Discussion started by: jackken007
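The question is truncated, so the exact pos-based rule is unknown. If the intent is to average y within fixed-width windows of pos (the 10000-wide window here is purely an assumption), a sketch would be:

```shell
# Average column y (field 3) within fixed-width windows of pos
# (field 2). The window width of 10000 is an assumption; the
# original question is cut off.
awk 'NR > 1 {
  bin = int($2 / 10000)
  sum[bin] += $3; n[bin]++
} END {
  for (b in n) printf "%d-%d %.1f\n", b * 10000, (b + 1) * 10000 - 1, sum[b] / n[b]
}' file1
```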
9. Shell Programming and Scripting
Hi,
My input file
Gene1 1
Gene1 2
Gene1 3
Gene1 0
Gene2 0
Gene2 0
Gene2 4
Gene2 8
Gene3 9
Gene3 9
Gene4 0
Condition:
If the first column matches, then look in the second column. If there is a value of zero in the second column, then don't consider that record while averaging.
... (5 Replies)
Discussion started by: jacobs.smith
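A sketch of the zero-excluding average (the filename is an assumption). Genes whose values are all zero, like Gene4 above, have nothing to average and so produce no output line:

```shell
# Average column 2 per gene, skipping records whose value is zero.
awk '$2 != 0 { sum[$1] += $2; n[$1]++ }
     END { for (g in n) printf "%s %g\n", g, sum[g] / n[g] }' file
```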
10. Shell Programming and Scripting
Hi all,
Does anyone know of an efficient Unix script to average each numeric column of a multi-column tab-delimited file (with a header) that also contains some character columns?
Here is an example input file:
CHR RS_ID ALLELE POP1 POP2 POP3 POP4 POP5 POP6 POP7 POP8... (7 Replies)
Discussion started by: Geneanalyst
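A sketch, assuming (from the header shown) that the first three columns, CHR, RS_ID, and ALLELE, are the character columns and everything from column 4 on is numeric:

```shell
# Remember the header's column names, skip the assumed character
# columns (the first 3), and average the rest; NR - 1 is the
# number of data rows.
awk -F'\t' '
  NR == 1 { for (i = 4; i <= NF; i++) name[i] = $i; next }
  { for (i = 4; i <= NF; i++) sum[i] += $i; cols = NF }
  END { for (i = 4; i <= cols; i++) printf "%s %g\n", name[i], sum[i] / (NR - 1) }
' file
```

A more general variant could instead test each field with a numeric regex rather than hard-coding where the character columns end.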
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.