gawk '
{
    sub(/^[ \t]+/, "")              # trim one or more leading spaces or tabs
}
/^input/ {
    inputs[++nin] = $0
}
/^output/ {
    outputs[++nout] = $0
}
END {
    print "Inputs:"
    for (x = 1; x <= nin; ++x)
        print inputs[x]
    print "Outputs:"
    for (x = 1; x <= nout; ++x)
        print outputs[x]
}' file
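Running the corrected script on a small hypothetical file illustrates the trimming and grouping (the interface names are made up for the example):

$ cat file
  input eth0
	output eth1
  input eth2
$ gawk -f script.awk file        # same program as above, saved to a file
Inputs:
input eth0
input eth2
Outputs:
output eth1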
I have the following command, which reports the sizes in GB of files larger than 0.01 GB, recursively in a directory structure.
ls -l -R | awk '{ if ($5/1073741824 >= 0.01) print $9, $5/1073741824 }'
But there are some files for which I don't have enough permissions, so executing this script
gives me... (1 Reply)
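One common fix (an assumption, since the error text is cut off above) is that ls prints "Permission denied" messages on stderr for unreadable directories; discarding stderr and guarding against non-file lines keeps the awk output clean:

# discard permission errors on stderr; the $5 guard skips "total" lines
# and other rows where field 5 is not a byte count
ls -lR 2>/dev/null | awk '$5 ~ /^[0-9]+$/ { gb = $5 / 1073741824; if (gb >= 0.01) print $9, gb }'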
Hi All,
I have the below input and expected output. I need code that scans through this input file and, if the number in column 1 is more than 1, prints the whole line; otherwise it outputs "No Re-occurrence". Can anybody help?
Input:
1 vvvvv 20 7 7 23 0 64
6 zzzzzz 11 5... (7 Replies)
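A minimal sketch of one reading of the request, assuming the fallback message should appear only when no line qualifies (the file name "inputfile" is hypothetical):

# print every line whose first column exceeds 1; if none did,
# print the fallback message once at the end
awk '$1 > 1 { print; found = 1 } END { if (!found) print "No Re-occurrence" }' inputfile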
file1 contains (this is just a small sample; the data may have thousands of lines):
1 aaa 1/01/1975 delhi
2 bbb 2/03/1977 mumbai
3 ccc 1/01/1975 mumbai
4 ddd 2/03/1977 chennai
5 aaa 1/01/1975 kolkatta
6 bbb 2/03/1977 bangalore
program:
nawk '{
idx= $2 SUBSEP $3
arr = (idx in arr) ?... (2 Replies)
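The program is cut off in this excerpt; a plausible completion of the idiom (an assumption, not the poster's actual code) groups lines by name and date and prints the keys that occur more than once:

# build a compound key from fields 2 and 3; SUBSEP is awk's built-in
# array-subscript separator
nawk '{
    idx = $2 SUBSEP $3
    arr[idx] = (idx in arr) ? arr[idx] ORS $0 : $0
    cnt[idx]++
}
END {
    for (idx in arr)
        if (cnt[idx] > 1)          # only duplicated name/date pairs
            print arr[idx]
}' file1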
Hi,
I have a file with "|" as the delimiter and the data in double quotes for all fields. How can I filter data in a column, like awk -F"|" '$1="asdf" {print $0}' test?
ex : "asdf"|"zxcv"
Thanks,
Soma (1 Reply)
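Two fixes are likely needed here (a sketch, not a confirmed answer): $1="asdf" assigns rather than compares, and each field value carries its literal double quotes, so the comparison must include them:

# use == (comparison) instead of = (assignment), and match the quotes
# that are part of the field's value
awk -F'|' '$1 == "\"asdf\"" { print }' test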
I am trying to filter out some data with awk. If someone could help me that would be great. Below is my input file.
Date: 10-JUN-12 12:00:00
B 0: 00 00 00 00 10 00 16 28
B 120: 00 00 00 39 53 32 86 29
Date: 10-JUN-12 12:00:10
B 0: 00 00 00 00 10 01 11 22
B 120: 00 00 00 29 23 32 16 29... (5 Replies)
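The desired filter is not stated in this excerpt; purely as an illustration of handling such block-structured records, this sketch remembers each Date: header and prints it together with its "B 0:" row (the file name is hypothetical):

# keep the most recent Date: line and emit it with the matching B 0: row
awk '/^Date:/ { d = $0; next }
     /^B 0:/  { print d; print }' input.txt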
1. The problem statement, all variables and given/known data:
my data in csv-format ...
...
13/08/2012,16:30,303.30,5.10,3,2,2,1,9360.0,322... (13 Replies)
Hello,
Does anyone know an easy way to filter this type of file? I want to get everything that has a score (column 2) of 100.00 and get rid of duplicates (for example gi|332198263|gb|EGK18963.1| below), so I guess uniq can be used for this?
gi|3379182634|gb|EGK18561.1| 100.00... (6 Replies)
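A minimal sketch combining both conditions in a single awk pass (assuming whitespace-separated columns; the file name "hits.txt" is hypothetical), so a separate uniq step is not needed:

# keep rows whose second column is 100.00, printing each identifier
# (column 1) only the first time it is seen
awk '$2 == "100.00" && !seen[$1]++' hits.txt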
Hi,
I have some data, as seen below.
format: apple(hhmm mm/dd).fruit
apple(2345 03/25).fruit
apple(2345 05/06).fruit
orange(0443 05/02).fruit
orange(0345 05/05).fruit
orange(2134 05/04).fruit
grape(0930 04/24).fruit
grape(2330 03/30).fruit
I need to get the data which are... (1 Reply)
Please consider the following file: I have many groups, which can be of 3 types, T1 (Serial_Number 1), T2 (Serial_Number 2), and T1*T2 (all other Serial_Numbers).
I want to consider only groups that have both T1 and T2 present and whose values differ from each other. In the example file,... (8 Replies)
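The example file itself is elided here; as a hedged sketch, assuming a layout where column 1 holds the group, column 2 the Serial_Number, and column 3 the value (all hypothetical), the both-present-and-different rule could be expressed as:

# collect T1/T2 values per group, then print only groups where both
# exist and their values differ
awk '
$2 == 1 { t1[$1] = $3 }
$2 == 2 { t2[$1] = $3 }
{ lines[$1] = lines[$1] $0 ORS }
END {
    for (g in lines)
        if ((g in t1) && (g in t2) && t1[g] != t2[g])
            printf "%s", lines[g]
}' file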
Hi Everyone,
I need help figuring out a way to filter some data that I get back from an API. I'm able to get all the data that I'm looking for, but I would like to know a way to filter it better. The data that I'm getting back is basically 2 rows of data, as seen here.
Row 1 ... (25 Replies)
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
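Outside the man-page text, a rough awk illustration of the core idea: after sorting, the largest shared prefix must occur between adjacent hashes, so one pass over neighbours suffices. This version only counts whole hex digits (4 bits each), whereas bup compares at bit granularity, and "hashes.txt" (one lowercase hex SHA-1 per line) is a hypothetical input:

# sort so the closest pair of hashes is adjacent, then count matching
# leading hex digits between neighbours (4 bits per hex digit)
sort hashes.txt | awk '
NR > 1 {
    n = 0
    while (n < 40 && substr($0, n + 1, 1) == substr(prev, n + 1, 1))
        n++
    if (n * 4 > max)
        max = n * 4
}
{ prev = $0 }
END { print max " matching prefix bits (hex-digit granularity)" }'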
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)

BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.