Hi, I'm writing a ksh script and trying to use an awk / sed / or perl one-liner to remove the last 4 characters of a line in a file if it begins with a period.
Here is the contents of the file... the column in which I want to remove the last 4 characters is the last column. ($6 in awk). I've... (10 Replies)
Need to remove rest of line after the equals sign on search pattern from the searchfile. Can anybody help. Couldn't find any similar example in the forum:
infile:
64_1535: Delm. = 86 var, aaga
64_1535: Fran. = 57 ex. ccc
64_1639: Feb. = 26 (link). def
64_1817: mar. = 3/4. drz ... (7 Replies)
awk , sed Experts,
I want to remove first and last line after pattern match "vg" :
I am trying : # sed '1d;$d' works fine , but where the last line is not having vg entry it is deleting one line of data.
- So it should check for the pattern vg if present , then it should delete the line ,... (5 Replies)
I am an awk beginner and need help figuring out how to search for a number in the first column and if it (or anything greater) exists, remove those lines.
AM11400012012 2.26 2.12 1.98 2.52 3.53 3.01 3.62 5.00 3.65 7.95 0.79 3.88 0.00
AM11400012013 3.39 2.29 ... (1 Reply)
the following pattern match works correctly for me
awk '/name="Fruits"/{f=1;next} /"name=Vegetables"/{f=0} f' filename
This works well for me. Id like to temporarily move the match out of the file ( > newfile) and be able to stick it back in the same place at a later time.
Is this... (7 Replies)
Hi
I need to do a patten match between files .
I am new to shell scripting and have come up with this so far. It take 50 seconds to process files of 2mb size . I need to tune this code as file size will be around 50mb and need to save time.
Main issue is that I need to search the pattern from... (2 Replies)
Sorry for the weird title but i have the following problem.
We have several files which have between 10000 and about 500000 lines in them. From these files we want to remove lines which contain a pattern which is located in another file (around 20000 lines, all EAN codes). We also want to get... (28 Replies)
In the awk below I am trying to remove all instances after a ; (semi-colon) or , (comma) in the ANN= pattern. I am using gsub
to substitute an empty string in these, so that ANN= is a single value (with only one value in it the one right after the ANN=). Thank you :).
I have comented my awk and... (11 Replies)
In the awk below I am trying to remove all lines above and including the pattern Test or Test2. Each block is seperated by a newline and Test2 also appears in the lines to keep but it will always have additional text after it. The Test to remove will not. The awk executed until the || was added... (2 Replies)
In the awk piped to sed below I am trying to format file by removing the odd xxxx_digits and whitespace after, then move the even xxxx_digit to the line above it and add a space between them. There may be multiple lines in file but they are in the same format. The Filename_ID line is the last line... (4 Replies)
Discussion started by: cmccabe
4 Replies
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS --predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO bup-midx(1), bup-save(1)BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown-bup-margin(1)