bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest SHA-1 prefix you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
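The calculation described above can be sketched in a few lines of Python. This is only an illustration, not bup's actual implementation: the names shared_prefix_bits and margin_bits are hypothetical, and the sketch assumes all object ids are distinct byte strings of equal length held in memory. It relies on the fact that, once the ids are sorted, the pair sharing the longest prefix is always adjacent.

```python
def shared_prefix_bits(a: bytes, b: bytes) -> int:
    """Count the leading bits that two equal-length ids have in common."""
    bits = 0
    for x, y in zip(a, b):
        diff = x ^ y
        if diff == 0:
            bits += 8  # whole byte matches
            continue
        # Leading zero bits of the first differing byte also match.
        bits += 8 - diff.bit_length()
        break
    return bits


def margin_bits(object_ids) -> int:
    """Largest number of prefix bits shared by any two distinct ids.

    Hypothetical helper: sorts the ids and compares neighbours only.
    """
    ordered = sorted(set(object_ids))
    return max(
        (shared_prefix_bits(a, b) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )
```

With a result of n, a prefix of n + 1 bits would distinguish every id in the set, which is the reasoning behind the 45 and 46 figures above.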
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible for a repository with
far fewer objects to need many more bits.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
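To make the idea behind --predict concrete, here is a rough Python sketch of an interpolation-style guess and its maximum deviation. It is an assumption about the general technique, not a copy of what bup does internally: max_guess_deviation is a hypothetical name, and the ids are assumed to be a sorted list of equal-length byte strings.

```python
def max_guess_deviation(sorted_ids) -> int:
    """Largest gap between an id's real position and an interpolated guess.

    The guess treats each id as a fraction of the full id space and
    scales that fraction by the number of entries.
    """
    n = len(sorted_ids)
    worst = 0
    for actual, oid in enumerate(sorted_ids):
        space = 2 ** (8 * len(oid))          # size of the id space
        fraction = int.from_bytes(oid, "big") / space
        guess = int(fraction * n)            # interpolated position
        worst = max(worst, abs(actual - guess))
    return worst
```

A small deviation relative to the number of entries suggests that an interpolation search would only need to probe a narrow window around its first guess.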
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
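The figures in the first sample run can be approximately reproduced with the arithmetic below. This is a sketch under stated assumptions rather than bup's own code: the formula (matching bits divided by log2 of the object count) and the world population of 6.7e9 are inferred from the printed numbers, and growth_estimate is a hypothetical name.

```python
import math

SHA1_BITS = 160
ASSUMED_POPULATION = 6.7e9  # assumption inferred from the sample output


def growth_estimate(matching_bits: int, object_count: int) -> dict:
    """Estimate how much a repository could grow before ids collide."""
    doublings_so_far = math.log2(object_count)
    bits_per_doubling = matching_bits / doublings_so_far
    remaining_bits = SHA1_BITS - matching_bits
    remaining_doublings = remaining_bits / bits_per_doubling
    times_larger = 2 ** remaining_doublings
    return {
        "bits_per_doubling": bits_per_doubling,
        "remaining_bits": remaining_bits,
        "remaining_doublings": remaining_doublings,
        "times_larger": times_larger,
        "data_sets_per_person": times_larger / ASSUMED_POPULATION,
    }


# Inputs taken from the sample run: 40 matching bits, 1612581 objects.
estimate = growth_estimate(40, 1612581)
for name, value in estimate.items():
    print(f"{name}: {value:.6g}")
```

Rounded to two decimals, the bits per doubling and remaining doublings agree with the sample output, which is why the formula seems a plausible reading of it.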
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown- bup-margin(1)