04-22-2008
Nothing like answering your own question.
Spaces after the word car in file one.
Suppose a good idea to squash multiple spaces into one then chomp
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
How do I get the record count in an EBCDIC file on a Linux Box. :confused: (1 Reply)
Discussion started by: oracle8
1 Replies
2. UNIX for Dummies Questions & Answers
Hi,
I trying to sort information in a file by making count for every object. For example:
A
B
A
D
A
B
C
i would like to sort out count for each object (A's, B's and etc) but in actual case i dont know the object but i know the position ofthe object. do i need to use array as well? Any... (2 Replies)
Discussion started by: sukhpal_78
2 Replies
3. Shell Programming and Scripting
Hi all this is a UNIX question.
I have a large flat file with millions of records.
col1|col2|col3
1|a|b
2|c|d
3|e|f
3|g|h
footer****
I am supposed to calculate the sum of col1 1+2+3+3=9, count of col1 1,2,3,3=4, and distinct count of col1 1,2,3=c3
I would like it if you avoid... (4 Replies)
Discussion started by: singhabhijit
4 Replies
4. Shell Programming and Scripting
I have a sorted file like:
Apple 3
Apple 5
Apple 8
Banana 2
Banana 3
Grape 31
Orange 7
Orange 13
I'd like to search $1 and if $1 is not the same as $1 in the previous row print that row and print the number of times $1 was found.
so the output would look like:
Apple 8 3
Banana... (2 Replies)
Discussion started by: dcfargo
2 Replies
5. Shell Programming and Scripting
I have a file containing about 5 million rows, in the file there are some records which has extra delimiter at random position. (we dont know the positions), now we have to Count the delimeter from each row and if the count of delimeter is not matching then I want to delete those rows from the... (5 Replies)
Discussion started by: Akumar1
5 Replies
6. Shell Programming and Scripting
Hello All,
I got a requirement when I was working with a file. Say the file has unloads of data from a table in the form
1|121|asda|434|thesi|2012|05|24|
1|343|unit|09|best|2012|11|5|
I was put into a scenario where I need the field count in all the lines in that file. It was simply... (6 Replies)
Discussion started by: PikK45
6 Replies
7. Shell Programming and Scripting
What I'm trying to accomplish. I receive a Header and Detail file for daily processing. The detail file comes first which holds data, the header is a receipt of the detail file and has the detail files record count. Before processing the detail file I would like to put a wrapper around another... (4 Replies)
Discussion started by: pone2332
4 Replies
8. Shell Programming and Scripting
hi All, Any one answer my requirement.
I have source location
src_dir="/home/oracle/arun/IRMS-CM"
My Target location
dest_dir="/home/oracle/arun/LiveLink/IRMS-CM/$dc/$pc/$ct"
my source text files check with below example.text file content
$fn "\t" $dc "\t" $pc "\t" ... (3 Replies)
Discussion started by: sravanreddy
3 Replies
9. Shell Programming and Scripting
The awk below runs and produces the following output on the file2. This is just an example of the format as the file is ~14MB. file1.txt is attached. I am trying to count the ids that match between the two files and out the ids that are missing. Thank you :).
file2
970 NM_213590 ... (2 Replies)
Discussion started by: cmccabe
2 Replies
10. UNIX for Beginners Questions & Answers
Hi,
I have a file with a list of bunch of IP addresses from different VLAN's . I am trying to find the list the number of each vlan occurence in the output
Here is how my file looks like
1.1.1.1
1.1.1.2
1.1.1.3
1.1.2.1
1.1.2.2
1.1.3.1
1.1.3.2
1.1.3.3
1.1.3.4
So what I am trying... (2 Replies)
Discussion started by: new2prog
2 Replies
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown- bup-margin(1)