Hi All,
I have data in a flat file like below. Some of the information is on a second row.
I am using the following code to merge the two lines when a record is split across two lines.
But it is taking too much time. I have around 3000 records in the file. Is there a faster way to do this, e.g. with awk or nawk?
Hi,
I have a tab delimited flat file like this: 189 Guide de lutilisateur sur lappel conférence à trois au moyen d'adaptateurs téléphoniques <TABLE><TBODY><TR><TD><DIV class=subheader>La fonction Appel conférence à trois </DIV></TD>
\
<TD><?php print $navTree;?> vous permet de tenir un appel... (4 Replies)
Hi
I have a fixed-width flat file containing the following data
12345aaaaaaaaaabbbbbbbbbb
12365sssssssssscccccccccc
12365sssss
12367ddddddddddvvvvvvvvvv
12367 vvvvv
Here the first column has length 5, the second length 10, and the third length 10
if the second or third column exceeds... (3 Replies)
I've hunted and hunted but nothing seems to apply to what I need. Any help will be much appreciated!
My input file looks like (Unix):
marker,allele1,allele2
RS1002244,1,1
RS1002244,1,3
RS1002244,3,3
RS1003719,2,2
RS1003719,2,4
RS1003719,4,4
Most markers are listed 3 times but a few... (2 Replies)
Hello,
I have searched the forum trying to find a solution to my problem, but could not find anything, or I did not understand the examples....
I should say, I am very inexperienced with text processing.
I have a text file with approx 60k lines in it.
I need to merge lines based on the number... (8 Replies)
I have one comma separated file (a.txt) with two or more records all matching except for the last column.
I would like to merge all matching lines into one and consolidate the last column, separated by ":". Does anyone know of a way to do this easily?
I've searched the forum but most talked... (6 Replies)
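One way to read the request above is: group lines by every field except the last, then join the last fields with ":". The following is only a minimal sketch of that reading in Python. The function name `merge_on_key` and the sample rows in the usage note are hypothetical, and it assumes every line has at least two comma-separated fields with no quoted commas.

```python
def merge_on_key(lines, sep=",", joiner=":"):
    """Merge lines that match on everything except the last field.

    The last fields of matching lines are joined with `joiner`.
    First-seen order of the keys is preserved.
    """
    merged = {}  # key (all fields but the last) -> list of last fields
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if sep not in line:
            raise ValueError(f"line has no {sep!r} separator: {line!r}")
        # Split once from the right: everything before the last separator is the key.
        key, _, last = line.rpartition(sep)
        merged.setdefault(key, []).append(last)
    return [f"{key}{sep}{joiner.join(values)}" for key, values in merged.items()]
```

For example, `merge_on_key(["a,b,1", "a,b,2", "c,d,3"])` returns `["a,b,1:2", "c,d,3"]`. To run it against a file such as a.txt, pass the open file object as `lines`. This makes a single pass over the input, so a few thousand records should be handled quickly.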
Hello! I have a text file which contains the data as follows
I want to merge the declaration lines pertaining to one datatype into a single line as follows
I've searched the forum for help but couldn't find much. How can I do this? (1 Reply)
Hello Everyone,
I have two files I created in a format similar to the ones found below (character position is important):
File 1:
21 Cat Y N S Y Y N N
FOUR LEGS
TAIL
WHISKERS
30 Dog N N 1 Y Y N N
FOUR LEGS
TAIL
33 Fish Y N 1 Y Y N N
FINS
43 CAR Y N S Y Y N N
WHEELS
DOORS... (7 Replies)
I've been a Unix admin for nearly 30 years and never learned AWK. I've seen several similar posts here, but haven't been able to adapt the answers to my situation. AWK is so damn cryptic! ;)
I have a single file with ~900 lines (CSV list). Each line starts with an ID, but with different stuff... (6 Replies)
Need help figuring out how to merge data from a file. I have a large txt file with some data that needs to be merged from separate lines into one line.
Doug.G|3/12/2011|817-555-5555|Portland
Doug.G|3/12/2011|817-555-5522|Portland
Steve.F|1/11/2007|817-555-5111|Portland... (5 Replies)
Discussion started by: cdubu2
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest prefix of a SHA-1 hash you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
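The idea of "largest number of prefix bits shared between any two entries" can be illustrated with a short Python sketch. This is not bup's implementation. The function names are hypothetical, and it assumes all digests are byte strings of equal length. It relies on the fact that, once the digests are sorted, the pair sharing the longest prefix is always adjacent.

```python
def shared_prefix_bits(a: bytes, b: bytes) -> int:
    """Count the leading bits that two equal-length digests have in common."""
    bits = 0
    for x, y in zip(a, b):
        diff = x ^ y
        if diff == 0:
            bits += 8  # whole byte matches
            continue
        # Leading zero bits of the XOR are the matching bits in this byte.
        bits += 8 - diff.bit_length()
        break
    return bits


def max_shared_prefix_bits(digests) -> int:
    """Largest shared prefix, in bits, between any two distinct digests."""
    ordered = sorted(set(digests))  # drop exact duplicates, then sort
    return max(
        (shared_prefix_bits(a, b) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )
```

With the toy input `[b"\x00\x00", b"\x00\x01", b"\xff\xff"]` the result is 15, because the first two entries differ only in their final bit. One more bit than the result, 16 here, would be enough to tell these entries apart.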
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
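The margin arithmetic in this paragraph can be checked with a few lines of Python. This is a sketch with hypothetical function names. The values 40 and 1.94 in the second call are taken from the sample output in the EXAMPLE section of this page.

```python
SHA1_BITS = 160  # length of a SHA-1 hash in bits


def margin_bits(shared_bits, hash_bits=SHA1_BITS):
    """Bits left over once the longest shared prefix is accounted for."""
    return hash_bits - shared_bits


def doublings_remaining(shared_bits, bits_per_doubling, hash_bits=SHA1_BITS):
    """Estimated number of times the object count could still double."""
    return margin_bits(shared_bits, hash_bits) / bits_per_doubling


print(margin_bits(45))                          # 115
print(round(doublings_remaining(40, 1.94), 2))  # 61.86
```

The first call reproduces the 115 bits of margin quoted above. The second reproduces the "120 bits (61.86 doublings) remaining" line of the sample output.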
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown- bup-margin(1)