Help with changing header of tsv with 30 million lines

Hi,
My 30-million-line file has this header:
Code:
chr    start   end strand  ref_context repeat_masked   s1_smpl_context  s1_c_count  s1_ct_count s1_non_ct_count s1_m%   s1_score    s1_snp   s1_indels   s2_smpl_context s2_c_count  s2_ct_count s2_non_ct_count  s2_m%   s2_score    s2_snp  s2_indels   s3_smpl_context s3_c_count   s3_ct_count s3_non_ct_count s3_m%   s3_score    s3_snp  s3_indels...

I want to replace all instances of s1 through s4 with L1 through L4, and then all instances of s5 through s8 with W1 through W4, in the header, using a shell script so I don't have to open the file in an editor.

I realise I can use something like
Code:
sed -Ee '1s/s([1-4])/L\1/g' -e '1s/s([5-8])/W\1/g' -e '1y/5678/1234/' -e '1q' file

but what form would a shell script need to take to do this on multiple files?
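One minimal sketch of such a script, assuming GNU sed (for -E and the in-place -i); note that the 1q from the one-liner above has to be dropped, since it would throw away every line after the header:

Code:
#!/bin/sh
# Rewrite the header (line 1) of every file named on the command line.
# GNU sed assumed; the whole file is still copied once, which is
# unavoidable when the length of the first line changes.
for f in "$@"; do
    sed -i -E -e '1s/s([1-4])/L\1/g' \
              -e '1s/s([5-8])/W\1/g' \
              -e '1y/5678/1234/' "$f"
done

Invoked as, say, sh fixheader.sh *.tsv. The y/// step runs on line 1 only, where, assuming digits 5-8 occur only in the sN column names as in the header shown above, the only 5-8 digits left by that point are the ones following W.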
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Strip 3 header lines and 4 trailer lines

Hello friends, I want to remove 3 header lines and 4 trailer lines. I am using the following; is it correct? sed '1,3d';'4,$ d' filename (9 Replies)
Discussion started by: ganesh123
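As quoted, that command would not do what the post wants: the second expression, '4,$ d', deletes from line 4 to the end of the file. A hedged sketch of one way to drop 3 header and 4 trailer lines (head -n -4 is a GNU extension; the awk line is a portable fallback):

Code:
# Drop the first 3 lines, then drop the last 4.
sed '1,3d' filename | head -n -4

# Portable alternative: skip 3 lines, then print with a 4-line lag
# so the final 4 lines never leave the buffer.
awk 'NR > 3 { if (++n > 4) print buf[n % 4]; buf[n % 4] = $0 }' filename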

2. UNIX for Advanced & Expert Users

Changing a header in a shared library

Hello, will changing a header in a shared library under Solaris (say, adding a new class data member) result in having to recompile not only that library but also all of the libraries that depend on the one that changed, because of the change in the object's size? What about adding a virtual function?... (0 Replies)
Discussion started by: Linker

3. Shell Programming and Scripting

Tail 86000 lines from 1.2 million line file?

I have a log file that is about 1.2 million lines long and about 300 MB. We need a way to clean up this file and keep only the last few thousand lines. If I use the tail command we run out of memory, as the file is too big. I do have a keyword to match on. For example, we want to keep every line... (8 Replies)
Discussion started by: robsonde
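Hedged, since tail normally streams and should not exhaust memory: the two obvious routes are keeping a fixed line count or cutting at the keyword the post mentions (big.log and KEYWORD are placeholders here):

Code:
# Keep only the last 86000 lines.
tail -n 86000 big.log > big.log.new && mv big.log.new big.log

# Or keep everything from the first occurrence of the keyword onward.
sed -n '/KEYWORD/,$p' big.log > big.log.new && mv big.log.new big.log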

4. Shell Programming and Scripting

print lines except the header

I am using the command awk -F ";" '{if($10>80 && NR>1) print $0 }' txt_file_* to print the lines whose 10th field is greater than 80, leaving out the first line of the file, which is the header. But this is not working; the first line is still coming out in the output. Please correct me. Thanks. (2 Replies)
Discussion started by: madfox
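A hedged guess at the cause, since the snippet is cut off: with several files matching txt_file_*, NR counts lines across all of them, so NR>1 skips only the very first file's header. The per-file counter FNR skips every header, and the print $0 action can be left implicit:

Code:
# FNR restarts at 1 for each input file, so each file's header is skipped.
awk -F ';' 'FNR > 1 && $10 > 80' txt_file_*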

5. UNIX for Dummies Questions & Answers

Changing email header information by tweaking sendmail

How can I tweak the sendmail configuration files so that the "Received:" field is removed from the email header? Or else, can I change Received: (from enswitch@localhost) in the email header to something like Received: (from xyz@localhost)? (2 Replies)
Discussion started by: proactiveaditya
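A hedged pointer only, since sendmail configuration is easy to get wrong: the Received: header sendmail inserts is defined by the HReceived: line in sendmail.cf, so that line is the place to start. Hand edits are lost whenever the .cf file is regenerated from the .mc sources, and stripping Received: also defeats mail-loop detection.

Code:
# Locate the definition of the Received: header in the live config.
grep -n '^HReceived:' /etc/mail/sendmail.cf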

6. Shell Programming and Scripting

Adding header once every 5 lines

Hi, I need help in creating a report file. The input file is like this: 1 A 2 B 3 V 4 X 5 m 6 O 7 X 8 p 9 a 10 X. There is a header which I have to print, and I save the result as an output file. The header has multiple lines and is something like: New New S.No Name (15 Replies)
Discussion started by: aravindan
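A minimal sketch, assuming the two header lines quoted above are the whole header (the post is truncated, so the real header may differ):

Code:
# Emit the header before every group of 5 input lines.
awk 'NR % 5 == 1 { print "New    New"; print "S.No   Name" }
     { print }' input.txt > report.txt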

7. Shell Programming and Scripting

Matching 10 Million file records with 10 Million in other file

Dear All, I have two files, each containing 10 million records, separated by commas (CSV format). One file is input.txt, the other is status.txt. input.txt contains several fields, among them one unique id field (a primary key, we can say). status.txt contains two fields only: 1. the unique id and 2. the status... (8 Replies)
Discussion started by: vguleria
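The usual awk lookup-table join fits here; a sketch that assumes the unique id is the first comma-separated field in both files (the post does not say which field it is):

Code:
# Pass 1 (NR==FNR) loads id -> status from status.txt into memory;
# pass 2 appends the status to every input.txt line whose id matched.
awk -F ',' 'NR == FNR { status[$1] = $2; next }
            $1 in status { print $0 "," status[$1] }' status.txt input.txt

Ten million keys sit comfortably in memory on current hardware; for much larger inputs, sorting both files and using join(1) avoids holding the table in RAM.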

8. Shell Programming and Scripting

find numeric duplicates from 300 million lines....

These are numeric ids: 222932017099186177 222932014385467392 222932017371820032 222932017409556480. I have a text file with 300 million lines like the ones shown above. I want to find the duplicates in this file. Please suggest the quickest way. sort | uniq -d will... (3 Replies)
Discussion started by: pamu
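Two hedged options, the trade-off being memory against sort time:

Code:
# Single pass, no sort: print each id the second time it is seen.
# Note ~300 million distinct keys can need tens of GB of RAM.
awk 'seen[$0]++ == 1' ids.txt

# Disk-backed alternative; LC_ALL=C makes the byte-wise sort faster.
LC_ALL=C sort ids.txt | uniq -d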

9. Shell Programming and Scripting

Print header and some lines

Hi, I have to print the first header line of df -h (Filesystem Size Used Avail Use% Mounted on) and only the lines which contain a network path. Filesystem Size Used Avail Use% Mounted on /test/sda3 35G 1.8G 32G 6% / /test/sda10 7.8G 1.1G ... (3 Replies)
Discussion started by: netdbaind
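A sketch, with the caveat that "network path" is a guess here: NFS-style mount sources contain a colon (host:/share), so matching on that in the first field gives:

Code:
# Keep the df header plus mounts whose source looks like host:/share.
df -h | awk 'NR == 1 || $1 ~ /:/'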

10. Shell Programming and Scripting

Find header in a text file and prepend it to all lines until another header is found

I've been struggling with this one for quite a while and cannot seem to find a solution for this find/replace scenario. Perhaps I'm getting rusty. I have a file that contains a number of metrics (exactly 3 fields per line) from a few appliances that are collected in parallel. To identify the... (3 Replies)
Discussion started by: verdepollo
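The snippet cuts off before showing what a header line looks like, so this sketch assumes headers start with '#'; that marker is pure guesswork and should be replaced with whatever really distinguishes them in the file:

Code:
# Remember the most recent header and prepend it to every data line.
awk '/^#/ { hdr = $0; next } { print hdr, $0 }' metrics.txt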