Delete first 100 lines from a BIG File Post: 302657421

10 More Discussions You Might Find Interesting

1. Solaris

delete first 100 lines rather than zero out of file

Hi experts, on my Solaris 9 system the file /var/adm/messages is growing too fast, about 40 MB in 24 hours, and it keeps logging the messages below: bash-2.05# tail -f messages Nov 9 16:35:38 ME1 last message repeated 1 time Nov 9 16:35:38 ME1 ftpd: wtmpx /var/adm/wtmpx No such file or directory Nov 9...

2. Solaris

delete first 100 lines from a file

I have a file with 28,00,000 rows, the first 80 of which are chunks. I want to delete those 80 lines. I tried tail -f2799920 filename. Is there an efficient way to do this? Thanks in advance.

3. Shell Programming and Scripting

How to delete lines in a file that have duplicates or derive the lines that appear once

Input: a b b c d d I need: a c I know how to get the lines that have duplicates (b d): sort file | uniq -d But I need the opposite of this. I have searched the forum and other places as well, and have found solutions for everything except this variant of the problem.

4. Shell Programming and Scripting

Print #of lines after search string in a big file

I have a command which prints #lines after and before the search string in the huge file nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r;print;c=a}b{r=$0}' b=0 a=10 s="STRING1" FILE The file is 5 gig big. It works great and prints 10 lines after the lines which contains search string in...

5. Shell Programming and Scripting

Re: Deleting lines from big file.

Hi, I have a big (2.7 GB) text file. Each line uses '|' as a separator between columns. I want to delete the lines that contain text like '|0|0|0|0|0'. I tried: sed '/|0|0|0|0|0/d' test.txt Unfortunately, it scans the file but does nothing. file content sample:...

6. UNIX for Advanced & Expert Users

In a huge file, Delete duplicate lines leaving unique lines

Hi All, I have a huge file (4GB) which has duplicate lines. I want to delete the duplicate lines, leaving the unique lines. Sort, uniq, awk '!x++' are not working because they run out of buffer space. I don't know if this works: I want to read each line of the file in a for loop, and want to...

7. Shell Programming and Scripting

Delete rows from big file

Hi all, I have a big file (about 6 million rows) and I have to delete the occurrences of entries stored in a small file (about 9000 rows). I have tried this: while read line do grep -v $line big_file > ok_file.tmp mv ok_file.tmp big_file done < small_file It works, but is very slow. How...

8. UNIX for Dummies Questions & Answers

Delete records from a big file based on some condition

Hi, to load a big file into a table, I have to make sure that all rows in the file have the same number of columns. So if my file contains any rows whose column count is not equal to 6, I need to delete them. The delimiter is a space and columns are optionally enclosed by "". This can be ...

9. Shell Programming and Scripting

Want to extract certain lines from big file

Hi All, I am trying to get some lines from a file. I did it with a while-do loop, but since the files are huge it takes a long time, and now I want to make it faster. The file will have 1 million lines. The format is like below. ##transaction, , , ,blah, blah...

10. UNIX for Beginners Questions & Answers

How to copy only some lines from very big file?

Dear all, I have been stuck on this problem for some days. I have a very big file that cannot be opened with the vi command. There are 200 loops in this file, and each loop has one line like this: GWA quasiparticle energy with Z factor (eV) And I need the 98 lines that follow this line. Is...

LEARN ABOUT DEBIAN

pdl::stats

Stats(3pm)						User Contributed Perl Documentation						Stats(3pm)

NAME

       PDL::Stats - a collection of statistics modules in Perl Data Language, with a quick-start guide for non-PDL people.

VERSION

       Version 0.6.2

DESCRIPTION

       Loads modules named below, making the functions available in the current namespace.

       Properly formatted documentation is online at http://pdl-stats.sf.net

SYNOPSIS

	   use PDL::LiteF;	  # loads fewer modules
	   use PDL::NiceSlice;	  # preprocessor for easier pdl indexing syntax

	   use PDL::Stats;

	   # Is equivalent to the following:

	   use PDL::Stats::Basic;
	   use PDL::Stats::GLM;
	   use PDL::Stats::Kmeans;
	   use PDL::Stats::TS;

	   # and the following, if installed:

	   use PDL::Stats::Distr;
	   use PDL::GSL::CDF;

QUICK-START FOR NON-PDL PEOPLE
       Enjoy PDL::Stats without having to dive into PDL; just get your feet a little wet. With three key words, two concepts, and an icing on
       the cake, you should be well on your way.

   pdl
       The magic word that puts PDL::Stats at your disposal. pdl creates a PDL numeric data object (a pdl, pronounced "piddle" :/ ) from a perl
       array or array ref. All PDL::Stats methods, unless meant for regular perl arrays, can then be called from the data object.

	   my @y = 0..5;

	   my $y = pdl @y;

	   # a simple function

	   my $stdv = $y->stdv;

	   # you can skip the intermediate $y

	   $stdv = stdv( pdl @y );

	   # a more complex method, skipping intermediate $y

	   my @x1 = qw( y y y n n n );
	   my @x2 = qw( 1 0 1 0 1 0 );

	   # do a two-way analysis of variance with y as DV and x1 x2 as IVs

	   my %result = pdl(@y)->anova( \@x1, \@x2 );
	   print "$_\t$result{$_}\n" for (sort keys %result);

       If you have a list of lists, ie an array of array refs, pdl will create a multi-dimensional data object.

	   my @a = ( [1,2,3,4], [0,1,2,3], [4,5,6,7] );

	   my $a = pdl @a;

	   print $a . $a->info;

	   # here's what you will get

	   [
	    [1 2 3 4]
	    [0 1 2 3]
	    [4 5 6 7]
	   ]
	   PDL: Double D [4,3]

       PDL::Stats puts observations in the first dimension and variables in the second dimension, ie pdl [obs, var]. In PDL::Stats the above
       example represents 4 observations on 3 variables.

	   # you can do all kinds of fancy stuff on such a 2D pdl.

	   my %result = $a->kmeans( {NCLUS=>2} );
	   print "$_\t$result{$_}\n" for (sort keys %result);

       Make sure the array of array refs is rectangular. If the array refs are of unequal sizes, pdl will pad the shorter ones with 0s to match
       the longest list.
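
       To catch unequal rows before they get padded, one option is to check the shape first. The following is a minimal sketch in plain
       Perl; is_rectangular is a hypothetical helper written for this illustration, not part of PDL or PDL::Stats.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL or PDL::Stats): returns 1 when
# every row of an array of array refs has the same number of elements,
# and 0 otherwise.
sub is_rectangular {
    my @rows = @_;
    return 1 unless @rows;               # no rows: nothing to mismatch
    my $width = scalar @{ $rows[0] };
    for my $row (@rows) {
        return 0 if scalar(@$row) != $width;
    }
    return 1;
}

my @good   = ( [1,2,3,4], [0,1,2,3], [4,5,6,7] );
my @ragged = ( [1,2,3,4], [0,1],     [4,5,6,7] );

print "good is ",   ( is_rectangular(@good)   ? "rectangular" : "ragged" ), "\n";
print "ragged is ", ( is_rectangular(@ragged) ? "rectangular" : "ragged" ), "\n";
```

       Running the check before pdl makes a short row show up as an error in your own code rather than as extra 0s in the data.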

   info
       Tells you the data type (yes pdls are typed, but you shouldn't have to worry about it here*) and dimensionality of the pdl, as seen in the
       above example. I find it a big help for my sanity to keep track of the dimensionality of a pdl. As mentioned above, PDL::Stats uses 2D pdl
       with observation x variable dimensionality.

       *pdl uses double precision by default. If you are working with things like epoch time, then you should probably use pdl(long, @epoch) to
       maintain the precision.

   list
       Come back to perl reality from the PDL wonderland. list turns a pdl data object into a regular perl list. Caveat: list produces a flat
       list. The dimensionality of the data object is lost.
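
       If the shape is needed again after list, it has to be tracked separately. The sketch below is plain Perl, not PDL. It assumes the
       flat list comes back with the first dimension varying fastest, so that the [4,3] example above would arrive as
       1 2 3 4 0 1 2 3 4 5 6 7; unflatten is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: split a flat list back into rows of $n elements.
# $flat is an array ref; $n is the size of the first dimension.
sub unflatten {
    my ($flat, $n) = @_;
    die "row length must be positive\n" if $n < 1;
    die "list length is not a multiple of $n\n" if @$flat % $n;
    my @copy = @$flat;                   # leave the caller's list untouched
    my @rows;
    push @rows, [ splice( @copy, 0, $n ) ] while @copy;
    return @rows;
}

# Assumed flat order for the [4,3] example: first dimension fastest.
my @flat = ( 1,2,3,4, 0,1,2,3, 4,5,6,7 );
my @rows = unflatten( \@flat, 4 );
print "[@$_]\n" for @rows;
```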

   Signature
       This is not a function, but a concept. You will see something like this frequently in the pod:

	   stdv

	     Signature: (a(n); float+ [o]b())

       The signature tells you what the function expects as input and what kind of output it produces. a(n) means it expects a 1D pdl with n
       elements; [o] is for output, b() means it's a scalar. So stdv will take your 1D list and give back a scalar. float+ you can ignore; but if
       you insist, it means the output is at float or double precision. The name a or b or c is not important. What's important is the thing in
       the parentheses.

	   corr

	     Signature: (a(n); b(n); float+ [o]c())

       Here the function corr takes two inputs, two 1D pdls with the same number of elements, and gives back a scalar.

	   t_test

	     Signature: (a(n); b(m); float+ [o]t(); [o]d())

       Here the function t_test can take two 1D pdls of unequal size (n==m is certainly fine), and give back two scalars, t-value and degrees of
       freedom. Yes we accommodate t-tests with unequal sample sizes.

	   assign

	     Signature: (data(o,v); centroid(c,v); byte [o]cluster(o,c))

       Here is one of the most complicated signatures in the package. This is a function from Kmeans. assign takes data of observation x variable
       dimensions, and a centroid of cluster x variable dimensions, and returns an observation x cluster membership pdl (indicated by 1s and 0s).

       Got the idea? Then we can see how PDL does its magic :)

   Threading
       Another concept. The first thing to know is that threading is optional.

       PDL threading means automatically repeating the operation on extra elements or dimensions fed to a function. For a function with a
       signature like this

	   gsl_cdf_tdist_P

	     Signature: (double x(); double nu();  [o]out())

       the signature says that it takes two scalars as input, and returns a scalar as output. If you need to look up the p-values for a list of
       t's, with the same degrees of freedom 19,

	   my @t = ( 1.65, 1.96, 2.56 );

	   my $p = gsl_cdf_tdist_P( pdl(@t), 19 );

	   print $p . "\n" . $p->info;

	   # here's what you will get

	   [0.94231136 0.96758551 0.99042586]
	   PDL: Double D [3]

       The same function is repeated on each element in the list you provided. If you had different degrees of freedom for the t's,

	   my @df = (199, 39, 19);

	   $p = gsl_cdf_tdist_P( pdl(@t), pdl(@df) );

	   print $p . "\n" . $p->info;

	   # here's what you will get

	   [0.94973979 0.97141553 0.99042586]
	   PDL: Double D [3]

       The df's are automatically matched with the t's to give you the results.
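
       The matching described above can be pictured as an explicit loop. The following is only a sketch in plain Perl of the pairing rule,
       not how PDL implements threading: thread2 is a hypothetical helper, and simple arithmetic stands in for gsl_cdf_tdist_P.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a two-argument function element by element.
# Each input is either a plain scalar or an array ref. A scalar (or a
# one-element list) is reused for every element, which mimics the
# "same degrees of freedom for all t's" case.
sub thread2 {
    my ($f, $x, $y) = @_;
    my @x  = ref $x ? @$x : ($x);
    my @y  = ref $y ? @$y : ($y);
    my $nx = scalar @x;
    my $ny = scalar @y;
    my $n  = $nx > $ny ? $nx : $ny;
    die "sizes do not match\n"
        if ( $nx != 1 && $nx != $n ) || ( $ny != 1 && $ny != $n );
    return map {
        $f->( $x[ $nx == 1 ? 0 : $_ ], $y[ $ny == 1 ? 0 : $_ ] )
    } 0 .. $n - 1;
}

my $add = sub { $_[0] + $_[1] };

# One list and one scalar: the scalar is paired with every element.
my @r1 = thread2( $add, [ 1, 2, 3 ], 10 );
print "@r1\n";

# Two lists of equal size: elements are paired by position.
my @r2 = thread2( $add, [ 1, 2, 3 ], [ 10, 20, 30 ] );
print "@r2\n";
```

       Lists of different sizes (other than size 1) are rejected here; that is a choice of this sketch.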

       An example of threading thru extra dimension(s):

	   stdv

	     Signature: (a(n); float+ [o]b())

       If the input is 2D, say because you want to compute the stdv for each of the 3 variables,

	   my @a = ( [1,1,3,4], [0,1,2,3], [4,5,6,7] );

	   # pdl @a is pdl dim [4,3]

	   my $sd = stdv( pdl @a );

	   print $sd . "\n" . $sd->info;

	   # this is what you will get

	   [ 1.2990381	 1.118034   1.118034]
	   PDL: Double D [3]

       Here the function was given an input with an extra dimension of size 3, so it repeats the stdv operation on the extra dimension 3 times,
       and gives back a 1D pdl of size 3.
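
       Written out by hand, the repetition looks like the plain-Perl sketch below. It is an illustration only, not the PDL::Stats
       implementation; stdv_n is a hypothetical helper, and dividing by n rather than n - 1 is an assumption chosen because it reproduces
       the numbers shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: standard deviation of a list, dividing by n.
# The divisor is an assumption made to match the output shown above.
sub stdv_n {
    my @v = @_;
    my $n = scalar @v;
    die "need at least one value\n" unless $n;
    my $mean = 0;
    $mean += $_ / $n for @v;
    my $ss = 0;                          # sum of squared deviations
    $ss += ( $_ - $mean )**2 for @v;
    return sqrt( $ss / $n );
}

my @a = ( [1,1,3,4], [0,1,2,3], [4,5,6,7] );

# One call per inner list, i.e. once for each of the 3 variables.
my @sd = map { stdv_n(@$_) } @a;
printf "%.7f\n", $_ for @sd;
```

       This prints 1.2990381, 1.1180340 and 1.1180340, one per line.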

       Threading works for an arbitrary number of dimensions, but it's best to refrain from higher dim pdls unless you have already decided to
       become a PDL wiz / witch.

       Not all PDL::Stats methods thread. As a rule of thumb, if a function has a signature attached to it, it threads.

   perldl
       Essentially a perl shell with "use PDL;" at start up. Comes with the PDL installation. Very handy to try out pdl operations, or just plain
       perl. print is shortened to p to avoid injury from excessive typing. my goes out of scope at the end of (multi)line input, so mostly you
       will have to drop the good practice of my here.

   For more info
       PDL::Impatient

AUTHOR

       ~~~~~~~~~~~~ ~~~~~ ~~~~~~~~ ~~~~~ ~~~ `` ><(((">

       Copyright (C) 2009-2012 Maggie J. Xiong <maggiexyz users.sourceforge.net>

       All rights reserved. There is no warranty. You are allowed to redistribute this software / documentation as described in the file COPYING
       in the PDL distribution.

perl v5.14.2							    2012-06-04								Stats(3pm)
