Delete first 100 lines from a BIG File
Post 302657421 by alister on Sunday 17th of June 2012 03:29:18 PM
Excellent observation, drl.

Regards,
Alister
 

10 More Discussions You Might Find Interesting

1. Solaris

delete first 100 lines rather than zero out of file

Hi experts, on my Solaris 9 system the file /var/adm/messages is growing too fast, by about 40 MB every 24 hours, and it always gives the messages below: bash-2.05# tail -f messages Nov 9 16:35:38 ME1 last message repeated 1 time Nov 9 16:35:38 ME1 ftpd: wtmpx /var/adm/wtmpx No such file or directory Nov 9... (7 Replies)
Discussion started by: thepurple
7 Replies

2. Solaris

delete first 100 lines from a file

I have a file with 28,00,000 rows, in which the first 80 lines are chunks. I want to delete those 80 lines. I tried tail -f2799920 filename. Is there an efficient way to do this? Thanks in advance. (7 Replies)
Discussion started by: salaathi
7 Replies

3. Shell Programming and Scripting

How to delete lines in a file that have duplicates or derive the lines that appear once

Input: a b b c d d I need: a c I know how to get this (the lines that have duplicates) : b d sort file | uniq -d But I need the opposite of this. I have searched the forum and other places as well, but have found solutions for everything except this variant of the problem. (3 Replies)
Discussion started by: necroman08
3 Replies

4. Shell Programming and Scripting

Print #of lines after search string in a big file

I have a command which prints #lines after and before the search string in the huge file nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r;print;c=a}b{r=$0}' b=0 a=10 s="STRING1" FILE The file is 5 gig big. It works great and prints 10 lines after the lines which contains search string in... (8 Replies)
Discussion started by: prash184u
8 Replies

5. Shell Programming and Scripting

Re: Deleting lines from big file.

Hi, I have a big (2.7 GB) text file. Each line uses '|' as the separator between columns. I want to delete the lines that contain text like '|0|0|0|0|0'. I tried: sed '/|0|0|0|0|0/d' test.txt Unfortunately, it scans the file but does nothing. file content sample:... (4 Replies)
Discussion started by: dipeshvshah
4 Replies

6. UNIX for Advanced & Expert Users

In a huge file, Delete duplicate lines leaving unique lines

Hi All, I have a very large file (4GB) which has duplicate lines. I want to delete the duplicate lines, leaving only unique lines. Sort, uniq, and awk '!x++' are not working because they run out of buffer space. I don't know if this works: I want to read each line of the file in a for loop, and want to... (16 Replies)
Discussion started by: krishnix
16 Replies

7. Shell Programming and Scripting

Delete rows from big file

Hi all, I have a big file (about 6 million rows) and I have to delete the occurrences stored in a small file (about 9000 rows). I have tried this: while read line do grep -v $line big_file > ok_file.tmp mv ok_file.tmp big_file done < small_file It works, but is very slow. How... (2 Replies)
Discussion started by: Tibbeche
2 Replies

8. UNIX for Dummies Questions & Answers

Delete records from a big file based on some condition

Hi, to load a big file into a table, I have to make sure that all rows in the file have the same number of columns. So if my file contains any rows whose column count is not equal to 6, I need to delete them. The delimiter is a space and columns are optionally enclosed by "". This can be ... (1 Reply)
Discussion started by: hemantraijain
1 Reply

9. Shell Programming and Scripting

Want to extract certain lines from big file

Hi All, I am trying to get some lines from a file. I did it with a while-do loop, but since the files are huge it is taking a long time, and now I want to make it faster. The requirement is that the file will have 1 million lines. The format is like below. ##transaction, , , ,blah, blah... (38 Replies)
Discussion started by: mad man
38 Replies

10. UNIX for Beginners Questions & Answers

How to copy only some lines from very big file?

Dear all, I have been stuck on this problem for some days. I have a very big file, which cannot be opened with the vi command. There are 200 loops in this file, and each loop has one line like this: GWA quasiparticle energy with Z factor (eV) And I need the 98 lines that follow this line. Is... (6 Replies)
Discussion started by: phamnu
6 Replies
Stats(3pm)						User Contributed Perl Documentation						Stats(3pm)

NAME
    PDL::Stats - a collection of statistics modules in Perl Data Language, with a quick-start guide for non-PDL people.

VERSION
    Version 0.6.2

DESCRIPTION
    Loads modules named below, making the functions available in the current namespace.

    Properly formatted documentation is online at http://pdl-stats.sf.net

SYNOPSIS
        use PDL::LiteF;        # loads fewer modules
        use PDL::NiceSlice;    # preprocessor for easier pdl indexing syntax
        use PDL::Stats;

        # Is equivalent to the following:

        use PDL::Stats::Basic;
        use PDL::Stats::GLM;
        use PDL::Stats::Kmeans;
        use PDL::Stats::TS;

        # and the following if installed;

        use PDL::Stats::Distr;
        use PDL::GSL::CDF;

QUICK-START FOR NON-PDL PEOPLE
    Enjoy PDL::Stats without having to dive into PDL, just wet your feet a little. Three key words, two concepts, and an icing on the cake, and you should be well on your way there.

  pdl
    The magic word that puts PDL::Stats at your disposal. pdl creates a PDL numeric data object (a pdl, pronounced "piddle" :/ ) from a perl array or array ref. All PDL::Stats methods, unless meant for a regular perl array, can then be called from the data object.

        my @y = 0..5;

        my $y = pdl @y;

        # a simple function

        my $stdv = $y->stdv;

        # you can skip the intermediate $y

        my $stdv = stdv( pdl @y );

        # a more complex method, skipping intermediate $y

        my @x1 = qw( y y y n n n );
        my @x2 = qw( 1 0 1 0 1 0 );

        # do a two-way analysis of variance with y as DV and x1 x2 as IVs

        my %result = pdl(@y)->anova( \@x1, \@x2 );
        print "$_\t$result{$_}\n" for (sort keys %result);

    If you have a list of lists, i.e. an array of array refs, pdl will create a multi-dimensional data object.

        my @a = ( [1,2,3,4], [0,1,2,3], [4,5,6,7] );

        my $a = pdl @a;

        print $a . $a->info;

        # here's what you will get

        [
         [1 2 3 4]
         [0 1 2 3]
         [4 5 6 7]
        ]
        PDL: Double D [4,3]

    PDL::Stats puts observations in the first dimension and variables in the second dimension, i.e. pdl [obs, var]. In PDL::Stats the above example represents 4 observations on 3 variables.

        # you can do all kinds of fancy stuff on such a 2D pdl.

        my %result = $a->kmeans( {NCLUS=>2} );
        print "$_\t$result{$_}\n" for (sort keys %result);

    Make sure the array of array refs is rectangular. If the array refs are of unequal sizes, pdl will pad it out with 0s to match the longest list.

  info
    Tells you the data type (yes pdls are typed, but you shouldn't have to worry about it here*) and dimensionality of the pdl, as seen in the above example. I find it a big help for my sanity to keep track of the dimensionality of a pdl. As mentioned above, PDL::Stats uses 2D pdl with observation x variable dimensionality.

    *pdl uses double precision by default. If you are working with things like epoch time, then you should probably use pdl(long, @epoch) to maintain the precision.

  list
    Come back to the perl reality from the PDL wonder land. list turns a pdl data object into a regular perl list. Caveat: list produces a flat list. The dimensionality of the data object is lost.

  Signature
    This is not a function, but a concept. You will see something like this frequently in the pod:

        stdv

          Signature: (a(n); float+ [o]b())

    The signature tells you what the function expects as input and what kind of output it produces. a(n) means it expects a 1D pdl with n elements; [o] is for output, b() means it's a scalar. So stdv will take your 1D list and give back a scalar. float+ you can ignore; but if you insist, it means the output is at float or double precision. The name a or b or c is not important. What's important is the thing in the parentheses.

        corr

          Signature: (a(n); b(n); float+ [o]c())

    Here the function corr takes two inputs, two 1D pdls with the same number of elements, and gives back a scalar.

        t_test

          Signature: (a(n); b(m); float+ [o]t(); [o]d())

    Here the function t_test can take two 1D pdls of unequal size (n==m is certainly fine), and give back two scalars, t-value and degrees of freedom. Yes we accommodate t-tests with unequal sample sizes.

        assign

          Signature: (data(o,v); centroid(c,v); byte [o]cluster(o,c))

    Here is one of the most complicated signatures in the package. This is a function from Kmeans.
    assign takes data of observation x variable dimensions, and a centroid of cluster x variable dimensions, and returns an observation x cluster membership pdl (indicated by 1s and 0s).

    Got the idea? Then we can see how PDL does its magic :)

  Threading
    Another concept. The first thing to know is that threading is optional.

    PDL threading means automatically repeating the operation on extra elements or dimensions fed to a function. For a function with a signature like this

        gsl_cdf_tdist_P

          Signature: (double x(); double nu(); [o]out())

    the signature says that it takes two scalars as input, and returns a scalar as output. If you need to look up the p-values for a list of t's, with the same degrees of freedom 19,

        my @t = ( 1.65, 1.96, 2.56 );

        my $p = gsl_cdf_tdist_P( pdl(@t), 19 );

        print $p . " " . $p->info;

        # here's what you will get

        [0.94231136 0.96758551 0.99042586] PDL: Double D [3]

    The same function is repeated on each element in the list you provided. If you had different degrees of freedom for the t's,

        my @df = (199, 39, 19);

        my $p = gsl_cdf_tdist_P( pdl(@t), pdl(@df) );

        print $p . " " . $p->info;

        # here's what you will get

        [0.94973979 0.97141553 0.99042586] PDL: Double D [3]

    The df's are automatically matched with the t's to give you the results.

    An example of threading through extra dimension(s):

        stdv

          Signature: (a(n); float+ [o]b())

    If the input is 2D, say you want to compute the stdv for each of the 3 variables,

        my @a = ( [1,1,3,4], [0,1,2,3], [4,5,6,7] );

        # pdl @a is pdl dim [4,3]

        my $sd = stdv( pdl @a );

        print $sd . " " . $sd->info;

        # this is what you will get

        [ 1.2990381 1.118034 1.118034] PDL: Double D [3]

    Here the function was given an input with an extra dimension of size 3, so it repeats the stdv operation on the extra dimension 3 times, and gives back a 1D pdl of size 3.

    Threading works for an arbitrary number of dimensions, but it's best to refrain from higher dim pdls unless you have already decided to become a PDL wiz / witch.
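
    To make the last threading example concrete, here is a minimal plain-perl sketch of the loop that threading saves you from writing. It uses no PDL at all. stdv_plain is a hypothetical helper written for this illustration, not a PDL::Stats function, and it assumes the population form of the standard deviation (divide by n), which is the form that reproduces the three numbers shown above.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper (not part of PDL::Stats): population standard
# deviation of a plain perl list, i.e. divide by n rather than n - 1.
sub stdv_plain {
    my @v    = @_;
    my $mean = sum(@v) / @v;
    return sqrt( sum( map { ( $_ - $mean )**2 } @v ) / @v );
}

# Same data as the 2D example above: 3 inner lists of 4 values each.
my @a = ( [1,1,3,4], [0,1,2,3], [4,5,6,7] );

# The explicit repetition that threading does for you:
# one stdv per inner list, collected into a flat list of 3 values.
my @sd = map { stdv_plain(@$_) } @a;

printf "%.7f\n", $_ for @sd;
```

    With a pdl the map disappears: stdv sees the extra dimension and repeats itself. The sketch only mirrors the result for this one case and says nothing about how PDL implements threading internally.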

    Not all PDL::Stats methods thread. As a rule of thumb, if a function has a signature attached to it, it threads.

  perldl
    Essentially a perl shell with "use PDL;" at start up. Comes with the PDL installation. Very handy to try out pdl operations, or just plain perl. print is shortened to p to avoid injury from excessive typing. my goes out of scope at the end of (multi)line input, so mostly you will have to drop the good practice of my here.

  For more info
    PDL::Impatient

AUTHOR
    ~~~~~~~~~~~~ ~~~~~ ~~~~~~~~ ~~~~~ ~~~ `` ><((("> Copyright (C) 2009-2012 Maggie J. Xiong <maggiexyz users.sourceforge.net>

    All rights reserved. There is no warranty. You are allowed to redistribute this software / documentation as described in the file COPYING in the PDL distribution.

perl v5.14.2						2012-06-04						Stats(3pm)