gzcat number of records


 
# 1  
Old 08-23-2012

Hey guys,

I want to do something quite simple, but I can't get it to work no matter what I try. I have a large file, and I usually just run:
Code:
gzcat test.gz | nohup /test/this-script-does-things-to-the-records.pl -> /testdir/tmp_test.txt

But now I need to do it for only the first 100k records. I hope you won't be too hard on a newbie like me.


Last edited by vbe; 08-23-2012 at 11:52 AM.. Reason: code tags
# 2  
Old 08-23-2012
Code:
gzcat test.gz | awk 'FNR<100001' | nohup /test/this-script-does-things-to-the-records.pl -> /testdir/tmp_test.txt

one way
# 3  
Old 08-23-2012
Quote:
Originally Posted by jim mcnamara
Code:
gzcat test.gz | awk 'FNR<100001' | nohup /test/this-script-does-things-to-the-records.pl -> /testdir/tmp_test.txt

That could be wasteful if the file is much larger than 100k lines, since it would still read the file in its entirety. Downstream, the pipeline won't see EOF until awk eventually exits.

I would suggest
Code:
head -n100000

If awk is preferred, perhaps
Code:
awk 'FNR==100001 {exit} 1'
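As a rough sketch of how either filter slots into a gzipped pipeline: the example below builds a small throwaway sample (the sample data, the 100-line cutoff, and the variable names are assumptions for illustration) and uses gzip -dc, which should behave the same as gzcat here.

```shell
# Build a small gzipped sample so the sketch is self-contained (made-up data).
tmpdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 500; i++) print "record " i }' | gzip -c > "$tmpdir/sample.gz"

# head stops after 100 lines; once it exits, the upstream decompressor
# no longer has a reader and stops instead of unpacking the whole file.
head_count=$(gzip -dc "$tmpdir/sample.gz" | head -n 100 | wc -l)

# awk form: exit on reaching line 101; the trailing 1 prints every line before that.
awk_count=$(gzip -dc "$tmpdir/sample.gz" | awk 'FNR == 101 { exit } 1' | wc -l)

echo "head kept $head_count lines, awk kept $awk_count lines"
rm -rf "$tmpdir"
```

In the original pipeline, the filter would sit between gzcat and the Perl script, with everything else left unchanged.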

Regards,
Alister
# 4  
Old 08-24-2012
Thank you very much! head -n100000 works like a charm.

---------- Post updated at 03:06 PM ---------- Previous update was at 09:13 AM ----------

OK, new problem.

I decided that I would like to perform an action on the lines between 100k and 200k (for example). The easy way I tried out

Code:
head -n100 | tail -n200

did not work, and I'm guessing that tail reads the whole file first, so that would be a bad idea (the file is really big). The next thing I'm thinking of is using sed,

something like

Code:
sed -n '100,200 p' /filelocation/file | grep "string im searching for"

but it's kind of slow when the grep is added. Any help would be greatly appreciated.

Last edited by sg3; 08-24-2012 at 09:14 AM..
# 5  
Old 08-24-2012
Code:
sed -n '100000,200000{;p;200000q;}' file

This will quit after printing the 200000th line.
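For comparison, here is a minimal sketch of the same range selection done two ways, scaled down to lines 100-200 of a made-up 1000-line file (the file and variable names are assumptions). It also shows why head -n100 | tail -n200 could not work: head passes on only 100 lines, so tail has nothing beyond them to choose from. The head/tail version needs tail -n +N, which means "start at line N".

```shell
# Made-up 1000-line sample; the range is scaled down from 100k/200k.
tmpfile=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > "$tmpfile"

# sed: print lines 100-200, then quit at line 200 so the rest is never read.
sed_out=$(sed -n '100,200p;200q' "$tmpfile")

# head/tail: keep the first 200 lines, then print from line 100 onward.
ht_out=$(head -n 200 "$tmpfile" | tail -n +100)

[ "$sed_out" = "$ht_out" ] && echo "both commands select the same lines"
rm -f "$tmpfile"
```

Either form stops reading at the end of the range, so the cost depends on how far into the file the range ends, not on the total file size.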
# 6  
Old 08-27-2012
Thanks. I played with this a bit, and here is the part that bugs me: for lines up to, say, the first million it works just fine, even above that, but when I tried to print a single line

Code:
sed -n '177998637,177998638{;p;177998638q;}' /testdir/testfile

it took a long time (I didn't even wait for the result). Is that normal behaviour?
# 7  
Old 08-27-2012
Yes. That command reads your file sequentially up to line 177998637, prints that line and the next one, and then quits. So, to print just those 2 lines, the command still has to read the first 177.99 million lines (a huge file, I have to say).

You could try replacing that sed command with awk, which may improve the time taken, although it still has to read every preceding line:
Code:
awk 'NR==177998637||NR==177998638;NR==177998638{exit}' /testdir/testfile
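If the line numbers change from run to run, the same idea can be wrapped in a small function. This is only a sketch: print_range, its argument order, and the sample file are made up for illustration.

```shell
# print_range is a hypothetical helper name, not something from this thread.
# Usage: print_range START END FILE
print_range() {
    awk -v start="$1" -v end="$2" '
        NR >= start          # no action given, so matching lines are printed
        NR == end { exit }   # stop reading as soon as the last wanted line is out
    ' "$3"
}

# Small made-up sample to try it on.
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 50; i++) print "line " i }' > "$sample"
print_range 7 8 "$sample"
rm -f "$sample"
```

Like the one-liner, this still scans every line before START; it only avoids reading anything after END.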


Last edited by elixir_sinari; 08-27-2012 at 08:00 AM..