quickest way to get the total number of lines in a file Post: 302708251


Shell Programming and Scripting: post 302708251 by SkySmart on Monday 1st of October 2012 09:50:54 AM


Quote:

Originally Posted by jim mcnamara

Sky Smart - can you cite a verified reference as to why reading a file from record 1 to EOF is NOT the most efficient method for carriage control files? rdrtx1's sample code does that and so does wc -l.

I'll answer:
No such valid reference exists. You have to count the number of \n characters to get a line count. The only other possibility is for a fixed length record file. In that case you call stat, ls, or some code of your own to get the number of bytes (reading file metadata: struct stat st_size) and then do integer division: bytes/recsz.

One other 'reliable' way is to call x=ftell() at the end of a file when you know the file has finished being written and divide x/recsz - again for fixed record length files. This is an even less efficient way to do what stat does.

NOTHING else exists. In other words: how can you know how many \n characters exist in the file?
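The fixed-length case described above can be sketched in shell. This is only an illustration, not code from the thread: count_fixed_records is a hypothetical helper name, and recsz is assumed to be the record length in bytes including the trailing newline. It gets the size from wc -c, which some implementations read from file metadata for regular files, though that is not guaranteed everywhere.

```sh
#!/bin/sh
# Hypothetical helper: number of records in a fixed-length-record file,
# computed from the file size alone (no scan for \n characters).
#   $1 = file name
#   $2 = record size in bytes, including the trailing newline (assumption)
count_fixed_records() {
    file=$1
    recsz=$2
    # With redirected input, wc -c prints only the byte count.
    bytes=$(wc -c < "$file")
    # Integer division: bytes/recsz
    echo $((bytes / recsz))
}
```

For example, a file of three 5-byte records (4 characters plus \n) is 15 bytes, so count_fixed_records file 5 prints 3. If the records are not all the same length the answer is meaningless, and you are back to counting \n characters.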

when you run a "wc -l" on a file that big, it takes a while to get the total line count. i understand that the entire file must be read in order to get the total lines.

i'm also aware that in UNIX there is more than one way to get something done. on some linux systems a "grep -P" will get you what you want a lot faster than any other utility can. on others, the -P option is not available.

overall, i'm more concerned about speed and what the quickest way is to get total line count on a file that big.
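One way to settle the speed question on a given system is to time a few candidates against the same file. The sketch below is an assumption-laden illustration rather than a benchmark from the thread: the function names are made up, bigfile is a placeholder, and it uses grep -c '' instead of grep -P so that it does not depend on the -P option being available. All three still read the entire file.

```sh
#!/bin/sh
# Three ways to get a line count (hypothetical wrapper names).
count_wc()   { wc -l < "$1"; }                 # counts \n characters
count_grep() { grep -c '' "$1"; }              # counts lines matching the empty pattern
count_awk()  { awk 'END { print NR }' "$1"; }  # counts records read by awk

# Example comparison, run by hand (bigfile is a placeholder name):
#   time wc -l < bigfile
#   time grep -c '' bigfile
#   time awk 'END { print NR }' bigfile
```

Run each command more than once, since the first run may be reading from disk while later runs hit the file cache. Also note that the counts can differ by one when the last line has no trailing \n: wc -l counts only \n characters, while grep -c '' and awk count the unterminated last line too.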

SkySmart



LEARN ABOUT DEBIAN

bup-margin

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME

       bup-margin - figure out your deduplication safety margin

SYNOPSIS

       bup margin [options...]

DESCRIPTION

       bup margin  iterates  through  all  objects  in	your  bup repository, calculating the largest number of prefix bits shared between any two
       entries.  This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45.  That  means  a  46-bit
       hash  would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
       its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects.  Since SHA-1 hashes have 160 bits,
       that  leaves 115 bits of margin.  Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
       with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see	if
       you're getting dangerously close to 160 bits.

OPTIONS

       --predict
	      Guess  the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
	      from the guess.  This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
	      don't use .midx files, use only .idx files.  This is only really useful when used with --predict.

EXAMPLE

	      $ bup margin
	      Reading indexes: 100.00% (1612581/1612581), done.
	      40
	      40 matching prefix bits
	      1.94 bits per doubling
	      120 bits (61.86 doublings) remaining
	      4.19338e+18 times larger is possible

	      Everyone on earth could have 625878182 data sets
	      like yours, all in one repository, and we would
	      expect 1 object collision.

	      $ bup margin --predict
	      PackIdxList: using 1 index.
	      Reading indexes: 100.00% (1612581/1612581), done.
	      915 of 1612581 (0.057%)

SEE ALSO

       bup-midx(1), bup-save(1)

BUP

       Part of the bup(1) suite.

AUTHORS

       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-															     bup-margin(1)
