Alternative for wc -l


 
# 1  
Old 10-15-2010

Hi techies,

This is my first post here.

I am facing a serious performance problem counting the number of lines in a file. The input files I receive are 10 to 15 GB in size, sometimes more, and I load them into a database.

I have used wc -l to confirm that the loader has loaded all the data, but the counting operation alone takes more than 45 to 50 minutes. Could someone suggest another way to get the line count of a file?

Please note that I can't use sed because of the coding standards followed here.

Any other fast workaround would be really helpful and appreciated.
# 2  
Old 10-15-2010
Code:
awk 'END{print NR}' yourfile

# 3  
Old 10-15-2010
Quote:
awk 'END{print NR}' yourfile
Out of interest, I tried this on a 5 million record text file and the result came out in exponential notation. The result from "wc -l" was correct.
Does anybody know how to get awk to count a large number of records?


@rajesh_2383
How many records in a typical file?
Are they fixed length records? If so, we could calculate the number of records from the file size.
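
To sketch the fixed-length idea: if every record has the same length, the count is simply the file size divided by the record length, so nothing has to be scanned for newlines. The function name count_fixed_records is made up for illustration, and the sketch assumes the record length you pass includes the trailing newline (81 for 80-character records with Unix line endings). Whether wc -c has to read the whole file to get the size depends on the implementation.

```shell
# Hypothetical helper: derive the record count from the file size.
# Usage: count_fixed_records FILE RECORD_LENGTH
# Assumption: every record is exactly RECORD_LENGTH bytes, newline included.
count_fixed_records() {
    file=$1
    reclen=$2

    # Reject a missing, non-numeric, or non-positive record length.
    [ "$reclen" -gt 0 ] 2>/dev/null || return 1
    [ -r "$file" ] || return 1

    # Some wc implementations pad the number with spaces;
    # arithmetic expansion strips the padding.
    size=$(wc -c < "$file") || return 1
    size=$((size))

    # A remainder means the records are not all RECORD_LENGTH bytes long.
    if [ $((size % reclen)) -ne 0 ]; then
        echo "size $size is not a multiple of $reclen" >&2
        return 1
    fi

    echo $((size / reclen))
}
```

If the function fails with a remainder, the file is not strictly fixed length and the count has to be done by reading it.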

What database engine is this? It may be quicker to write a count program in a high level language.
# 4  
Old 10-15-2010
@Methyl
Maybe use printf instead?

I think wc is just optimized for this task. Anyway here is a little C program you can compile with your favourite C compiler, for example:
Code:
gcc -Wall -o wcc wcc.c
# and then issue
./wcc yourfile

and try it out.

Code:
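#include <stdio.h>
#include <stdlib.h>

/* Buffer size for fgets(); a line longer than MAX-1 bytes is counted more than once. */
#define MAX 2048

int main(int argc, char **argv)
{
        char zbuf[MAX];
        long int z = 0;
        FILE *fp;

        /* Without this check a missing argument passes NULL to fopen(). */
        if (argc < 2)
        {
                fprintf(stderr, "Usage: %s file\n", argv[0]);
                exit(EXIT_FAILURE);
        }

        fp = fopen(argv[1], "r");
        if (!fp)
        {
                fprintf(stderr, "Error: File %s could not be opened.\n", argv[1]);
                exit(EXIT_FAILURE);
        }

        /* Each successful fgets() call is counted as one line. */
        while (fgets(zbuf, MAX, fp))
        {
                z++;
        }

        fclose(fp);
        printf("Line count: %li\n", z);
        return EXIT_SUCCESS;
}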

I have set the maximum line length to 2048; increase it if that is not sufficient for your file. It may be worth a try. I am no C programmer, so maybe someone has an idea for improving it.

It could also be that your hardware or OS is the bottleneck, but that is just a guess.
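
One rough way to check that guess is to time a plain read of the file with no counting at all. If the plain read takes about as long as wc -l did, the disk or filesystem is the limit and no counting tool will beat it by much. In the sketch below, raw_read_seconds is a hypothetical helper name, and it assumes date +%s is available (it is on GNU and BSD systems; older Unix systems may lack it).

```shell
# Hypothetical helper: print the whole seconds needed to read FILE
# end to end with no processing, as a rough lower bound for any counter.
# Assumption: "date +%s" (seconds since the epoch) is supported.
raw_read_seconds() {
    [ -r "$1" ] || return 1

    start=$(date +%s)
    cat "$1" > /dev/null || return 1
    end=$(date +%s)

    echo $((end - start))
}

# Example (commented out): compare with the time wc -l takes.
# raw_read_seconds yourfile
```

Run it on a file that is not already in the page cache, otherwise the number will be misleadingly small.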

Last edited by zaxxon; 10-15-2010 at 10:01 AM.. Reason: changed printf to long integer according to definition of z
# 5  
Old 10-15-2010
Have you tried this:
Code:
awk 'END{printf ("%d\n", NR)}' yourfile

# 6  
Old 10-15-2010
Quote:
awk 'END{printf ("%d\n", NR)}' yourfile
Brilliant. Took 37 seconds for 5 million 80-character records.
Let's see how the O/P gets on.
# 7  
Old 10-15-2010
For me:
Code:
# echo "10000000000000000000" | awk '{printf ("%d\n", $0)}'
10000000000000000000
# echo "100000000000000000000" | awk '{printf ("%d\n", $0)}'
1e+20

Maybe there is another way that is also fast!
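
One more candidate, untested on files of this size: delete everything except newline characters with tr and count the bytes that remain. This counts newline bytes, so it should give the same number as wc -l, including for a file whose last line has no trailing newline. count_newlines is just a name made up for this sketch, and there is no claim here that it is faster than wc -l.

```shell
# Hypothetical helper: count the newline bytes in FILE using tr and wc -c.
count_newlines() {
    [ -r "$1" ] || return 1

    # tr -cd '\n' deletes every byte that is not a newline.
    n=$(tr -cd '\n' < "$1" | wc -c) || return 1

    # Strip any padding some wc implementations add.
    echo $((n))
}

# Example (commented out):
# count_newlines yourfile
```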
