File - reading - Performance improvement


 
# 8  
Old 05-23-2008
Hi
This helps.
My concern, though, is that I would need a while loop that reads characters in bulk until it reaches a "\n" character, since my aim is to read the file line by line.

Thanks for the idea.

Regards
Dhana
# 9  
Old 06-04-2008
There are two sets of functions for reading data:

open/read/write/close

that operate on file descriptors (integer 'handles'), and

fopen/fread/fwrite/fgets/fclose

that operate on FILE * 'streams'.

The big advantage of the streams is that they are buffered in user space, whereas the file descriptor functions are not. For the unbuffered functions, every call to read() is a system call into the kernel (which may in turn have to go out to the physical disk), so reading a few bytes at a time is expensive.

With the buffered functions, the library allocates a block of memory internally (I believe 8 KB, but the size is implementation-dependent), and a call to fread() or fgets() only goes to the kernel when there isn't enough data already in the buffer. This is much faster for small reads.

By the way, you can change the buffer size with setvbuf() (setbuf() only lets you supply a buffer of the fixed size BUFSIZ), and you can use fgets() to get the next line (up to and including the next \n) rather than a fixed number of characters.
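
A minimal sketch of the fgets() approach, assuming a caller-chosen stream buffer size. The name count_lines and the 1024-byte line buffer are made up for illustration. The buffer is handed to setvbuf() explicitly because, when NULL is passed instead, the library is allowed to ignore the requested size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the newline-terminated lines in `path`,
   reading through a stdio stream with a `bufsize`-byte buffer
   (bufsize is assumed to be greater than 0).
   Returns -1 if the file cannot be opened. */
long count_lines(const char *path, size_t bufsize)
{
    char line[1024];            /* assumed size of one fgets() piece */
    long lines = 0;
    char *iobuf;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;

    /* setvbuf() must be called before any other operation on the stream.
       If malloc() or setvbuf() fails we simply keep the default buffer. */
    iobuf = malloc(bufsize);
    if (iobuf != NULL && setvbuf(fp, iobuf, _IOFBF, bufsize) != 0)
        fprintf(stderr, "setvbuf failed, using the default buffer\n");

    while (fgets(line, sizeof line, fp) != NULL)
    {
        /* fgets() stops after '\n' or when `line` is full, so a very long
           line arrives in pieces; count only the piece that ends it. */
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n')
            lines++;
    }

    fclose(fp);
    free(iobuf);                /* only after fclose(): the stream used it */
    return lines;
}
```

Calling count_lines("somefile", 65536) would read the file through a 64 KB buffer. A last line with no trailing \n is not counted in this sketch.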

To get the fastest possible speed, as mentioned above, you would have to use a big buffer, read a large chunk of the file at once and then go through it looking for line ends. This avoids an extra copy of the data: with stdio, each line is copied from the kernel into the stream's buffer and then out again into your own line buffer.
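
The "go through it looking for line ends" step could look roughly like this. It is only a sketch: scan_lines is a made-up name, and it assumes the chunk has already been read into memory by read() or similar.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan `len` bytes at `buf` and count the complete
   (newline-terminated) lines. *consumed receives the number of bytes those
   lines cover; anything after that is a partial line that the caller must
   carry over and prepend to the next chunk. */
size_t scan_lines(const char *buf, size_t len, size_t *consumed)
{
    const char *p = buf;
    const char *end = buf + len;
    size_t lines = 0;

    while (p < end)
    {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL)
            break;              /* no more complete lines in this chunk */

        /* The line is the bytes [p, nl); process it in place, no copy. */
        lines++;
        p = nl + 1;
    }

    *consumed = (size_t)(p - buf);
    return lines;
}
```

The caller would move the unconsumed tail to the front of its buffer and then read the next chunk in after it, so that a line split across two chunks is seen whole.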

But I'd try just using fgets() first as it probably is fast enough.
# 10  
Old 06-04-2008
Or, whatever mode the file is opened in, make the stream fully buffered with setvbuf(); that turns on buffering.
# 11  
Old 06-06-2008
Stevens' book Advanced Programming in the UNIX Environment has a table showing read performance on files for different buffer sizes.

Since you are returning lines, somewhere down inside the C++ stdio module something like fgets is being called, and that in turn calls read() to fill a buffer. Stevens' table shows that a buffer size of 4096 is probably close to optimal. Other examples show that using struct statvfs.f_frsize (the block size of the filesystem in question) as the buffer size also helps.

See man setvbuf.
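
For reference, a small sketch of looking up f_frsize with statvfs(). The function name fs_block_size is made up, and returning 0 on failure is just one possible convention.

```c
#include <sys/statvfs.h>

/* Hypothetical helper: return the fundamental block size (f_frsize) of the
   filesystem that holds `path`, or 0 if statvfs() fails. */
unsigned long fs_block_size(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return 0;               /* caller should fall back to a default */

    return vfs.f_frsize;
}
```

A caller could use the result as the read size and fall back to 4096 when it gets 0.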

The other components of speed are the I/O request queue length, on-board disk caching, and how "far above" the native read call your code operates. The first two are system related. If you call a low-level read routine like the one below directly and parse out your own lines, it will probably speed things up. Use 4096 or f_frsize as the number of bytes to read.

This is taken from M. Rochkind's book - example:

Code:
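#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>

/* Read up to nbyte bytes from fd into buf, restarting after EINTR and
   continuing after short reads. Returns the number of bytes read
   (less than nbyte only at end of file), or -1 on error.
   buf is not NUL-terminated; the caller does that if it needs a string. */
ssize_t readall(int fd, void *buf, size_t nbyte)
{
    size_t nread = 0;
    ssize_t n;

    while (nread < nbyte)
    {
        n = read(fd, (char *)buf + nread, nbyte - nread);
        if (n == -1)
        {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        nread += (size_t)n;
    }
    return (ssize_t)nread;
}

void foo(void)
{
    char buf[4096 + 1];         /* one extra byte for the terminating '\0' */
    ssize_t result;
    int fd = open("somefile", O_RDONLY);

    if (fd == -1)
    {
        perror("file open error");
        exit(1);
    }

    result = readall(fd, buf, 4096);
    if (result == -1)
    {
        perror("file I/O error");
        exit(1);
    }

    buf[result] = '\0';         /* terminate so that printf("%s") is safe */
    printf("%s", buf);
    close(fd);
}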
