Post 302407902 by migurus on Friday 26th of March 2010 09:51:25 PM (Top Forums, Programming)
overhead of fopen/freopen

I always assumed that fopen/freopen is very costly, so when I needed to work with many files within one process I spent extra time implementing a list of FILE * pointers to avoid extra open/reopen calls, but it did not produce any better results.

Here is the task at hand: a huge stream of data comes in through stdin, each line is preceded by an ID, and I need to place each line into its own file named id.log. The IDs do not arrive in random order; they are somewhat grouped.

The original code is very straightforward: read the line, get the ID, form the file name, do fopen/fputs/fclose, loop to the next line. I thought the fopen/fclose was the bottleneck.
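
For reference, the simple approach described above could look roughly like this. It is only a sketch, not the actual original code: the function name append_line is made up for illustration, and the "%04d.log" file name format is assumed to match the one used in the new code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Append one input line to "<id>.log", opening and closing the file
 * for every single line. The ID is taken to be the leading number on
 * the line. Returns 0 on success, -1 on failure.
 * (append_line is a hypothetical name used only in this sketch.) */
int append_line(const char *line)
{
    char name[32];
    FILE *fp;
    int id = atoi(line);

    snprintf(name, sizeof name, "%04d.log", id);
    fp = fopen(name, "a");
    if (fp == NULL)
        return -1;
    if (fputs(line, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    /* fclose flushes the stdio buffer, so every line costs at least
     * one open, one write and one close. */
    return fclose(fp) == 0 ? 0 : -1;
}
```

The main loop would then just call append_line(buf) for every line that fgets reads from stdin.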

So I built an array of {ID / FILE *ptr / counter} entries to keep the last N opened files; should the next ID happen to be in the list, I just reuse the open stream. Otherwise I either fopen a stream for a new entry in the array or, when the array has no more empty slots, freopen the one that has the biggest number of writes. But the results are very close to the original simple approach.

My new code
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <ctype.h>
#include <errno.h>
#include <sys/time.h>
#include <sys/resource.h>
 
typedef struct  {
        int     mid;
        int     cnt;
        FILE    *fp;
} MFP;
static  MFP     *mfp = NULL;
static  int     mfpcnt = 0;
 
int main(int argc, char *argv[])
{
  int   mid, n, found, maxcnt, empty, curr;
  char  buf[1024], fullname[256];
  struct rlimit rl;
 
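        if(getrlimit(RLIMIT_NOFILE, &rl) == 0 &&
           rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur > 8)
                mfpcnt = (int)rl.rlim_cur - 8; /* leave some for other streams */
        else
                mfpcnt = 16; /* arbitrary default */
        mfp = (MFP *)malloc(sizeof(MFP) * mfpcnt);
        if(mfp == NULL)
        {
                perror("malloc");
                return(1);
        }
        memset(mfp, 0, sizeof(MFP) * mfpcnt); /* cnt == 0 marks an empty slot */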
 
        while(fgets(buf, sizeof(buf), stdin)) /* fgets already reserves room for the '\0' */
        {
                mid = atoi(buf);
                snprintf(fullname, sizeof(fullname), "%04i.log", mid);
                maxcnt = 0;
                empty = -1;
                found = -1;
                for(n = 0; n < mfpcnt; n++)
                {
                        if(mfp[n].fp != NULL && mfp[n].mid == mid) /* skip empty slots, whose mid is 0 */
                        {
                                found = n;
                                break;
                        }
                        if(mfp[n].cnt > mfp[maxcnt].cnt)
                        {
                                maxcnt = n;
                        }
                        if(mfp[n].cnt == 0 && empty == -1)
                        {
                                empty = n;
                        }
                }
                if(found != -1)
                {
                        curr = found;
                }
                else
                {
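                        if(empty != -1)
                        {
                                curr = empty;
                                mfp[curr].fp = fopen(fullname, "a");
                        }
                        else
                        {
                                curr = maxcnt;
                                mfp[curr].cnt = 0;
                                mfp[curr].fp = freopen(fullname, "a",
                                                mfp[curr].fp);
                        }
                        if(mfp[curr].fp == NULL)
                        {
                                perror(fullname);
                                return(1);
                        }
                        mfp[curr].mid = mid; /* remember which ID this slot now holds */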
                }
                fputs(buf, mfp[curr].fp);
                mfp[curr].cnt++;
        }
        return(0);
}

I had some counters printed out just to confirm the whole scheme works; they showed that around 10% to 20% of lines reuse an already opened file stream, so no fopen/freopen is needed for those. But measured by time, the new code is no more than 5% faster. Is there any explanation?
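
One rough way to look at the open/close cost in isolation would be a small timing helper like the one below. This is only a sketch under assumptions: time_appends and the sample line are made up for illustration, and clock() reports CPU time, so time spent waiting on the disk does not show up in the result.

```c
#include <stdio.h>
#include <time.h>

/* CPU seconds spent appending `lines` lines to `path`.
 * If reopen is nonzero the file is opened and closed for every line;
 * otherwise it is opened once and kept open until the end.
 * Returns -1.0 on any I/O error.
 * (time_appends is a hypothetical helper, not part of the program above.) */
double time_appends(const char *path, int lines, int reopen)
{
    clock_t start = clock();
    FILE *fp = NULL;
    int i;

    for (i = 0; i < lines; i++) {
        if (fp == NULL && (fp = fopen(path, "a")) == NULL)
            return -1.0;
        if (fputs("0001 sample line\n", fp) == EOF) {
            fclose(fp);
            return -1.0;
        }
        if (reopen) {
            if (fclose(fp) != 0)
                return -1.0;
            fp = NULL; /* force a fresh fopen on the next line */
        }
    }
    if (fp != NULL && fclose(fp) != 0)
        return -1.0;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling it once with reopen set to 1 and once with reopen set to 0 on the same number of lines gives two figures to compare against the 10% to 20% reuse rate seen in the counters.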
 
