Read/Write a fairly large amount of data to a file as fast as possible


 
# 1  
Old 04-23-2009

Hi,

I'm trying to figure out the best solution to the following problem, and I'm not
yet as experienced as you are. :-)

Basically, I have to read a fairly large file, composed of "messages", in order
to display all of them through a user interface (built with Qt).

The messages that I write to the file all arrive at once from a socket, so
in order to write them quickly without losing any, I plan to do the following:

- Create a list of preallocated pages (3-4 by default, but the list grows if needed).
- Write the data that comes from the socket into the current preallocated page.
- Once a page is full, schedule a write with aio_write (AIO - Asynchronous I/O).
- In the completion callback, schedule another write if any other page is full.
And so on.
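
The plan above could be sketched roughly as follows. This is only a minimal sketch under several assumptions, not a finished design: the names (writer_open, writer_put, writer_close), the 4096-byte page size, and the fixed ring of four pages are all made up for illustration. To stay short it waits with aio_suspend instead of using a completion callback, and it blocks when every page is busy instead of growing the list. On older glibc versions it may need to be linked with -lrt.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_BYTES 4096  /* hypothetical page size */
#define NUM_PAGES  4     /* fixed ring; the plan above grows the list instead */

struct page {
    struct aiocb cb;          /* must stay at a fixed address while in flight */
    char data[PAGE_BYTES];
    size_t used;
    int in_flight;
};

struct writer {
    int fd;
    off_t file_off;           /* file offset for the next submitted page */
    int cur;                  /* index of the page currently being filled */
    struct page pages[NUM_PAGES];
};

/* Open the output file and reset the ring. Returns 0 on success. */
int writer_open(struct writer *w, const char *path)
{
    assert(w != NULL && path != NULL);
    memset(w, 0, sizeof *w);
    w->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    return w->fd < 0 ? -1 : 0;
}

/* Block until a page's pending write (if any) has completed. */
static int page_wait(struct page *p)
{
    if (!p->in_flight)
        return 0;
    const struct aiocb *const list[1] = { &p->cb };
    while (aio_error(&p->cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);
    ssize_t n = aio_return(&p->cb);      /* call exactly once per request */
    int ok = (n == (ssize_t)p->used);    /* short writes count as failure here */
    p->in_flight = 0;
    p->used = 0;
    return ok ? 0 : -1;
}

/* Queue the current page with aio_write and move on to the next page. */
static int page_submit(struct writer *w)
{
    struct page *p = &w->pages[w->cur];
    if (p->used == 0)
        return 0;
    memset(&p->cb, 0, sizeof p->cb);
    p->cb.aio_fildes = w->fd;
    p->cb.aio_buf = p->data;
    p->cb.aio_nbytes = p->used;
    p->cb.aio_offset = w->file_off;
    p->cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* we poll, no callback */
    if (aio_write(&p->cb) != 0)
        return -1;
    p->in_flight = 1;
    w->file_off += (off_t)p->used;
    w->cur = (w->cur + 1) % NUM_PAGES;
    return page_wait(&w->pages[w->cur]); /* next page must be free before reuse */
}

/* Copy socket data into pages, submitting each page as it fills up. */
int writer_put(struct writer *w, const void *buf, size_t len)
{
    const char *src = buf;
    while (len > 0) {
        struct page *p = &w->pages[w->cur];
        size_t room = PAGE_BYTES - p->used;
        size_t n = len < room ? len : room;
        memcpy(p->data + p->used, src, n);
        p->used += n;
        src += n;
        len -= n;
        if (p->used == PAGE_BYTES && page_submit(w) != 0)
            return -1;
    }
    return 0;
}

/* Submit the partial last page, wait for everything, close the file. */
int writer_close(struct writer *w)
{
    int rc = page_submit(w);
    for (int i = 0; i < NUM_PAGES; i++)
        if (page_wait(&w->pages[i]) != 0)
            rc = -1;
    if (close(w->fd) != 0)
        rc = -1;
    return rc;
}
```

The socket loop would call writer_put for every chunk it receives and writer_close at the end. The struct writer has to live in static or heap storage and must not be moved while writes are pending, since the kernel or AIO threads hold pointers into it.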

This is the best I could come up with for the writing part, but if any of you have
a better idea, please let me know.

Now the problem comes when I have to read the file, at a later time, and display all
the messages in order to analyze them as fast as possible.

I first thought of mmap'ing the file so that the data is copied only once, from the file
to the kernel cache (if I understood correctly how mmap works internally), and then
accessed directly from the application. But I'm not sure this can be done, or is convenient, as
the file might be pretty big (2,3 gigabytes, although I'm not sure about the magnitude).
Besides, the kernel could evict the pages, and many page faults could occur.
So I discarded this idea.

I also thought about doing the opposite of what I do for writing, but I'm not sure it is a good idea.

The main problem is that I have to decode the messages before displaying them, as they are of different types and variable length. So reading the whole file at once and then decoding the messages into another memory location seems time-consuming to me, as it
requires 3 copies (disk -> kernel -> user space -> user space after decoding).
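
One way to avoid the last of those copies is to leave the raw bytes where they are and only record where each message starts, decoding a message when it is actually displayed. The sketch below assumes a purely hypothetical framing, since the real message format is not described here: every message is a 4-byte length in host byte order followed by that many payload bytes. The names msg_ref and build_index are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Location of one message payload inside the raw buffer. */
struct msg_ref {
    size_t offset;   /* where the payload starts */
    uint32_t len;    /* payload length in bytes */
};

/*
 * Walk the buffer once and fill `out` with up to `cap` references.
 * Assumed (hypothetical) framing: 4-byte length, then the payload.
 * Returns the number of messages indexed; stops early at a truncated tail.
 */
size_t build_index(const unsigned char *data, size_t size,
                   struct msg_ref *out, size_t cap)
{
    assert(data != NULL && out != NULL);
    size_t off = 0, n = 0;
    while (n < cap && size - off >= sizeof(uint32_t)) {
        uint32_t len;
        memcpy(&len, data + off, sizeof len);   /* header may be unaligned */
        off += sizeof len;
        if (len > size - off)
            break;                              /* incomplete last message */
        out[n].offset = off;
        out[n].len = len;
        n++;
        off += len;
    }
    return n;
}
```

The index costs a few bytes per message, and the UI can decode message i straight from data + out[i].offset without a second full-size buffer. The same function works whether the buffer comes from read() or from a mapping.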

Anyway, now it's your turn. :-)
Any help would be appreciated.

Thanks.
# 2  
Old 04-23-2009
First off, have you already proven that conventional I/O (read/write or stdio) is simply not adequate for your files? Buffering is your friend.

You might want to read Stevens' 'Advanced Programming in the UNIX Environment' -
the chapter (Chap 8, I think) with the table on the effect of buffering on I/O.
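
To make the buffering point concrete, here is a minimal sketch of writing many small records through stdio with a large user-supplied buffer, so that most fwrite calls never reach the kernel. The function name write_records_buffered and its parameters are invented for illustration; a suitable buffer size depends on the system and is best found by measuring.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Write `count` copies of one record to `path` through a fully buffered
 * stream whose buffer is `buf_size` bytes. Returns 0 on success, -1 on error.
 */
int write_records_buffered(const char *path, const void *rec, size_t rec_len,
                           size_t count, size_t buf_size)
{
    assert(rec != NULL && rec_len > 0 && buf_size > 0);
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    char *buf = malloc(buf_size);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }
    /* setvbuf must come before any other operation on the stream. */
    if (setvbuf(f, buf, _IOFBF, buf_size) != 0) {
        fclose(f);
        free(buf);
        return -1;
    }
    int rc = 0;
    for (size_t i = 0; i < count; i++) {
        if (fwrite(rec, rec_len, 1, f) != 1) {
            rc = -1;
            break;
        }
    }
    if (fclose(f) != 0)     /* fclose flushes whatever is still buffered */
        rc = -1;
    free(buf);              /* only safe once the stream is closed */
    return rc;
}
```

Timing this with a few different buf_size values against the real message stream is a cheap way to find out whether conventional I/O is already fast enough.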

Rochkind's 'Advanced Unix Programming' has some examples of high-performance read/write routines using conventional syscalls, including mmap().

You should consider that trading a programmatically simpler approach for a more complex one is not automatically a win. What you gain in speed may not be worth the extra programming and maintenance time. Is, say, 100 extra hours of your time worth a 10% gain in performance? Your manager might say 'No'.
# 3  
Old 04-24-2009
What kind of socket do you have that your hard drive cannot keep up with? Normal read/write calls are not slow. Seeking is slow; if you're going to be seeking randomly all over the place, then mmap-ing the file might be better. But keep in mind that, on 32-bit machines at least, you're limited in how big an area you can map: a gig is a big chunk of a process's 4-gig address space. The limit on 64-bit machines is much, much higher.
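
If address space is the concern, one option is to map only a window of the file at a time instead of the whole thing. A minimal sketch, with invented names (struct window, window_map, window_unmap): mmap requires the file offset to be a multiple of the page size, so the sketch rounds the offset down and remembers the slack. It assumes the caller keeps offset + len within the file, because touching mapped pages that lie wholly past the end of the file raises SIGBUS.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct window {
    void *map_base;             /* page-aligned address returned by mmap */
    size_t map_len;             /* length actually mapped */
    const unsigned char *data;  /* points at the requested file offset */
    size_t len;                 /* requested length */
};

/* Map `len` bytes of `fd` starting at `offset`, read-only. 0 on success. */
int window_map(int fd, off_t offset, size_t len, struct window *w)
{
    assert(fd >= 0 && offset >= 0 && len > 0 && w != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    off_t aligned = offset - (offset % (off_t)page);  /* round down to a page */
    size_t slack = (size_t)(offset - aligned);
    void *p = mmap(NULL, len + slack, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (p == MAP_FAILED)
        return -1;
    w->map_base = p;
    w->map_len = len + slack;
    w->data = (const unsigned char *)p + slack;
    w->len = len;
    return 0;
}

/* Release a window obtained from window_map. */
void window_unmap(struct window *w)
{
    munmap(w->map_base, w->map_len);
}
```

A viewer could keep one window around the messages currently on screen and remap when the user scrolls far enough, which bounds the address space used no matter how large the file is.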

What you might also find useful is cache hinting: being able to tell the kernel 'OK, I am done with this area of the file for the foreseeable future' in order to let it purge data from cache earlier than it might otherwise have, or do read-ahead differently, etc. It gives you some of the advantages of raw I/O without the problems. See fadvise (posix_fadvise) and madvise.

Last edited by Corona688; 04-24-2009 at 06:11 PM..
# 4  
Old 04-24-2009
Corona said it much better.... must be a terabit line....