The UNIX and Linux Forums  

  #1  
Old 05-21-2008
Read Only
 

Join Date: Jun 2006
Posts: 105
File - reading - Performance improvement

Hi All,
I am reading a huge file, at least 2 GB in size. I read each line, cut certain columns, and write them to another file.

Here is the logic.

Code:
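#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    // Input and output paths come from the environment. getenv() returns
    // NULL when a variable is unset, so check before building a string.
    const char *in_path = getenv("u_file");
    const char *out_path = getenv("DATA_DIR");
    if (in_path == NULL || out_path == NULL) {
        cout << "u_file and DATA_DIR must be set" << endl;
        exit(2);
    }
    string u_file = in_path;
    string temp_form_u_file = out_path;

    ofstream temp_u_file;
    temp_u_file.open(temp_form_u_file.c_str(), ios::app);
    if (temp_u_file.fail()) {
        cout << "Unable to open file " << temp_form_u_file << " for writing" << endl;
        exit(1);
    }

    ifstream U_File;
    U_File.open(u_file.c_str());
    if (U_File.fail()) {
        cout << "File " << u_file << " unable to open for reading\n";
        cout << "dart_report job failed\n";
        exit(3);
    }

    string u_line;
    string Char_List;

    // Loop on getline() itself rather than on eof(), which is only set
    // after a read has already failed.
    while (getline(U_File, u_line)) {
        // Fields are 41 characters wide and start at offset 72.
        string::size_type line_pos = 72;
        while (line_pos < u_line.length()) {
            // Skip fields that begin with two blanks.
            if (u_line.substr(line_pos, 2) != "  ") {
                Char_List = u_line.substr(line_pos, 41);
                Char_List.append(u_line.substr(16, 4));  // add the 4 characters at offset 16
                Char_List.append("\n");
                temp_u_file << Char_List;
            }
            line_pos = line_pos + 41;
        }
    }
    return 0;
}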
When I run this program, it takes 2.5 to 3 hours to read the 2 GB file. I am trying to reduce the time taken for reading. Is there any way I can reduce the processing time of the program?

Kindly let me know. If I can use a shell script, that is also okay, but I feel C will be faster than shell scripting.

Please give me your suggestions.

Regards
Dhana
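
The column-cutting step in the loop above can be isolated into a small function. The following is only a minimal sketch, assuming the layout that the posted code implies (41-character fields starting at offset 72, plus the 4 characters at offset 16). The name extract_fields is hypothetical. It appends straight into the output string with append(str, pos, len) instead of building temporaries with substr, which may save some allocation and copying on a 2 GB input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends one output line to `out` for every
// non-blank 41-character field of `line`.
// The layout constants are taken from the posted code.
void extract_fields(const std::string &line, std::string &out)
{
    const std::size_t first_field = 72;  // offset of the first field
    const std::size_t field_width = 41;  // width of each field
    const std::size_t key_pos = 16;      // offset of the 4-character suffix
    const std::size_t key_len = 4;

    for (std::size_t pos = first_field; pos < line.size(); pos += field_width) {
        // Skip fields that start with two blanks, as the original loop does.
        if (line.compare(pos, 2, "  ") == 0)
            continue;
        out.append(line, pos, field_width);  // length is clamped at end of line
        out.append(line, key_pos, key_len);
        out.push_back('\n');
    }
}
```

A caller could collect the output of many lines in one string and write it to the output stream in a single call, rather than once per field.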

Last edited by Yogesh Sawant; 05-21-2008 at 10:35 PM. Reason: added code tags
  #2  
Old 05-21-2008
Registered User
 

Join Date: Feb 2008
Posts: 18
I believe a shell script should be faster. With C/C++, there is a lot of copying of data to and from the kernel, which makes C/C++ programs slow. To make C/C++ programs faster, you may also use multithreading.

- Dheeraj
  #3  
Old 05-21-2008
Registered User
 

Join Date: Mar 2008
Location: Delhi
Posts: 7
Hi,
I would suggest using fread, that is, reading the data in bulk (say, a few thousand bytes at a time) and then manipulating it. You will surely get a performance improvement.
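
As a rough illustration of the bulk-reading idea, here is a minimal sketch that reads a stream in 64 KB chunks with fread and scans each chunk in memory. The function name count_lines_bulk and the chunk size are assumptions chosen for the example. It only counts newlines, which is the simplest possible "manipulation"; it does not yet reassemble lines that cross a chunk boundary.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Reads `fp` in 64 KB chunks with fread and counts newline characters.
std::size_t count_lines_bulk(std::FILE *fp)
{
    std::vector<char> chunk(64 * 1024);  // chunk size is an arbitrary choice
    std::size_t lines = 0;
    std::size_t got;

    while ((got = std::fread(&chunk[0], 1, chunk.size(), fp)) > 0) {
        const char *p = &chunk[0];
        const char *end = p + got;
        // memchr scans the chunk for each '\n' without copying any data.
        while ((p = static_cast<const char *>(std::memchr(p, '\n', end - p))) != NULL) {
            ++lines;
            ++p;  // continue the search just past the newline
        }
    }
    return lines;
}
```

Each fread call here hands back up to 64 KB at once, so the per-call overhead is paid far less often than once per line.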
  #4  
Old 05-22-2008
Read Only
 

Join Date: Jun 2006
Posts: 105
C/C++ is definitely faster than a shell script.
Can you explain how fread is faster, given that I am going to read line by line only?


Regards
Kuttalaraj
  #5  
Old 05-22-2008
Registered User
 

Join Date: May 2008
Posts: 8
I think read and write are the lowest-level system calls. All the other functions, like fread and fwrite, in turn use some low-level function to do their work.
I think using read to read a chunk of data can improve performance, since there is not much overhead involved.

Regards,
Aamir
  #6  
Old 05-22-2008
Read Only
 

Join Date: Jun 2006
Posts: 105
Hi,
read(fd, buffer, n_to_read)
I am trying to use the above call, but I will not be able to read an entire line, as I will not know the length of the line beforehand.

This part is a little tricky to handle.
If you have any ideas, please let me know.

Regards
Dhana
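
One common way around the unknown line length is to read fixed-size chunks and carry the unfinished tail of each chunk over to the next read. The sketch below is a minimal illustration of that idea, not a tuned implementation. The name read_lines and the 4096-byte chunk size are assumptions, and it relies on the POSIX read call from unistd.h.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#include <unistd.h>  // read(), POSIX

// Hypothetical helper: reads `fd` in fixed-size chunks and stores each
// complete line (without its '\n') in `lines`.
// Returns false if read() reported an error.
bool read_lines(int fd, std::vector<std::string> &lines)
{
    char chunk[4096];     // chunk size is an arbitrary choice
    std::string pending;  // partial line carried over between reads
    ssize_t got;

    while ((got = read(fd, chunk, sizeof chunk)) > 0) {
        pending.append(chunk, static_cast<std::size_t>(got));
        std::size_t start = 0;
        std::size_t nl;
        // Emit every complete line now held in `pending`.
        while ((nl = pending.find('\n', start)) != std::string::npos) {
            lines.push_back(pending.substr(start, nl - start));
            start = nl + 1;
        }
        pending.erase(0, start);  // keep only the unfinished tail
    }
    if (got == 0 && !pending.empty())
        lines.push_back(pending);  // last line had no trailing newline
    return got == 0;
}
```

For a 2 GB file you would not want to keep every line in memory; in practice the push_back calls would be replaced by whatever per-line processing is needed, so that only the current chunk and the carried-over tail are held at any time.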
  #7  
Old 05-22-2008
Registered User
 

Join Date: May 2008
Posts: 8
Hello!
What you can try is this: have a circular buffer of, say, 6144 bytes (6 KB); you can experiment with the size.
By circular I mean that you fill the two halves of the buffer alternately and keep two pointers, start_ptr and processed_ptr.

Code:
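#include <unistd.h>

#define CHUNK_SIZE      3072               // 3 KB per read()
#define MAX_BUFFER_SIZE (2 * CHUNK_SIZE)   // 6 KB in total

// Returns the size of the record starting at pos, or -1 if that record is
// not complete yet. It must wrap its index at MAX_BUFFER_SIZE.
int check_for_complete_record(int pos);

char buffer[MAX_BUFFER_SIZE];
int offset = 0;         // half of the buffer that the next read() fills
int start_ptr = 0;      // start of the chunk that was read most recently
int processed_ptr = 0;  // first byte that has not been processed yet
int bytes_pending = 0;  // bytes read but not yet processed

// Reads one chunk and processes every complete record in the buffer.
// Assumes read() fills the whole chunk except at end of file.
int read_and_process(int fd, int minimum_size_of_record)
{
    int ret_val;
    int n = read(fd, &buffer[offset], CHUNK_SIZE);
    if (n <= 0)
        return n;  // end of file or error

    start_ptr = offset;
    // next time read into the other half of the buffer
    offset = (offset == 0) ? CHUNK_SIZE : 0;

    // count what is waiting, including any partial record left over
    bytes_pending = bytes_pending + n;

    // start processing it
    while (bytes_pending >= minimum_size_of_record)
    {
        ret_val = check_for_complete_record(processed_ptr);
        // incomplete record
        if (ret_val == -1)
        {
            // don't modify processed_ptr since the record is not complete
            break;
        }
        // in this case check_for_complete_record returned the size of the record
        bytes_pending = bytes_pending - ret_val;
        processed_ptr = processed_ptr + ret_val;  // rollover is handled separately, see point 2 below
    }
    return n;
}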
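1) Keep start_ptr and processed_ptr global, so that they keep their values between reads
2) You must take care of the rollover of processed_ptr after every read

Code:
     if(processed_ptr >= MAX_BUFFER_SIZE) // in this case 6KB
               processed_ptr = processed_ptr - MAX_BUFFER_SIZE; // wrap around without losing the position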
Regards,
Aamir
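
The post leaves check_for_complete_record undefined. For newline-terminated records, a stand-in could look like the sketch below. This is an assumption for illustration only: the name find_complete_record is hypothetical, and unlike the one-argument call in the post it takes the buffer, its size, and the number of unprocessed bytes as explicit parameters instead of relying on globals. The modulo on the index is what lets a record cross the end of the circular buffer.

```cpp
#include <cstddef>

// Hypothetical stand-in for check_for_complete_record, for records that
// end in '\n'. Scans at most `avail` bytes of a circular buffer of `size`
// bytes, starting at index `pos` and wrapping at the end of the buffer.
// Returns the record length including its '\n', or -1 if no newline is
// found among the available bytes.
long find_complete_record(const char *buffer, std::size_t size,
                          std::size_t pos, std::size_t avail)
{
    for (std::size_t i = 0; i < avail; ++i) {
        if (buffer[(pos + i) % size] == '\n')
            return static_cast<long>(i + 1);
    }
    return -1;  // record is incomplete; wait for the next read
}
```

Passing the count of unprocessed bytes keeps the scan from running into stale data left over from an earlier pass around the buffer.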