The UNIX and Linux Forums  

  #1  
Old 02-05-2008
Registered User
 

Join Date: Jul 2006
Posts: 183
deleting multiple records from a huge file at one time

I have a very large file, about 5 GB, containing roughly 50 million records. I have to delete records by record number, which I know from an outside source, without opening the file. The record numbers are scattered, for example 5000678, 7890005, etc.


Can somebody tell me how I can remove the records by record number all at once, rather than one at a time, please?
  #2  
Old 02-05-2008
Moderator
 

Join Date: Dec 2003
Location: /dev/fl
Posts: 1,061
What do you mean by "without opening the file"? You cannot delete records without opening the file.

Is this file a structured file or a database file? Are the records in the file sorted or random? You seem to indicate that the records are random but I am unclear if you are referring to the file or the list of records to be deleted.

You need to provide more precise information if you want somebody to help you.
  #3  
Old 02-05-2008
Registered User
 

Join Date: Jul 2006
Posts: 183
The reason I said I want to delete without opening the file is that it is too large to open. The file is a regular ASCII file with data in it. I just need to delete the records in one pass, rather than running one deletion per record.
  #4  
Old 02-05-2008
drl
Registered User
 

Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 556
Hi.

See post #6 in http://www.unix.com/shell-programmin...#post302154933 -- I think you should be able to adapt that procedure for creating a sed script that will delete specific lines in a single pass over the file. It was used to print ("p") lines, but a delete ("d") is a similar operation. You would also need to omit the "-n" option on the final execution of sed.
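
A minimal sketch of that idea follows. It is not the procedure from the linked post (which is not reproduced here). It assumes the record numbers are stored one per line in a list file, turns each number N into the sed command "Nd", and runs the generated script once over the data. The function name delete_lines_sed and the file arguments are hypothetical.

```shell
#!/bin/sh
# delete_lines_sed LIST INPUT OUTPUT
# LIST is assumed to hold one line number per line (hypothetical layout).
delete_lines_sed() {
    dls_list=$1
    dls_input=$2
    dls_output=$3
    dls_script=$(mktemp) || return 1

    # Turn each purely numeric line N into the sed command "Nd".
    # Blank or malformed lines are skipped, so a stray empty line
    # cannot become a bare "d" that would delete everything.
    sed -n 's/^\([0-9][0-9]*\)$/\1d/p' "$dls_list" > "$dls_script"

    # No -n on this run: sed prints every line that is not deleted.
    sed -f "$dls_script" "$dls_input" > "$dls_output"
    dls_status=$?

    rm -f "$dls_script"
    return "$dls_status"
}
```

Usage would look like: delete_lines_sed numbers.txt bigfile.txt bigfile.new. With a long list, sed has to consider every generated command for every input line, so the run time is likely to grow with the length of the list.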

It still will not be cheap -- every program that processes a file will "open" the file in the sense that it tells the system that it will be dealing with the content of that file. The program will need to read every line in order to create the new copy minus the lines you delete. Afterwards, you could rename the new file to the old name.
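
The "new copy, then rename" step can be sketched as below. This is an illustration under assumptions, not drl's procedure: it uses awk instead of sed, loads the record numbers (assumed one per line) into an array, writes the surviving lines to a new file next to the original, and renames it only if awk succeeded. The function name delete_lines_awk and the ".new.PID" suffix are hypothetical.

```shell
#!/bin/sh
# delete_lines_awk LIST INPUT
# Rewrites INPUT without the lines whose numbers appear in LIST.
delete_lines_awk() {
    dla_list=$1
    dla_input=$2
    dla_new="$dla_input.new.$$"   # new copy, created beside the original

    awk -v list="$dla_list" '
        # Load the line numbers to delete before reading any data.
        BEGIN { while ((getline n < list) > 0) del[n + 0] }
        # Print every line whose number is not in the delete set.
        !(NR in del)
    ' "$dla_input" > "$dla_new" || { rm -f "$dla_new"; return 1; }

    # Replace the original only after the new copy is complete.
    mv "$dla_new" "$dla_input"
}
```

The whole file is still read once and written once, so the disk needs room for a second copy until the rename happens.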

I suggest you try the procedure on small sample files first ... cheers, drl
  #5  
Old 02-06-2008
Registered User
 

Join Date: Jan 2008
Posts: 13
Dsarvan,
Split and process. We process files of a few GB this way with AWK.
1. Split the file based on line count (an approximate number is fine).
2. Process the pieces in parallel (if your server has decent RAM and CPUs).
Remember not to reuse the same name for any temporary files; one way is to add a random number or the process ID to the name.
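
The two steps above can be sketched as follows. This is only one possible reading of the suggestion: the function name parallel_delete, its arguments, and the chunk naming are hypothetical. Because the record numbers refer to the original file, each piece needs an offset added to its own line numbers. mktemp -d supplies the unique temporary names.

```shell
#!/bin/sh
# parallel_delete LIST INPUT OUTPUT LINES_PER_CHUNK
# LIST is assumed to hold one record number per line.
parallel_delete() {
    pd_list=$1
    pd_input=$2
    pd_output=$3
    pd_lines=$4
    pd_dir=$(mktemp -d) || return 1   # unique scratch directory

    # Step 1: split by line count (pieces are named chunk.aa, chunk.ab, ...).
    split -l "$pd_lines" "$pd_input" "$pd_dir/chunk."

    # Step 2: filter every piece in the background.
    pd_offset=0
    for pd_chunk in "$pd_dir"/chunk.*; do
        awk -v list="$pd_list" -v offset="$pd_offset" '
            BEGIN { while ((getline n < list) > 0) del[n + 0] }
            # NR restarts at 1 in each piece, so add the offset back.
            !((NR + offset) in del)
        ' "$pd_chunk" > "$pd_chunk.out" &
        pd_offset=$((pd_offset + pd_lines))
    done
    wait   # block until all background jobs have finished

    # Reassemble the filtered pieces in their original order.
    cat "$pd_dir"/chunk.*.out > "$pd_output"
    rm -rf "$pd_dir"
}
```

This starts one awk process per piece, so the chunk size also sets the degree of parallelism. Splitting is itself a full read and write of the file, so for a plain delete it may not beat a single pass; that depends on whether the disk or the CPU is the bottleneck.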
  #6  
Old 02-06-2008
Registered User
 

Join Date: Jul 2006
Posts: 183
Thank you very much drl. The post you gave me helped me.
All times are GMT -7.