The UNIX and Linux Forums  

  #1 (permalink)  
Old 12-16-2005
Registered User
 

Join Date: Dec 2005
Posts: 25
Removing lines from a file

Hello

I have two files, file1 and file2, as shown below.

file1
110010000000206|567810008161509
110010000000207|567810072227627
110010000000208|567811368851555
110010000000209|567811422513652
110010000000210|567812130217683
110010000000211|567813220211182
110010000000212|567813449322589
110010000000213|567813741319623
110010000000214|567816323171591
110010000000215|567816660521463
110010000000216|567818208711973
110010000000217|567819516604228
110010000000218|567819540685909
110010000000219|567820748714137
110010000000220|567821948536668
110010000000221|567822556413253

file2

110010000000206
110010000000210
110010000000211
110010000000214
110010000000217
110010000000221

Now I want a third file, derived from file1, that omits the entries listed in file2,

i.e.

110010000000207|567810072227627
110010000000208|567811368851555
110010000000209|567811422513652
110010000000212|567813449322589
110010000000213|567813741319623
110010000000215|567816660521463
110010000000216|567818208711973
110010000000218|567819540685909
110010000000219|567820748714137
110010000000220|567821948536668

My problem is that file1 has 10 million entries and file2 has half a million, so the grep -v option is out. Please suggest an easy way to do this.

I seem to get stuck with problems working with big files.

Regards
Pradeep
  #2 (permalink)  
Old 12-16-2005
Registered User
 

Join Date: Dec 2005
Posts: 5
It's going to be difficult to parse that many rows in any shell script.

Are you not able to load the files into a DB via isql, bcp, or sqlplus?
  #3 (permalink)  
Old 12-16-2005
vgersh99's Avatar
Moderator
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 3,002
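nawk -f pra.awk file2 file1

pra.awk:
Code:
BEGIN {
  FS=OFS="|"    # fields are pipe-separated
}
# First file (file2): remember each key, then move on to the next line
NR==FNR { arr[$1]; next }
# Second file (file1): print rows where neither field is a remembered key
!($1 in arr) && !($2 in arr)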
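
For anyone who wants to try the idea on a small sample first, here is a minimal sketch. It assumes a plain awk is available in place of nawk and checks only the first field; the helper name remove_listed and the scratch directory are made up for illustration and are not part of the script above.

```shell
#!/bin/sh
# remove_listed KEYFILE DATAFILE  (hypothetical helper name)
# Prints the rows of DATAFILE whose first "|"-separated field
# does not appear as a line of KEYFILE.
remove_listed() {
  # Pass 1 (NR==FNR, i.e. KEYFILE): store each key as an array index.
  # Pass 2 (DATAFILE): print only rows whose first field was not stored.
  awk -F'|' 'NR==FNR { skip[$1]; next } !($1 in skip)' "$1" "$2"
}

# Small sample taken from the data in this thread.
tmp=$(mktemp -d)
printf '%s\n' '110010000000206|567810008161509' \
              '110010000000207|567810072227627' \
              '110010000000210|567812130217683' > "$tmp/file1"
printf '%s\n' '110010000000206' '110010000000210' > "$tmp/file2"

remove_listed "$tmp/file2" "$tmp/file1"   # prints 110010000000207|567810072227627
rm -rf "$tmp"
```

Because the keys are held in an awk array, each row of the large file costs one lookup rather than a scan over all the patterns. One thing to watch: the NR==FNR test assumes the key file is not empty, otherwise the rows of the data file are treated as keys and nothing is printed.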
  #4 (permalink)  
Old 12-16-2005
Registered User
 

Join Date: Dec 2005
Location: London
Posts: 222
Try this

egrep -v -f tmp2 tmp1
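
A possible variant of the command above, sketched under a few assumptions: the lines of the pattern file are literal keys rather than regular expressions, so the standard -F option (fixed strings) can be used alongside -v and -f. The helper name drop_matching and the scratch directory are hypothetical. Whether any grep copes well with half a million patterns depends on the implementation, so this is something to test rather than rely on.

```shell
#!/bin/sh
# drop_matching PATTERNFILE DATAFILE  (hypothetical helper name)
# -F: treat each pattern line as a literal string, not a regex
# -v: print the lines that do NOT match
# -f: read the patterns from PATTERNFILE
drop_matching() {
  grep -F -v -f "$1" "$2"
}

# Sample files named tmp1/tmp2 as in the command above,
# filled with a subset of the data from this thread.
tmp=$(mktemp -d)
printf '%s\n' '110010000000206|567810008161509' \
              '110010000000207|567810072227627' \
              '110010000000210|567812130217683' > "$tmp/tmp1"
printf '%s\n' '110010000000206' '110010000000210' > "$tmp/tmp2"

drop_matching "$tmp/tmp2" "$tmp/tmp1"   # prints 110010000000207|567810072227627
rm -rf "$tmp"
```

Note that grep matches a pattern anywhere on the line, so a key that happens to occur inside the second field would also drop that row.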
  #5 (permalink)  
Old 12-17-2005
Registered User
 

Join Date: Dec 2005
Posts: 25
Thanks, nawk worked!

That's a load off my chest.

Thanks a lot, man.

Regards