Hello everyone, here's the scenario:

I have two files, each with around 1,300,000 lines, and each line holds a single column (a phone number). I have to get the phone numbers that are in file1 but not in file2. I could get them through Oracle, but my boss does not want that: he said the query would take hours to finish and would tie up server resources, or something like that, so he gave me the files with the phone numbers instead.

First I tried to solve the problem with some Perl scripting, but it took about 10 minutes just to read the files, and because of my poor programming skills I tried to do the search with a double foreach, something like this:

    chomp(@file1 = <SOME1>);
    chomp(@file2 = <SOME2>);
    $n = 0;
    $flag = 1;    # if $flag is 0 then the element is in file2
    foreach $row1 (@file1) {
        foreach $row2 (@file2) {
            # compare as strings so leading zeros are preserved
            if ($row1 eq $row2) { $flag = 0; }
        }
        if ($flag) {
            $anArray[$n] = $row1;
            $n++;
        }
        $flag = 1;
    }
    if ($n > 0) {
        foreach $row3 (@anArray) {
            print OUT_FILE "$row3\n";
        }
    }

The data in the files looks like this:

FILE1
----------------------------
1234567890
0987654321
2345678901
9012345678

FILE2
----------------------------
1234567890
0987654321
2345678901

OUT_FILE must be
----------------------------
9012345678

But this solution will take ages to finish, since it compares every line of file1 against every line of file2. So now I am thinking of using awk or another language, but I really don't know which one is better for this problem or what algorithm I should use (besides, I have never used awk or shell scripting; I'm new to UNIX). I was thinking of sorting the files and then doing a binary search, but I have some doubts about it, so I feel really lost now.

Thanks for your help
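The "in file1 but not in file2" requirement is a set difference, and it can be computed in a single pass over each file instead of a nested loop. The following is only a minimal sketch of two common ways to do that, not a tuned solution. It assumes one phone number per line with no stray whitespace, and the temporary directory and the sample data (copied from the FILE1/FILE2 example in the post) are placeholders for the real files.

```shell
#!/bin/sh
# Hypothetical sample data matching the FILE1/FILE2 example in the post.
workdir=$(mktemp -d)
printf '%s\n' 1234567890 0987654321 2345678901 9012345678 > "$workdir/file1"
printf '%s\n' 1234567890 0987654321 2345678901 > "$workdir/file2"

# Approach 1: awk with an in-memory lookup table.
# NR == FNR is true only while awk reads the first file named (file2),
# so every file2 line becomes a key in the array "seen".
# For the second file (file1), print only lines that are not keys in "seen".
# Assumes file2 is not empty; otherwise NR == FNR would also hold for file1.
result=$(awk 'NR == FNR { seen[$0]; next } !($0 in seen)' \
    "$workdir/file2" "$workdir/file1")

# Approach 2: sort both files, then let comm do the merge-style comparison.
# comm -23 suppresses column 2 (lines only in the second file) and
# column 3 (lines in both), leaving lines found only in the first file.
# LC_ALL=C keeps sort and comm in agreement about the collation order.
LC_ALL=C sort "$workdir/file1" > "$workdir/file1.sorted"
LC_ALL=C sort "$workdir/file2" > "$workdir/file2.sorted"
result_sorted=$(LC_ALL=C comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")

printf 'awk:  %s\n' "$result"
printf 'comm: %s\n' "$result_sorted"

rm -rf "$workdir"
```

For the sample data both approaches print 9012345678. The awk version keeps the original order of file1 but holds all of file2 in memory; the sort and comm version is close to the "sort first, then search" idea from the post and returns the result in sorted order.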