| Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here. |
I'd probably use diff too...
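If the lines in the files are similar to the lines you put in your first post, meaning there are no spaces on the lines, you could:
Code:
    #!/bin/sh
    # Print each word of file1 that grep cannot find anywhere in file2.
    for k in `cat file1`
    do
        # -m 1 stops at the first match (GNU grep); output is discarded
        grep -m 1 "$k" file2 > /dev/null
        if [ $? -eq 1 ]; then echo "$k"; fi
    done
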
Hi.
Quote:
If this is correct, and you have 2 such files, then I think any method that reads a line from file1 and runs a program to search file2 at each step will not finish quickly: it involves 15 million loads of that program, not to mention actually reading the file.

For example, running grep against /dev/null 15,000 times takes about 10 seconds (10.2, actually) of real time. For 1,000 times that, I'd be looking at about 2.8 hours just to load grep from the disk and read an immediate EOF. A grep for a non-existent string takes about 18 seconds for a single search.

I suggest that the files be sorted and diff be run once on the two files (post #8, rikxik). That will be 2 passes across each file, a decrease of close to 100% from 15M passes over 1 file.

If my facts are wrong, then tell me where I missed something important or made a mistake. Otherwise, perhaps we should take a step back: tell us what the higher purpose is, what problem you are really trying to solve, and perhaps we can suggest some other approach ...

cheers, drl

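A rough sketch of the "sort once, compare once" idea, in case it helps. Note that it swaps in comm for diff, because comm prints just the lines unique to one sorted file; the function name missing_lines is made up for illustration, and mktemp is assumed to be available. Unlike the grep loop, this compares whole lines, not substrings.

```shell
#!/bin/sh
# Hypothetical helper: print lines that occur in "$1" but not in "$2".
# Each file is sorted once, then both sorted copies are read once by comm.
missing_lines() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # LC_ALL=C keeps the sort order consistent between sort and comm
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # -23 suppresses lines only in the second file and lines common to both
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Called as `missing_lines file1 file2`, it starts three processes in total instead of one grep per line.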
Hi, rikxik.
I was thinking that the diff window to look for sequences would not be so large. However, if the files were very similar, then the sort could perhaps be skipped. I hope for the best, but expect the worst. It would be interesting to try it both ways, of course ...

cheers, drl
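For the "skip the sort" variant, a minimal sketch might look like the following. It assumes diff's default output format, where lines found only in the first file are prefixed with "< "; the function name only_in_first is made up for illustration. Whether this is faster than sorting first would have to be measured on the real files.

```shell
#!/bin/sh
# Hypothetical helper: list lines that diff reports as present only in "$1".
only_in_first() {
    # diff marks lines found only in the first file with "< ";
    # sed prints just those lines with the marker stripped.
    diff "$1" "$2" | sed -n 's/^< //p'
}
```

If the two files hold the same lines in a very different order, diff will report many of them as changed, which is the case the sort is meant to avoid.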