#15
I am trying the stateful method, but I am not getting any output.
I saved your code as a script file and ran it in the directory where the files reside, but I do not see anything: it returns without any output or error. I am testing on small files to verify.
#16
Hi.
Quote:
If this is correct, and you have 2 such files, then I think any method that reads a line from file1 and uses it with a program to look through file2 at each step will not end quickly, because there will be 15 M loads of that program involved, not to mention actually reading the file.

For example, running a grep that reads /dev/null 15,000 times takes about 10 seconds (10.2 actually) of real time. For 1,000 times that, I'd be looking at about 2.8 hours just to load grep from the disk and read an immediate EOF. A grep for a non-existent string takes about 18 seconds for a single search.

I suggest that the files be sorted and diff be run once on the two files (post #8, rikxik). That will be 2 passes across each file, a decrease of close to 100% from 15 M passes over 1 file.

If my facts are wrong, then tell me where I missed something of importance or made a mistake. Otherwise, perhaps we should take a step back and you tell us what the higher purpose of the problem is -- what problem you are really trying to solve -- perhaps we can suggest some other approach ...

cheers, drl
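The sort-then-diff suggestion above can be sketched roughly as follows. This is only an illustration on tiny made-up data: the temporary directory, the sample lines, and the `.sorted` file names are assumptions, not the original poster's files.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the two large files.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/file1"
printf '%s\n' banana date apple > "$workdir/file2"

# One sort per file, then a single diff, instead of one grep per line.
# LC_ALL=C gives plain byte-order collation, independent of the locale.
LC_ALL=C sort "$workdir/file1" > "$workdir/file1.sorted"
LC_ALL=C sort "$workdir/file2" > "$workdir/file2.sorted"

# diff exits with status 1 when the files differ, so guard it with || true.
diff_out=$(diff "$workdir/file1.sorted" "$workdir/file2.sorted" || true)
printf '%s\n' "$diff_out"

rm -rf "$workdir"
```

Each file is read once by sort and once by diff, which is the "2 passes across each file" mentioned above. Whether diff copes with files of this size is a separate question.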
|
#17
Quote:
#18
Hi, rikxik.
I was thinking that the diff window to look for sequences would not be so large. However, if the files were very similar, then the sort could perhaps be skipped -- I hope for the best, but expect the worst. It would be interesting to try it both ways, of course ...

cheers, drl
#19
I sorted both files and then tried diff as well as grep -v -f file1 file2, with the same problem.
It runs for too long.
#20
Hi.
Perhaps I had more luck -- I didn't have to wait so long for a definitive answer. On 2 different machines, I had 2 large, similar, but different files of about 1 GB each. One machine had 2.5 GB of memory, the other 1 GB. When I used diff, I got the message:
Code:
diff: memory exhausted
Exit status: 2
Code:
comm -3 file1 file2
You may need to glance at man comm to see what it is doing -- it does require sorted input files, and then presents the entries that are unique to each file.

Best wishes ... cheers, drl
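To make the behaviour of comm concrete, here is a minimal sketch on small made-up files. The file contents and the temporary directory are assumptions for illustration only. comm -3 suppresses the lines common to both files, while -23 and -13 each keep just one of the two "unique" columns.

```shell
#!/bin/sh
# Hypothetical sample data; the real files in this thread are about 1 GB each.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/raw1"
printf '%s\n' banana date apple > "$workdir/raw2"

# comm needs sorted input. Sort and compare under the same collation (LC_ALL=C).
LC_ALL=C sort "$workdir/raw1" > "$workdir/file1"
LC_ALL=C sort "$workdir/raw2" > "$workdir/file2"

# -3 hides lines common to both files. Lines unique to file1 appear in
# column 1, lines unique to file2 in column 2 (indented by one tab).
LC_ALL=C comm -3 "$workdir/file1" "$workdir/file2"

# -23 keeps only lines unique to file1; -13 keeps only lines unique to file2.
only_in_1=$(LC_ALL=C comm -23 "$workdir/file1" "$workdir/file2")
only_in_2=$(LC_ALL=C comm -13 "$workdir/file1" "$workdir/file2")
printf 'only in file1: %s\n' "$only_in_1"
printf 'only in file2: %s\n' "$only_in_2"

rm -rf "$workdir"
```

Because comm walks the two sorted files in step, it only needs to hold the current line of each file at a time, which is presumably why it fits where diff ran out of memory.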