find lines in file1.txt not found in file2.txt memory problem


 
# 1  
Old 07-15-2011

I have a diff command that does what I want, but when comparing large text/log files it uses up all the memory I have (sometimes over 8 GB).

Code:
diff file1.txt file2.txt | grep '^<'| awk '{$1="";print $0}' | sed 's/^ *//'

Is there a better, more efficient way to find the lines in one file that aren't found in another file?
Thanks
# 2  
Old 07-15-2011
What does your input look like and what should your output look like?
# 3  
Old 07-15-2011
It should print the lines in file1.txt that are not found in file2.txt. For example:

Code:
File1:

000
555
aaa
ccc
jjj
zzz

Code:
File 2:

000
ccc
111
hhh
vvv

When I run the command, the output should be:

Code:
555
aaa
jjj
zzz

---------- Post updated at 08:10 PM ---------- Previous update was at 04:25 PM ----------

The error that I would get when using this diff command on large files is:

diff: memory exhausted

Any ideas for alternative commands that do the same thing but don't use as much memory?
# 4  
Old 07-16-2011
AFAIK your two options are to use a lot of memory or to use a lot of time. Is file1 8 GB?
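
One way to lean toward the "time" side of that trade-off is to sort both files first and then compare them with comm, which reads the two sorted inputs in step rather than loading them whole. This is only a sketch under assumptions: the function name lines_only_in_first is made up for illustration, mktemp -d is assumed to be available, and the result comes out sorted and deduplicated rather than in file1's original order.

```shell
#!/bin/sh
# Sketch: sort on disk, then stream the sorted files through comm.
# Assumption: output order and duplicate lines do not matter.
# LC_ALL=C makes sort and comm agree on plain byte ordering.
export LC_ALL=C

# lines_only_in_first is a hypothetical helper name, not from the thread.
lines_only_in_first() {
    # $1 = file to report from, $2 = file to compare against
    tmpdir=$(mktemp -d) || return 1
    sort -u "$1" > "$tmpdir/a.sorted"
    sort -u "$2" > "$tmpdir/b.sorted"
    # -2 drops lines unique to the second file, -3 drops lines common to both,
    # leaving only lines unique to the first file.
    comm -23 "$tmpdir/a.sorted" "$tmpdir/b.sorted"
    rm -rf "$tmpdir"
}
```

It would be called as lines_only_in_first file1.txt file2.txt. Sorting a 2 GB file takes a while, but sort implementations such as GNU sort can spill to temporary files instead of holding everything in memory.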
# 5  
Old 07-16-2011
It is only 2 GB, but for some reason diff uses more than 8 GB of memory.
# 6  
Old 07-16-2011
You might get lucky with the following grep approach. It should consume less memory than your diff solution:
Code:
grep -vxFf file2 file1

If file2 is large and contains many duplicates, filtering it through sort|uniq would help.
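
Putting the two suggestions together might look like the sketch below. The function name not_in_second and the use of mktemp are assumptions for illustration; the grep options are the ones from the command above (-F fixed strings, -x whole-line match, -v invert the match, -f read patterns from a file).

```shell
#!/bin/sh
# Sketch: deduplicate file2 before handing it to grep as a pattern list.
export LC_ALL=C

# not_in_second is a hypothetical helper name, not from the thread.
not_in_second() {
    # $1 = file1 (lines to report), $2 = file2 (lines to exclude)
    patterns=$(mktemp) || return 1
    # sort -u has the same effect as sort | uniq here.
    sort -u "$2" > "$patterns"
    # Print every line of file1 that is not an exact, whole-line match
    # for any line in the deduplicated pattern file.
    grep -vxFf "$patterns" "$1"
    status=$?
    rm -f "$patterns"
    return "$status"
}
```

Called as not_in_second file1.txt file2.txt, this keeps file1's original line order and any duplicate lines in file1. Note that grep exits with status 1 when it prints nothing, which here just means every line of file1 was found in file2.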


Regards,
Alister
 