Remove lines based on contents of another file


 
# 1  
Old 03-24-2009

So, this issue is driving me nuts! I was hoping to get a helping hand here...

I have 2 files:

file1.txt contains:
this is example1
this is example2
this is example3
this is example4
this is example5

file2.txt contains:
example3
example5

Basically, I need a script or command that generates a new file containing only the lines of file1.txt that DON'T contain any of the strings in file2.txt. Ideally, the resulting file would look like this:

fileX.txt:
this is example1
this is example2
this is example4

I'd really appreciate your help; I can't seem to figure this one out.

Thanks a lot!

Dave
# 2  
Old 03-24-2009
You can try something like this:
Code:
grep -v -f file2.txt file1.txt
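
Since the entries in file2.txt look like plain strings rather than regular expressions, a variant worth trying is to add the standard -F option so grep treats them as fixed strings. This is only a sketch on the sample data from post #1: the temporary directory and the printf lines are there just to make it runnable on its own, and whether it is noticeably faster on large files depends on your grep implementation.

```shell
# Work in a scratch directory so no existing files are overwritten (assumption for the demo).
workdir=$(mktemp -d) && cd "$workdir"

# Recreate the sample files from post #1.
printf '%s\n' 'this is example1' 'this is example2' 'this is example3' \
              'this is example4' 'this is example5' > file1.txt
printf '%s\n' 'example3' 'example5' > file2.txt

# -F: treat each line of file2.txt as a fixed string, not a regular expression
# -v: keep only the lines of file1.txt that contain none of those strings
grep -v -F -f file2.txt file1.txt > fileX.txt

cat fileX.txt
```

On the sample data this should leave exactly the three lines shown for fileX.txt in post #1. Note that the match is still a substring match, so an entry such as example1 would also remove a line containing example10.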

# 3  
Old 03-24-2009
I thought about that, but it takes forever... the files that I'm working with are quite large.

Is there a better way to accomplish this?

Thanks again!

Dave
# 4  
Old 03-24-2009
Code:
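#!/usr/bin/perl
use strict;
use warnings;

# Read the lines to filter (file1.txt) and the strings to exclude (file2.txt).
open my $fh1, '<', '/path/to/file1.txt' or die "can't open file1.txt: $!\n";
chomp(my @array_one = <$fh1>);
close $fh1;

open my $fh2, '<', '/path/to/file2.txt' or die "can't open file2.txt: $!\n";
chomp(my @array_two = <$fh2>);
close $fh2;

# Keep each line of file1.txt that contains none of the file2.txt strings
# (case-insensitive substring match; every line is tested against every string).
my @C = grep { my $x = $_; not grep { $x =~ /\Q$_\E/i } @array_two } @array_one;

print "$_\n" for @C;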


Last edited by s_becker; 03-24-2009 at 10:40 PM..
# 5  
Old 03-25-2009
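You can use the following, which assumes the string to check is always the third field of each line in file1.txt:

nawk 'FILENAME == "file2.txt" { a[$1] = 1; next }   # remember each string listed in file2.txt
!($3 in a)                                          # print file1.txt lines whose 3rd field is not listed
' file2.txt file1.txt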
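
If the string is not always in the third field, one possible generalization is to test every whitespace-separated word of each line against the lookup table. This is a rough sketch, not a drop-in replacement for the grep approach: it matches whole words only (not substrings), it uses plain awk where Solaris users would substitute nawk, and the scratch directory and printf lines exist only to make the example self-contained.

```shell
# Work in a scratch directory so no existing files are overwritten (assumption for the demo).
workdir=$(mktemp -d) && cd "$workdir"

# Recreate the sample files from post #1.
printf '%s\n' 'this is example1' 'this is example2' 'this is example3' \
              'this is example4' 'this is example5' > file1.txt
printf '%s\n' 'example3' 'example5' > file2.txt

awk '
    # First file on the command line (file2.txt): remember each listed string.
    FILENAME == ARGV[1] { skip[$1] = 1; next }

    # Second file (file1.txt): drop the line if any of its words is listed.
    {
        for (i = 1; i <= NF; i++)
            if ($i in skip) next
        print
    }
' file2.txt file1.txt > fileX.txt

cat fileX.txt
```

Each line of file1.txt costs one hash lookup per word, so the running time should grow with the size of file1.txt rather than with the product of the two file sizes.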