Searching for terms contained in 2 separate files


 
# 1  
Old 02-15-2012

Hi All !

As a dummy, I am trying to search a file for strings of characters taken from 2 separate lists.
Not clear? No, it's not.

Example:
I have a tab-delimited file called "inventory" with 2 columns that looks like this:
Code:
#ref  #year,color
xrt3   2000,blue
gf5    2000,red,green,yellow
tyr6   2001,purple,pink
gh3   2003,brown
euo4  2005,black,grey

A file "list1.tab" contains:
Code:
#years
2000
2003
2005

Another file "list2.tab" contains:
Code:
#colours
red
black

Using egrep, I would like to extract the lines from the file "inventory" that contain both one of the terms from "list1.tab" AND one of the terms from "list2.tab".

So I should obtain only:
Code:
gf5  2000,red,green,yellow
euo4  2005,black,grey

This is just an example.
In reality, my file "inventory" contains about 100,000 lines, and list1.tab and list2.tab contain about 100 terms each.

Is it possible to use egrep to write something with this meaning:
egrep '(<one of the terms from list1>)*(<one of the terms from list2>)'

I don't really care about the column "ref".

I hope I have been clear enough !
I can definitely give more explanation if needed.

Thanks in advance guys !!!
# 2  
Old 02-15-2012
Not very efficient, but it works.

Code:
#! /bin/bash

# Runs one grep per year/colour pair.
while read -r year
do
    [[ $year == "#"* ]] && continue    # skip the header line
    while read -r color
    do
        [[ $color == "#"* ]] && continue
        grep "$year.*$color" inventory
    done < list2.tab
done < list1.tab
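
The nested loops above start one grep per year/colour pair, which for two lists of about 100 terms means roughly 10,000 passes over the inventory. A single-pass variant, closer to the egrep pattern asked for in the first post, might look like the sketch below. It assumes the terms contain no regex metacharacters and that the year always comes before the colour on a line; the variable names `years`, `colours` and `result` are only illustrative. It recreates the sample files in a scratch directory so that it can be run on its own. Note that, without `-w`, a term such as "red" would also match inside a longer word.

```shell
#!/bin/sh
# Sample data from the first post, recreated in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\t%s\n' '#ref' '#year,color' xrt3 2000,blue gf5 2000,red,green,yellow \
    tyr6 2001,purple,pink gh3 2003,brown euo4 2005,black,grey > inventory
printf '%s\n' '#years' 2000 2003 2005 > list1.tab
printf '%s\n' '#colours' red black > list2.tab

# Join the terms of each list with '|', skipping the '#' header line.
years=$(grep -v '^#' list1.tab | paste -s -d '|' -)
colours=$(grep -v '^#' list2.tab | paste -s -d '|' -)

# One pass over inventory: a year, then anything, then a colour.
result=$(grep -E "($years).*($colours)" inventory)
printf '%s\n' "$result"
```

On the sample data this prints the gf5 and euo4 lines only.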

This User Gave Thanks to balajesuri For This Post:
# 3  
Old 02-15-2012
Try:
Code:
grep -wf list1.tab inventory | grep -wf list2.tab
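
How the pipeline gives the AND: `-f` reads one pattern per line from the list file, `-w` accepts only whole-word matches, and the second grep only sees lines that already passed the first. The sketch below runs it on the sample data from the first post plus one hypothetical extra row (`zz9`, not in the original inventory) to show what `-w` is for. The variable name `result` is only illustrative.

```shell
#!/bin/sh
# Sample data from the first post, recreated in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\t%s\n' '#ref' '#year,color' xrt3 2000,blue gf5 2000,red,green,yellow \
    tyr6 2001,purple,pink gh3 2003,brown euo4 2005,black,grey > inventory
# Hypothetical extra row: "darkred" contains "red" but is a different word.
printf '%s\t%s\n' zz9 2000,darkred >> inventory
printf '%s\n' '#years' 2000 2003 2005 > list1.tab
printf '%s\n' '#colours' red black > list2.tab

# First grep keeps lines with a listed year, second keeps those with a listed colour.
result=$(grep -wf list1.tab inventory | grep -wf list2.tab)
printf '%s\n' "$result"
```

The `zz9` row passes the year filter but is rejected by the colour filter, because `-w` does not let "red" match inside "darkred". The `#years` and `#colours` header lines are also read as patterns, but they match nothing in this inventory.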

This User Gave Thanks to Scrutinizer For This Post:
# 4  
Old 02-15-2012
Thanks a lot guys for your help !

Thanks Scrutinizer, it works perfectly !!!

Thanks balajesuri, it works but the second method is definitely easier !

---------- Post updated at 08:09 PM ---------- Previous update was at 07:16 PM ----------

I am wondering...

Once the lines containing the matching terms from list1.tab and list2.tab have been extracted, would it be possible to print the corresponding ref AND only the matching terms (separated by a pipe, a comma, a space, or whatever) instead of the whole line?
Or, if you prefer, is there a way to discard everything except the matching terms from the 2 lists and the corresponding ref?

Like this:
Code:
gf5|2000|red
euo4|2005|black


I tried the -o option in every possible way, but it doesn't work.
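
A likely reason `-o` does not help in the two-stage pipeline: with `-o` the first grep prints only the matched year, so the ref and the colours are already gone before the second grep runs. A minimal sketch on the sample data, assuming a grep that supports `-o` (GNU and BSD grep do); the variable names `only_years` and `after_pipe` are illustrative.

```shell
#!/bin/sh
# Sample data from the first post, recreated in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\t%s\n' '#ref' '#year,color' xrt3 2000,blue gf5 2000,red,green,yellow \
    tyr6 2001,purple,pink gh3 2003,brown euo4 2005,black,grey > inventory
printf '%s\n' '#years' 2000 2003 2005 > list1.tab
printf '%s\n' '#colours' red black > list2.tab

# -o keeps only the matched text, so each output line is just a year.
only_years=$(grep -owf list1.tab inventory)
printf '%s\n' "$only_years"

# Nothing is left for the colour filter to match.
after_pipe=$(grep -owf list1.tab inventory | grep -wf list2.tab || true)
printf 'lines after second grep: [%s]\n' "$after_pipe"
```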
# 5  
Old 02-16-2012
I don't think you can do it that way. You'd probably need to use Unix's Swiss Army knife, awk:
Code:
awk ' f==1{A[$1]}     # remember each year from list1.tab
      f==2{B[$1]}     # remember each colour from list2.tab
      f==3{if($2 in A) for(i=3;i<=NF;i++) if($i in B)print $1,$2,$i}
    ' FS='[ \t,]+' OFS=\| f=1 list1.tab f=2 list2.tab f=3 inventory
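
To check the awk approach against the requested output, here is a self-contained run on the sample data from the first post. It is a sketch, not a tuned solution: it uses `[ \t,]+` (one or more separators) as the field separator, which is an assumption about the intent, and the variable name `result` is illustrative. The `f=1`, `f=2`, `f=3` operands are ordinary awk variable assignments that take effect just before the file that follows them is read, which is how the script knows which file it is in.

```shell
#!/bin/sh
# Sample data from the first post, recreated in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\t%s\n' '#ref' '#year,color' xrt3 2000,blue gf5 2000,red,green,yellow \
    tyr6 2001,purple,pink gh3 2003,brown euo4 2005,black,grey > inventory
printf '%s\n' '#years' 2000 2003 2005 > list1.tab
printf '%s\n' '#colours' red black > list2.tab

result=$(awk '
    f==1 { A[$1] }                      # years from list1.tab
    f==2 { B[$1] }                      # colours from list2.tab
    f==3 { if ($2 in A)                 # inventory: $1 ref, $2 year, $3.. colours
               for (i = 3; i <= NF; i++)
                   if ($i in B) print $1, $2, $i }
' FS='[ \t,]+' OFS='|' f=1 list1.tab f=2 list2.tab f=3 inventory)
printf '%s\n' "$result"
```

If a line carries several listed colours, this prints one output line per matching colour.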

 