Use strings from nth field from one file to match strings in entire line in another file, awk

# 1  
Old 01-15-2018

I cannot seem to get what should be a simple awk one-liner to work correctly and cannot figure out why. I would like to use patterns from a specific field in one file as regex to search for matching strings in the entire line ($0) of another file.

I would like to output the lines of File2 which contain the string of $2 in File1.

File1
Code:
PS002,002 RZN        ?           0     1  0  0  1  4 -1     6  0  3  2     2  2  2  1     -1     -1    -1   0 502   0
PS003,001 BRX        ?           0     1  1  0  1  1 -1     4  0  0  0     1  1  1 -1     -1     -1    -1   0 501   0
PS006,009 P<L        ?           0     1  0  0  1  5 -1     6  0  3  2     1  2  0 -1      2     -1    -1  -1  -1  -1
PS007,001 CJR=       ?           0     1  0  0  1 -1 -1     2  3  1  2    -1  1  1 -1     -1     -1    -1   0 501   0
PS017,003 ZMM        ?           0     1  0  0  6 -1 -1     2  1  1  0    -1  1  1 -1     -1     -1    -1   0 501   0
PS017,004 CMR        ?           0     1  0  0  6 -1 -1     2  1  1  0    -1  1  1 -1     -1     -1    -1   0 501   0
PS018,001 >JB        ?           0     1  0  0  1  5 -1     6  0  3  2     1  2  0 -1    306     -1    -1  -1  -1  -1
PS018,002 >MR        ?          -1     1  2  0  1 -1 -1    11  3  1  2    -1  1  1 -1     -1     -1    -1   0 501   0
PS018,018 FN>        ?           0     1  0  0  1  5 -1     6  0  3  2     1  2  0 -1     -1     -1    -1  -1  -1  -1

File2
Code:
PS003,001 MZMWR/ L-DWD// *
PS003,001 B-!!BRX[/+W M(N-PN(H/J >BCLWM// BN/+W *
PS004,001 L-(H-1M]]NYX[/ B-NGJN(H/WT MZMWR/ L-DWD// *
PS016,001 MKTM/ L-DWD// *
PS017,001 TPL(H/H L-DWD// *
PS018,001 L-(H-1M]]NYX[/ L-<BD/ JHWH// L-DWD// >CR ]]DBR[ L-JHWH// >T DBR/J H-CJR(H/H H-Z>T *
PS018,001 B-JWM/ ]H](NY1JL[ JHWH// >1WT+W M(N-KP/ KL/ >JB[/J+W W-M(N-JD/ C>WL// *
PS019,001 L-(H-1M]]NYX[/ MZMWR/ L-DWD// *

Desired Output:
(These two lines contain the strings BRX and >JB from $2 of File1)
Code:
PS003,001 B-!!BRX[/+W M(N-PN(H/J >BCLWM// BN/+W *
PS018,001 B-JWM/ ]H](NY1JL[ JHWH// >1WT+W M(N-KP/ KL/ >JB[/J+W W-M(N-JD/ C>WL// *

It seems to me that either of the two following awk one-liners should do the trick:
Code:
awk 'NR==FNR{A[$0]++;next}($2 in A){print A[$0]}' File2 File1

Code:
awk 'NR==FNR{A[$1]=$0;next}$2 in A{print A[$1]}' File2 File1

I have even attempted to store each record of File2 in a variable for use in the print statement:
Code:
awk 'NR==FNR{x=$0; A[x];next}$2 in A {print x}' File2 File1

However, these keep returning nothing.

Nevertheless I can get it to work with grep:
Code:
grep -f <(awk '{print $2}' File1) File2

I would like to get the forum's help as to why my awk code is failing.
# 2  
Old 01-16-2018
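Your awk code fails because the (var in array) construct tests for an EXACT (not partial) match of var against the array's indices. In your attempts the indices are entire lines (or $1) of File2, so a short string such as BRX from File1's $2 is never one of them. Your third attempt comes closest, but it still tries to match File1's $2 against entire lines of File2.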
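
To make the difference concrete, here is a minimal sketch (not from the original posts) that stores one File2 line as an array index and then tests the string BRX against it in three ways. The function name in_vs_substring is hypothetical and only there so the demo can be rerun.

```shell
# in_vs_substring is a hypothetical name used only for this demo.
in_vs_substring() {
    awk 'BEGIN {
        # One line taken from File2; it becomes the array index, as in the failing attempts.
        line = "PS003,001 B-!!BRX[/+W M(N-PN(H/J >BCLWM// BN/+W *"
        A[line]
        key = "BRX"    # stands in for $2 of File1

        # (key in A) asks whether key IS an index, not whether an index contains it.
        print "in:    " ((key in A) ? "match" : "no match")
        # index() and ~ both look for key inside the line.
        print "index: " (index(line, key) ? "match" : "no match")
        print "regex: " ((line ~ key) ? "match" : "no match")
    }'
}

in_vs_substring
```

Only the last two tests should report a match, which is why a loop over the stored patterns is needed instead of a single (var in array) lookup.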

How about
Code:
awk 'NR == FNR {T[$2]; next} {for (t in T) if ($0 ~ t) print}' File[12]
PS003,001 B-!!BRX[/+W M(N-PN(H/J >BCLWM// BN/+W *
PS018,001 B-JWM/ ]H](NY1JL[ JHWH// >1WT+W M(N-KP/ KL/ >JB[/J+W W-M(N-JD/ C>WL// *
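
The loop above treats every $2 value as a regular expression and prints a line once for each pattern that matches it. The sample patterns happen to contain no regex metacharacters, so that is fine here. If the strings should be taken literally and each line printed at most once, a variant along the following lines might work. It is only a sketch: the helper name match_literal is hypothetical, and the data files are abridged copies of the samples (File1 cut to its first three fields).

```shell
# Abridged copies of the sample data, written to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/File1" <<'EOF'
PS003,001 BRX        ?
PS007,001 CJR=       ?
PS018,001 >JB        ?
EOF
cat > "$dir/File2" <<'EOF'
PS003,001 MZMWR/ L-DWD// *
PS003,001 B-!!BRX[/+W M(N-PN(H/J >BCLWM// BN/+W *
PS018,001 L-(H-1M]]NYX[/ L-<BD/ JHWH// L-DWD// >CR ]]DBR[ L-JHWH// >T DBR/J H-CJR(H/H H-Z>T *
PS018,001 B-JWM/ ]H](NY1JL[ JHWH// >1WT+W M(N-KP/ KL/ >JB[/J+W W-M(N-JD/ C>WL// *
EOF

# match_literal is a hypothetical helper name, not something from the thread.
# $1 = file whose second field supplies the strings, $2 = file to search.
match_literal() {
    awk 'NR == FNR { T[$2]; next }      # first file: collect the $2 values
         { for (t in T)                 # second file: try every stored value
               if (index($0, t)) {      # literal substring test, no regex
                   print
                   next                 # stop after the first hit for this line
               }
         }' "$1" "$2"
}

match_literal "$dir/File1" "$dir/File2"
```

On the abridged data this should print the BRX and >JB lines once each. The grep command from the first post could be made literal in the same spirit by adding -F.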
