Multiple Column print after lookup using NR==FNR (awk)


 
# 1  
Old 01-11-2011

Code:
foo.txt
FAMID   IID     AFF     SEX     Group   AgeCat  Dis1      Dis2     Dis3      Dis4   Dis5  Dis6  Dis6
AMD0001 Mayo_49542     1       2       AMD     8       1       1       1       1       1       1       1
AMD0002 Mayo_49606     1       1       AMD     3       1       1       1       1       1       1       1
AMD0003 Mayo_49702     1       2       AMD     3       1       1       1       1       1       1       1
AMD0004 Mayo_49544     1       1       AMD     5       1       1       1       1       1       1       1
AMD0005 Mayo_49672     1       1       AMD     3       1       1       1       1       1       1       1

Code:
bar.txt
FAMID   IID     V1    V2    V3    V4    V5    V6    V7    V8    V9    V10 Status
AMD0001        Mayo_49542    -0.0053     -0.0046      0.0036     -0.0059      0.0019      0.0029      0.0022      0.0078      0.0043     -0.0001          Control
AMD0003        Mayo_49702    -0.0045     -0.0041      0.0033      0.0016      0.0004     -0.0018     -0.0004      0.0003     -0.0021     -0.0041          Control
AMD0004        Mayo_49544    -0.0037     -0.0028      0.0036      0.0003     -0.0002     -0.0019      0.0035      0.0023      0.0077     -0.0005          Control
AMD0002        Mayo_49606    -0.0052     -0.0045      0.0035     -0.0022      0.0043     -0.0018      0.0072      0.0116     -0.0040     -0.0028          Control
AMD0005        Mayo_49672    -0.0035     -0.0060      0.0008     -0.0029      0.0037      0.0011     -0.0212      0.0034      0.0024      0.0117          Control

Code:
foobar.txt

FAMID   IID     AFF     SEX     Group   AgeCat  Dis1      Dis2     Dis3      Dis4   Dis5  Dis6  Dis6    V1    V2    V3    V4    V5    V6    V7    V8    V9    V10 Status
AMD0001 Mayo_49542     1       2       AMD     8       1       1       1       1       1       1       1    -0.0053     -0.0046      0.0036     -0.0059      0.0019      0.0029      0.0022      0.0078      0.0043     -0.0001          Control
AMD0002 Mayo_49606     1       1       AMD     3       1       1       1       1       1       1       1    -0.0052     -0.0045      0.0035     -0.0022      0.0043     -0.0018      0.0072      0.0116     -0.0040     -0.0028          Control
AMD0003 Mayo_49702     1       2       AMD     3       1       1       1       1       1       1       1    -0.0045     -0.0041      0.0033      0.0016      0.0004     -0.0018     -0.0004      0.0003     -0.0021     -0.0041          Control
AMD0004 Mayo_49544     1       1       AMD     5       1       1       1       1       1       1       1    -0.0037     -0.0028      0.0036      0.0003     -0.0002     -0.0019      0.0035      0.0023      0.0077     -0.0005          Control
AMD0005 Mayo_49672     1       1       AMD     3       1       1       1       1       1       1       1    -0.0035     -0.0060      0.0008     -0.0029      0.0037      0.0011     -0.0212      0.0034      0.0024      0.0117          Control

I want to look up each row of foo.txt in bar.txt and append the matching columns, as shown in foobar.txt above. I have used NR==FNR before to look up a single column, but I don't know how to pull in multiple columns with the same method.
The closest thread I found is "Join 2 files with multiple columns: awk/grep/join?" (The UNIX and Linux Forums), but it gives slightly different output from what I need.
I'm hoping someone can help; awk seems so much nicer than doing this in Excel with VLOOKUP. The files are not sorted, and duplicate lines are not expected.
Thanks
~ GH
# 2  
Old 01-11-2011
Try this:
Code:
awk 'NR==FNR{s=$1; sub(".*"$2,"");a[s]=$0; next}
a[$1]{print $0 a[$1]}
' bar.txt foo.txt

# 3  
Old 01-11-2011
Franklin,

So I tried that with the sample data as provided here and it worked perfectly.
I ran the same command on my actual data (which has exactly the structure given here) and it did not work. The output I got is below. In Notepad++, my Windows text editor, the two parts of each row appeared on separate lines.
When I opened the output in vi, I found a ^M character (right after the final 1) at the junction between the first and second file.
I found another thread here that dealt with the same problem: http://bit.ly/cq0JaJ
I used the first solution there, by anbu23, and it worked for me.

I know the data has the same structure, since I copied and pasted the first 5 lines from it. I am wondering whether the tabs were a problem when pasting into Windows?
Any ideas or different code?
Code:
    -0.0053     -0.0046      0.0036     -0.0059      0.0019      0.0029      0.0022      0.0078      0.0043     -0.0001          Control
    -0.0052     -0.0045      0.0035     -0.0022      0.0043     -0.0018      0.0072      0.0116     -0.0040     -0.0028          Control
    -0.0045     -0.0041      0.0033      0.0016      0.0004     -0.0018     -0.0004      0.0003     -0.0021     -0.0041          Control
    -0.0037     -0.0028      0.0036      0.0003     -0.0002     -0.0019      0.0035      0.0023      0.0077     -0.0005          Control
    -0.0035     -0.0060      0.0008     -0.0029      0.0037      0.0011     -0.0212      0.0034      0.0024      0.0117          Control
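
One way to confirm that carriage returns are the culprit is to count the lines that end in one. This is only a minimal sketch, assuming the stray ^M is a DOS carriage return; the file names crlf_demo.txt and crlf_count.txt are hypothetical stand-ins, not files from this thread.

```shell
# Hypothetical stand-in file: two of its three lines have DOS (CRLF) endings.
printf 'a 1\r\nb 2\r\nc 3\n' > crlf_demo.txt

# Count the records that end in a carriage return; n+0 prints 0 when none match.
awk '/\r$/ { n++ } END { print n+0 }' crlf_demo.txt > crlf_count.txt
cat crlf_count.txt
```

A non-zero count on the real input files would suggest they were saved with Windows line endings.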


Last edited by genehunter; 01-11-2011 at 11:04 PM..
# 4  
Old 01-12-2011
It seems that it doesn't print the lines ($0) of the second file.
Code:
print $0 a[$1]

Have you typed the command correctly?
# 5  
Old 01-17-2011
Hi Franklin52,

The script worked fine; it prints both the first and the second file. However, where it joins them, there is a ^M character in between that causes the second file's columns to be printed on the next line in DOS.
I used the solution by anbu23 mentioned in my previous post to solve it.

Code:
FAMID   IID     AFF     SEX     Group   AgeCat  Dis1      Dis2     Dis3      Dis4   Dis5  Dis6  Dis6 ^M   V1    V2    V3    V4    V5    V6    V7    V8    V9    V10 Status
AMD0001 Mayo_49542     1       2       AMD     8       1       1       1       1       1       1       1    ^M    -0.0053     -0.0046      0.0036     -0.0059      0.0019      0.0029      0.0022      0.0078      0.0043     -0.0001          Control
AMD0002 Mayo_49606     1       1       AMD     3       1       1       1       1       1       1       1    ^M    -0.0052     -0.0045      0.0035     -0.0022      0.0043     -0.0018      0.0072      0.0116     -0.0040     -0.0028          Control
AMD0003 Mayo_49702     1       2       AMD     3       1       1       1       1       1       1       1    ^M    -0.0045     -0.0041      0.0033      0.0016      0.0004     -0.0018     -0.0004      0.0003     -0.0021     -0.0041          Control
AMD0004 Mayo_49544     1       1       AMD     5       1       1       1       1       1       1       1    ^M    -0.0037     -0.0028      0.0036      0.0003     -0.0002     -0.0019      0.0035      0.0023      0.0077     -0.0005          Control
AMD0005 Mayo_49672     1       1       AMD     3       1       1       1       1       1       1       1    ^M    -0.0035     -0.0060      0.0008     -0.0029      0.0037      0.0011     -0.0212      0.0034      0.0024      0.0117          Control

~GH
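
For readers who hit the same ^M problem, one possible fix is to strip the carriage return inside the awk program itself. The sketch below assumes the ^M comes from DOS (CRLF) line endings; it is not necessarily anbu23's solution, which is not quoted in this thread. The file names and the shortened sample rows are hypothetical stand-ins, and `$1 in a` is used in place of `a[$1]` as a small substitution.

```shell
# Hypothetical stand-ins: foo_dos.txt has CRLF endings, bar_small.txt does not.
printf 'FAMID IID AFF\r\nAMD0001 Mayo_49542 1\r\nAMD0002 Mayo_49606 1\r\n' > foo_dos.txt
printf 'FAMID IID V1 Status\nAMD0002 Mayo_49606 -0.0052 Control\nAMD0001 Mayo_49542 -0.0053 Control\n' > bar_small.txt

awk '
{ sub(/\r$/, "") }                                # drop a trailing carriage return, if any
NR==FNR { s=$1; sub(".*"$2, ""); a[s]=$0; next }  # first file: store columns after field 2
$1 in a { print $0 a[$1] }                        # second file: append them on a key match
' bar_small.txt foo_dos.txt > foobar_small.txt
```

Because the first rule runs on every record of both files, it does not matter which of the two inputs carries the DOS line endings.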
# 6  
Old 05-08-2011
I tried to understand this code but could not work it out.
Can someone please explain it?
Thanks

Code:
awk 'NR==FNR{s=$1; sub(".*"$2,"");a[s]=$0; next} a[$1]{print $0 a[$1]} ' bar.txt foo.txt

# 7  
Old 05-09-2011
Code:
awk 'NR==FNR{s=$1; sub(".*"$2,"");a[s]=$0; next} a[$1]{print $0 a[$1]} ' bar.txt foo.txt

Explanation:
Code:
NR==FNR

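This condition is true only while awk reads the first file (bar.txt): NR counts records across all input, FNR restarts at 1 for each file, so the two are equal only in the first file.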
Code:
s=$1

Capture the first field in variable s to use it as an index of the array a.
Code:
sub(".*"$2,"")

Remove everything up to and including the second field, so that only the remaining columns are left in $0.
Code:
a[s]=$0

Store the remaining value of $0 in element a[s].


Code for the second file:
Code:
a[$1]{print $0 a[$1]}

If a[$1] is non-empty, that is, the first field was stored as an index of array a while reading bar.txt, print the line followed by the content of that array element.
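
The steps above can be put together as one commented program. This is a sketch run on tiny stand-in files (the names lookup_demo.txt, main_demo.txt and joined_demo.txt and their contents are hypothetical), not the poster's real data. It also swaps the pattern `a[$1]` for `$1 in a`, which tests whether the key exists instead of whether the stored text is non-empty.

```shell
# Hypothetical stand-ins: the lookup file is read first, the main file second.
printf 'k1 id1 x1 y1\nk2 id2 x2 y2\n' > lookup_demo.txt
printf 'k2 id2 p\nk1 id1 q\nk3 id3 r\n' > main_demo.txt

awk '
NR==FNR {                     # true only while reading the first file named
    s = $1                    # remember the key before $0 is changed
    sub(".*"$2, "")           # cut everything up to and including field 2
    a[s] = $0                 # keep the remainder (leading blank included)
    next                      # skip the second-file rule for this record
}
$1 in a { print $0 a[$1] }    # second file: append stored columns on a key match
' lookup_demo.txt main_demo.txt > joined_demo.txt
```

With these stand-ins, joined_demo.txt holds two lines, `k2 id2 p x2 y2` and `k1 id1 q x1 y1`, in the order of the second file; `k3` has no entry in the lookup file and is dropped.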