Command for vlookup function


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers Command for vlookup function
# 1  
Old 12-13-2014
Command for vlookup function

Hello experts,

I have large text files that need to be arranged using a function like excel's vlookup. I have been playing with awk command but didn't really come up with a solution. Could anyone please help me out? Below are my datasets and expected output. Any help would be greatly appreciated.

<File 1>
Code:
contig-100_1 5
contig-100_30 10
contig-100_50 20
contig-100_40 15
contig-100_60 25

<File 2>
Code:
contig-100_1 "Bacteria"
contig-100_50 "Bacteria A"
contig-100_60 "Bacteria A, Fungi B"

<Expected output>
Code:
contig-100_1 5 Bacteria
contig-100_30 10 NA
contig-100_50 20 Bacteria A
contig-100_40 15 NA
contig-100_60 25 Bacteria A, Fungi B


Last edited by Don Cragun; 12-13-2014 at 04:11 PM.. Reason: Add CODE tags.
# 2  
Old 12-13-2014
While this is a commonly asked request, but that the quoted field contains the field separator (a space) makes it a little more complicated. But with a little preprocessing:
Code:
{
  sed -e 's/ "/"/' ${2} 
  echo '""*'
  sed -e 's/ /"/g'  ${1}
} \
| awk -F\" '$3 == "*" { mark++; next; } mark == "" { X[$1] = $2; next; } { print $1, $2, $1 in X ? X[$1] : "NA"; }'

which generates the expected results:
Code:
contig-100_1 5 Bacteria
contig-100_30 10 NA
contig-100_50 20 Bacteria A
contig-100_40 15 NA
contig-100_60 25 Bacteria A, Fungi B


Last edited by derekludwig; 12-13-2014 at 04:04 PM.. Reason: better file separator...
# 3  
Old 12-13-2014
Thanks derekludwig!

May I ask you one more question?
I tried your command with the following.
Script.sh File1 File2 > output

But the output looks odd. It has LN:10832 (any numbers) in the third column and I don't understand where those numbers came from. Please correct me if I am wrong with understanding your command. Sorry that I am so beginner.
# 4  
Old 12-13-2014
Instead of invoking sed twice and awk once, try just using:
Code:
awk -F' *"' '
FNR == NR {	v[$1] = $2; next }
{		print $0, ($1 in v ? v[$1] : "NA") }
' "$2" FS=' ' "$1"

You haven't said what OS or shell you're using. If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
# 5  
Old 12-13-2014
Thanks Don Cragun!

I am working on Mac OS. I have executed your command with bash shell. But the third column in the output file is all "NA"...
# 6  
Old 12-13-2014
You may need the double quotes in your processing to follow. Try
Code:
awk 'NR==FNR {T[$1]=$0;next} $1 in T {X=$1; $1=T[$1]; delete T[X]} 1; END {for (t in T) print T[t], "N/A"}' file1 file2
contig-100_1 5 "Bacteria"
contig-100_50 20 "Bacteria A"
contig-100_60 25 "Bacteria A, Fungi B"
contig-100_40 15 N/A
contig-100_30 10 N/A

Pipe it through a sorting step if need be.
# 7  
Old 12-13-2014
Dear RudiC,

That worked perfect. Thanks all for your help!
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Vlookup not using awk

Hi I just want again to ask for help on what command to use to vlookup f1 group name in "/etc/group" matching f3 of it to "/etc/passwd" f4. I do need to display group name in the output of /etc/passwd without using awk or NR==FNR command. thank you while IFS=: read -r f1 f2 f3 f4 f5 f6 f7... (4 Replies)
Discussion started by: joonisio
4 Replies

2. UNIX for Beginners Questions & Answers

Vlookup on 2 files - inserting vlookup command on another command

Hello, i am trying to print group name column(etc/group) on script (etc/passwd) since group name is not listed on etc/passwd columns. Im trying to do a vlookup. but i cant figure out how i can insert the vlookup command FNR==NR inside the print out command or the output. I also tried exporting... (2 Replies)
Discussion started by: joonisio
2 Replies

3. Shell Programming and Scripting

Conditional Vlookup

Hi everyone, I need to replace values of column 2 array1 with values of column 2 array2 based on a lookup of column 4 value, but only return a value IF the values in column 1 of BOTH array1 and array2 match, otherwise keep original value in column 2 of array1. Both files are tab delimited... (2 Replies)
Discussion started by: Geneanalyst
2 Replies

4. Shell Programming and Scripting

Vlookup in Linux

Hello Everybody I am looking for vlookup like functionality in Linux since two files I have are very big(1000MB each) and its not opening completely in excel. Here the requirement file1 11,12,13 16,14,12 28,21,22 22,23,24 file 3 18,16,16 14,12,12 23,22,24 16,11,13 here... (8 Replies)
Discussion started by: shanul karim
8 Replies

5. Shell Programming and Scripting

Excel vlookup function like value mapping with awk

I have two files File1 175552 st_497858.1 rs86052.1 rs92185.1 st_001022416.1 174841 175552_174841 179912 st_001122967.2 rs90435.1 rs89122.1 st_001022583.1 175545 179912_175545 179912 st_001122967.2 rs90435.1 rs89122.1 st_001022584.1 175545 179912_175545 179967 st_001256606.1 rs93516.2... (1 Reply)
Discussion started by: sammy777888
1 Replies

6. Shell Programming and Scripting

Vlookup using awk

Hi folks, awk 'NR==FNR {m=$0; next} $1 in m{$0=m} {print}' file2 file1 Works a charm for a vlookup type query, sourced from https://www.unix.com/shell-programming-and-scripting/215998-vlookup-using-awk.html However my column content has white spaces and numbers. Example file1 The Man... (6 Replies)
Discussion started by: pshields1984
6 Replies

7. Shell Programming and Scripting

Vlookup using awk

Hello, I am trying to use vlookup (Excel function) using awk but there is some problem :( --file1-- ABC123 101F X1 A $P=Z X2 A $P=X X3 B $P=F X4 C $P=G MNK180 END --file2-- X1 A_t $P=Z X2 A_t $P=X X3 B_u $P=F X4 C_o $P=G (2 Replies)
Discussion started by: young
2 Replies

8. Shell Programming and Scripting

vlookup files

hi frnds i have 2 files. 1st is dddd and 2nd is ssss ==> dddd <==: 1,charit 2,gilhotra ==> ssss <==: 1,sajan 2,doda 3,hello and i want o/p ...mean join and vlookup both files sajan,charit (4 Replies)
Discussion started by: dodasajan
4 Replies

9. Shell Programming and Scripting

Vlookup functionality in Bash

Hi please help in the below request Input File 1: a,1 b,2 c,3 d,4 e,5 f,6 Input File 2: 2 5 3 1 Output Required: b 2 e 5 c 3 (3 Replies)
Discussion started by: ravin
3 Replies

10. Shell Programming and Scripting

Want to implement VLOOKUP (Excel function) in Unix

Dear All, i want to implement vookup function which is there in excel into Unix. Suppose i have 2 files. The files are given below. File1: MSC Cell SDCA Patna-1 12 Bihar Patna-2 45 Ranchi Bhopal-1 85 Raigarh Bhopal-2 ... (8 Replies)
Discussion started by: pravani1
8 Replies
Login or Register to Ask a Question