awk merge matching columns


 
# 1  
Old 06-20-2016

I know I'm not the first to ask this, but my code still does not work.
File 1:
Code:
gi|1283| tRNAscan exon 87020 88058 . - . transcript_id "Parent=tRNA-Tyr5.r01";
gi|3283| tRNAscan exon 97020 97058 . + . transcript_id "Parent=tRNA-Tyr6.r01";
gi|4283| rRNAscan exon 197020 197058 . - . transcript_id "Parent=rRNA-Tyr1.r01";
gi|5283| mRNAscan exon 295020 298059 . + . transcript_id "Parent=mRNA-Tyr2.r01";

This file is tab-separated.
File 2:
Code:
"Parent=tRNA-Tyr6.r01"; 12
"Parent=mRNA-Tyr2.r01"; 0

This file is also tab-separated.
Desired output:
Code:
"Parent=tRNA-Tyr6.r01"; 12 -
"Parent=mRNA-Tyr2.r01"; 0 +

I want to merge these two files on column $10 of file 1 and column $1 of file 2 (for example "Parent=tRNA-Tyr6.r01";), appending column $7 of file 1 (- or +) to each matching line of file 2.
My attempt looks like this:
Code:
awk 'FNR==NR{a[$10]=$7;next} ($1 in a) {print $1,"2,a[$1]}' file2 file1 > Output

Can anyone help me out?
Best regards,
Mo




# 2  
Old 06-20-2016
Hi Mo, the desired output you posted does not match your specification, because
Code:
Parent=tRNA-Tyr6.r01

has a + in file1, but your desired output shows a -.
Your desired output also includes the last field from file2, which your specification does not mention.

Anyway, if it's any help, the following code

Code:
awk 'NR==FNR{a[$1];next} $10 in a {print $10,$7}' file2 file1

will give this output:

Code:
"Parent=tRNA-Tyr6.r01"; +
"Parent=mRNA-Tyr2.r01"; +
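
For readers new to the `NR==FNR` idiom, here is the same command spread over several lines with comments. This is only a sketch: the scratch directory and the names `tmp` and `matches` are made up for illustration, and the sample data is assumed to be tab-separated with every column in its own field.

```shell
# Recreate the sample data from post #1 (spaces converted to tabs).
tmp=$(mktemp -d)
tr ' ' '\t' > "$tmp/file1" <<'EOF'
gi|1283| tRNAscan exon 87020 88058 . - . transcript_id "Parent=tRNA-Tyr5.r01";
gi|3283| tRNAscan exon 97020 97058 . + . transcript_id "Parent=tRNA-Tyr6.r01";
gi|4283| rRNAscan exon 197020 197058 . - . transcript_id "Parent=rRNA-Tyr1.r01";
gi|5283| mRNAscan exon 295020 298059 . + . transcript_id "Parent=mRNA-Tyr2.r01";
EOF
tr ' ' '\t' > "$tmp/file2" <<'EOF'
"Parent=tRNA-Tyr6.r01"; 12
"Parent=mRNA-Tyr2.r01"; 0
EOF

# NR counts lines across all input files, FNR restarts at 1 for each file,
# so NR == FNR is true only while the first file (file2 here) is being read.
matches=$(awk '
    NR == FNR { a[$1]; next }      # first file: store the key as an array index
    $10 in a  { print $10, $7 }    # second file: key was stored, print key and column 7
' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$matches"
rm -rf "$tmp"
```

One caveat with this idiom: if the first file is empty, `NR == FNR` also holds for the second file, so the first rule would run on it instead.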


# 3  
Old 06-20-2016
Thank you for the fast reply!

This was just a copy/paste error!

Is there a simple way to add the second column of file 2 to the output file?
# 4  
Old 06-20-2016
If you did want the last field from file2 you could do the following:



Code:
awk 'NR==FNR{a[$1]=$2;next} $10 in a {print $10, a[$10],$7}' file2 file1

This would give the output:



Code:
"Parent=tRNA-Tyr6.r01"; 12 +
"Parent=mRNA-Tyr2.r01"; 0 +
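
Compared with the attempt in post #1, two things appear to have gone wrong there. The `"2` in the print statement opens a string that is never closed (presumably `$2` was meant), and the files were given in the opposite order, so `a[$10]` was filled while reading file2, which has no tenth field. A hedged sketch of that attempt with both points corrected follows. The scratch directory, the names `tmp` and `fixed`, and the `-v OFS='\t'` setting (on the assumption that the output should be tab-separated like the inputs) are additions for illustration only.

```shell
# Recreate the sample data from post #1 (spaces converted to tabs).
tmp=$(mktemp -d)
tr ' ' '\t' > "$tmp/file1" <<'EOF'
gi|1283| tRNAscan exon 87020 88058 . - . transcript_id "Parent=tRNA-Tyr5.r01";
gi|3283| tRNAscan exon 97020 97058 . + . transcript_id "Parent=tRNA-Tyr6.r01";
gi|4283| rRNAscan exon 197020 197058 . - . transcript_id "Parent=rRNA-Tyr1.r01";
gi|5283| mRNAscan exon 295020 298059 . + . transcript_id "Parent=mRNA-Tyr2.r01";
EOF
tr ' ' '\t' > "$tmp/file2" <<'EOF'
"Parent=tRNA-Tyr6.r01"; 12
"Parent=mRNA-Tyr2.r01"; 0
EOF

# file1 is read first to build the lookup a[key] = column 7;
# each file2 line whose key is in the lookup is printed with that value.
# OFS='\t' (an assumption) makes the output tab-separated.
fixed=$(awk -v OFS='\t' \
    'FNR==NR {a[$10]=$7; next} ($1 in a) {print $1, $2, a[$1]}' \
    "$tmp/file1" "$tmp/file2")
printf '%s\n' "$fixed"
rm -rf "$tmp"
```

Because file2 is now the second file, the output follows the line order of file2 rather than file1. With the sample data both orders happen to be the same.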

# 5  
Old 06-20-2016
TY very much
# 6  
Old 06-20-2016
Or maybe the other way round, reading file1 first:

Code:
awk 'NR == FNR {T[$10] = $7; next} {print $0, T[$1]}' file1 file2
"Parent=tRNA-Tyr6.r01"; 12 +
"Parent=mRNA-Tyr2.r01"; 0 +
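
One difference worth noting: this version has no `in` test, so it prints every line of file2, and a key that is missing from file1 simply gets an empty last field. The sketch below makes that case visible with a placeholder. The third file2 line (`xRNA-None`), the `NA` marker, the tab `OFS`, and the names `tmp` and `all` are hypothetical additions, not part of the original data.

```shell
# Recreate the sample data from post #1 (spaces converted to tabs).
tmp=$(mktemp -d)
tr ' ' '\t' > "$tmp/file1" <<'EOF'
gi|1283| tRNAscan exon 87020 88058 . - . transcript_id "Parent=tRNA-Tyr5.r01";
gi|3283| tRNAscan exon 97020 97058 . + . transcript_id "Parent=tRNA-Tyr6.r01";
gi|4283| rRNAscan exon 197020 197058 . - . transcript_id "Parent=rRNA-Tyr1.r01";
gi|5283| mRNAscan exon 295020 298059 . + . transcript_id "Parent=mRNA-Tyr2.r01";
EOF
# The last line is a hypothetical key that does not occur in file1.
tr ' ' '\t' > "$tmp/file2" <<'EOF'
"Parent=tRNA-Tyr6.r01"; 12
"Parent=mRNA-Tyr2.r01"; 0
"Parent=xRNA-None.r01"; 5
EOF

# Every file2 line is printed; keys missing from file1 get "NA"
# instead of an empty trailing field.
all=$(awk -v OFS='\t' '
    NR == FNR { T[$10] = $7; next }
    { print $0, (($1 in T) ? T[$1] : "NA") }
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$all"
rm -rf "$tmp"
```

Whether unmatched lines should be kept at all depends on what the merged file is for. The versions with `$10 in a` drop them instead.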
