This is absolutely wonderful! ... :-)
Here is my understanding of Franklin52's code:
Unix Manuals - AWK Reference
==: the “is equal to” comparison operator.
tolower(string): Return the string with all upper case characters replaced with their lower case equivalents.
toupper(string): Return the string with all lower case characters replaced with their upper case equivalents.
FNR: Record number within the current input file (it starts again at 1 for each new file).
NR: Total number of records processed so far, across all input files.
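A quick way to see the difference between FNR and NR is to print both while awk reads two files in a row. This is only a sketch: the two tiny files and the variable names (demo_dir, counters) are made up for illustration.

```shell
# Two throwaway files with invented contents.
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/first.txt"
printf 'c\nd\ne\n' > "$demo_dir/second.txt"

# Print FNR, NR and the line itself for every record of both files.
counters=$(awk '{ print FNR, NR, $0 }' "$demo_dir/first.txt" "$demo_dir/second.txt")
printf '%s\n' "$counters"
# prints:
# 1 1 a
# 2 2 b
# 1 3 c
# 2 4 d
# 3 5 e
```

FNR and NR agree only on the lines of the first file, which is why FNR==NR can serve as a test for "still reading the first file".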
Thus, the above script translates (? - please correct me if I am mistaken) as
awk '
FNR==NR{a[tolower($1)]=$2;next}
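while the record number within the current file (FNR) equals the total number of records read so far (NR), which is true only while awk is reading the first file named on the command line (the lookup file, File_2), do all of the following:
take $1 (the common gene name) from the lookup file (File_2) and convert it to lowercase (required since the corresponding field in File_1 is lowercase; otherwise, it will fail to “match”, because awk string comparisons are case-sensitive); store the (already uppercase) systematic gene name ($2) from the same lookup table in the array a under that lowercase key; then skip straight to the next record (line), so the second rule is never run on File_2;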
tolower($1) in a{print "1 " a[tolower($1)] " tf " toupper($2)}
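now, for each line of the second file (File_1, the one to be converted), check whether its $1 (lowercased) is a key in the array a, i.e. whether the common name was found in the lookup table; if it is, print
“1”, the systematic name stored for that key ($2 from File_2); “tf”, $2 from File_1 (returned as uppercase, to convert the trailing lowercase c, w, -a, etc.); lines whose $1 is not in the lookup table print nothing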
' "File_2" "File_1"
File_1 = file to be processed (converted); listed second, so it is read after the lookup table is loaded
File_2 = “lookup file” ("common_to_systematic.tab"); listed first, so it is read first
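To check this reading, here is a small run of the same script on invented data. Everything about the sample is an assumption: the gene names are made up, and the layout (common name in $1 and systematic name in $2 of File_2; lowercase common name in $1 and a lowercase systematic name in $2 of File_1) is only what the explanation above implies. The variable names run_dir and converted are hypothetical as well.

```shell
# Invented sample data; the real files may be laid out differently.
run_dir=$(mktemp -d)
printf 'GENE1\tYAL001C\nGENE2\tYBR002W\n' > "$run_dir/File_2"
printf 'gene1 ybr002w\ngene3 yal001c\ngene2 ycr003c-a\n' > "$run_dir/File_1"

# Same two rules as above: load the lookup table, then convert File_1.
converted=$(awk '
FNR==NR{a[tolower($1)]=$2;next}
tolower($1) in a{print "1 " a[tolower($1)] " tf " toupper($2)}
' "$run_dir/File_2" "$run_dir/File_1")
printf '%s\n' "$converted"
# prints:
# 1 YAL001C tf YBR002W
# 1 YBR002W tf YCR003C-A
```

The gene3 line produces no output because gene3 is not a key in the lookup table, and the trailing -a of ycr003c-a comes back as -A through toupper($2).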
?!
This works brilliantly!! Thank you so much, Franklin52!!
Have a super weekend! ... Greg :-)