I thought I had this figured out but was wrong so am humbly asking for help.
The task is to add an additional column to FILE 1 based on records in FILE 2.
The key is in COLUMN 1 for FILE 1 and in COLUMN 1 OR COLUMN 2 for FILE 2.
I want to add the third column from FILE 2 to the beginning of FILE 1 so that the new FILE shows for example:
DESIRED FILE 3
FILE 1
FILE 2
I have tried the join command but that doesn't allow searching either of column 1 or 2 for matches.
If there is no record of the column 1 FILE 1 value in FILE 2 I would like to put an "NA".
I have tried awk solutions to the problem but am not skilled enough to accomplish the desired output. Any help would be much appreciated.
Thank you -this is closer to a solution than I've been in several days.
I tried out the script and did some manual sanity checks.
The code correctly identifies both column 1 and column 2 values in FILE 2. However, it only seems to add the column 3 value in FILE 2 if the matched value in FILE 2 was found in column 1.
I am wondering if this part of the code needs to be modified? Does $1 in F22 refer to the first column in the created matrix?
If there is no match in any of the columns the row is eliminated from the output, which actually isn't much of a problem though.
And now it works like a charm and produced exactly the output I was looking for. Thanks !!
---------- Post updated at 01:01 PM ---------- Previous update was at 12:50 PM ----------
The main question is solved thanks to vgersh99. I have a bonus question, if I would like to run this awk line on multiple FILES 1 using the same reference FILE2, would something along these lines do the trick?
With a VERY sloppy interpretation of "along these lines" you might come close to the desired result, once you corrected the syntax / redirection error in the before-last line, and accepted the higher resource cost as you run the script multiple times.
Why not sth. along THIS line :
-I suspected the suggested code was sloppy -I'm a newbie.
As I understand, this part of your suggestions tells to take the column 1 of the f21 table. Then add the .pruned extension to the stdout file? Or does it process all files with the .pruned extension?
To be more clear, I have 400 of FILE 1 that should be matched to the FILE 2 table, of which there is only 1. The filename looks as in the below example. I would like to match all of the below FILE1 without having run them each at a time. They all have the same file extension. The resulting files should get an additional extension .pruned.
I also don't understand this part
Does the ^ mean that the files are combined?
Is it possible to use wildcard definition e.g. *.tab to process many different versions of FILE1?
Please know that I am very new to unix and trying to learn 'on the job'. I'm only manipulating large tab-delimited files (millions of rows), but I'm stuck and don't know how to proceed with the following. Hoping for some friendly advice :)
I have 2 tab-delimited files - with differing column &... (10 Replies)
Hello,
I have 40 data files where the first three columns are the same (in theory) and the 4th column is different. Here is an example of three files,
file 2: A_f0_r179_pred.txt
Id Group Name E0
1 V N(,)'1 0.2904
2 V N(,)'2 0.3180
3 V N(,)'3 0.3277
4 V N(,)'4 0.3675
5 V N(,)'5 0.3456
... (8 Replies)
Hi, I have a rquirement in unix as below .
I have a text file with me seperated by | symbol and i need to generate a excel file through unix commands/script so that each value will go to each column.
ex:
Input Text file:
1|A|apple
2|B|bottle
excel file to be generated as output as... (9 Replies)
Here's a sample of the data:
NAME BIRTHDAY SEX LOCATION AGE ID
Jim 05/11/1986 M Japan 27 86
Rei 08/25/1990 F Korea 24 33
Jane 02/24/1985 F India 29 78
I've been trying to sort files using the... (8 Replies)
Hi Forum.
I'm struggling to find a solution for the following issue.
I have multiple files a1.txt, a2.txt, a3.txt, etc. and I would like to insert a tab-delimited header record at the beginning of each of the files.
This is my code so far but it's not working as expected.
for i in... (2 Replies)
Hi Forum
I have a tab delimited file that opens well in Openoffice calc (excel). But when I perform any operation in command line, it reads the file incorrectly. When I 'save As' the same file in office as tab delimited then it works fine.
The file that I think is tab delimited is actually... (8 Replies)
I have a folder that contains say 50 files in a sequential order:
cdf_1.txt
cdf_2.txt
cdf_3.txt
cdf_3.txt
.
.
.
cdf_50.txt.
I need to merge these files in the same order into a single tab delimited file.
I used the following shell script:
for x in {1..50};
do cat cdf_${x}.txt >>... (3 Replies)
I have a tab-Delimited file:
Eg:
'test' file contains:
a<tab>b<tab>c<tab>....
Based on certain condition, I wanna increase the number of lines of this file.How do I do that
Eg:
If some value in the database is 1 then one line in 'test' file is fine..
If some value in the database is 2... (1 Reply)