merge two text files of different size on common index
I have two text files. text file 1:
text file 2:
I need to merge the two, keeping only the rows that appear in both files (the shorter list could be the index). The column filePath is the index, so the final file should look like.
I am guessing this could be done in awk, and certainly in perl, but I'm not sure how to do the alignment by the index.
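For readers following along, here is a minimal sketch of one common way to do this kind of join in awk. It is not necessarily the code that was posted in reply. The sample files, the column layout (filePath in field 2) and the function name merge_on_key are assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; the real a1 and a2 from the thread are not shown.
workdir=$(mktemp -d)
cat > "$workdir/a1" <<'EOF'
id filePath valueA
1 mol1.mol 0.5
2 mol3.mol 0.7
EOF
cat > "$workdir/a2" <<'EOF'
id filePath SUB_ID SOURCE
1 mol1.mol S1 srcA
2 mol2.mol S2 srcB
3 mol3.mol S3 srcC
EOF

# merge_on_key FILE1 FILE2 (hypothetical helper)
# Pass 1 (NR == FNR): store fields 3..NF of every FILE1 row under its key ($2).
# Pass 2: print each FILE2 row whose key was stored, with the stored fields appended.
merge_on_key() {
    awk '
        NR == FNR {
            rest = ""
            for (i = 3; i <= NF; i++) rest = rest " " $i
            extra[$2] = rest
            next
        }
        ($2 in extra) { print $0 extra[$2] }
    ' "$1" "$2"
}

merged=$(merge_on_key "$workdir/a1" "$workdir/a2")
printf '%s\n' "$merged"
rm -rf "$workdir"
```

In this sketch the header line is joined like any other row, because "filePath" appears as the key in both header lines. If the two files name that column differently, the header line drops out of the result.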
That worked great, except the header row never made it to the output file.
---------- Post updated at 02:16 PM ---------- Previous update was at 12:24 AM ----------
I have been working on the header row. If I do,
This comes close, but it prints the entire a1 file to temp.txt, not just the first row. It takes the first 5 fields from file a2 and is then supposed to append field 3 through the last field of file a1. The idea is to cobble together the header row, and then use the command above to fill in the rest of the file.
---------- Post updated at 02:20 PM ---------- Previous update was at 02:16 PM ----------
This seems to work, but overall this seems an odd way of adding the header row.
LMHmedchem
I'm no awk expert but..
Everything works fine here with your input files and guru's awk.
I'm getting the header as it should be, or am I missing something?
After running the code against your input files (a1 and a2), I'm getting the following.
Also, to generate headers via awk, remove the header from the data and use a BEGIN block, or put another pair of { } braces in front of guru's code (you will need to remove the header from the data first).
Something like :
or
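As a rough sketch of the BEGIN-block idea (not the code originally posted here): the header text is written explicitly, and the header rows of the input are skipped. Skipping them inside awk with FNR == 1, instead of removing them from the files beforehand, is my own substitution. The sample data and the header text are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data with header rows still present.
workdir=$(mktemp -d)
cat > "$workdir/a1" <<'EOF'
id filePath valueA
1 mol1.mol 0.5
2 mol3.mol 0.7
EOF
cat > "$workdir/a2" <<'EOF'
id filePath SUB_ID SOURCE
1 mol1.mol S1 srcA
2 mol2.mol S2 srcB
3 mol3.mol S3 srcC
EOF

with_header=$(awk '
    # Header is written once, before any input is read.
    BEGIN { print "id filePath SUB_ID SOURCE valueA" }
    # Skip the first line of each input file (the header rows).
    FNR == 1 { next }
    # First file: remember field 3 under the key in field 2.
    NR == FNR { extra[$2] = $3; next }
    # Second file: print rows whose key was seen, with the stored value appended.
    ($2 in extra) { print $0, extra[$2] }
' "$workdir/a1" "$workdir/a2")

printf '%s\n' "$with_header"
rm -rf "$workdir"
```

Writing the header in BEGIN means it no longer depends on the header names in the two files agreeing with each other.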
I found a bug in my data where the first two files had a different header name for one header. The header row is now correct, more or less.
There still seems to be an issue in that the last column has three columns of space delimited data in it.
The last two, SUB_ID and SOURCE are duplicate cols (already occur at $3,$4). These come from $3, $4 in a2. Each row should end with the sumSO2Am field.
I don't see where that is happening in the command, or I just don't get it. I see how the first 7 fields are being printed, but not the rest of each row. I can post some short test files if that would help.
Yes, apparently, my input files were not quite what I thought, so the field identifiers were not quite right. I have everything corrected now. The test files I am using have more cols than the sample I posted. I thought I had this working a bit ago, but now it seems it's not working again.
I have attached a .zip with 4 files: the a1 and a2 input files, the output I am hoping for, and the (incorrect) output I am now getting. I am just trying to add the values from the cols SUB_ID, SOURCE, ChBrg_REGID from file a2 to file a1. File a1 has fewer rows, so it is necessary to use the values in the a1 "filePath" col to find the matching row in a2. This is basically looking up the values for the 3 cols in the a2 file and adding them in right after the a1 "filePath" col. I won't always want all of the rest of the cols from a1, but it's just as well to leave them in, since I can edit the file further with cut, etc.
This is the command I am using,
This is part of a more involved script, but I'm just trying to get this part working.
I really thought I had it for a bit, but I guess not.
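One way to make this kind of lookup less sensitive to field numbers is to find the columns by header name. The following is only a sketch under assumptions: whitespace-delimited files, a header in the first row of each, and made-up sample data (the attached files are not reproduced here). The names key, want and looked_up are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data; column positions are deliberately not hard-coded.
workdir=$(mktemp -d)
cat > "$workdir/a1" <<'EOF'
id filePath sumSO2Am
1 mol1.mol 1.25
2 mol3.mol 3.50
EOF
cat > "$workdir/a2" <<'EOF'
id filePath SUB_ID SOURCE ChBrg_REGID
1 mol1.mol S1 srcA R001
2 mol2.mol S2 srcB R002
3 mol3.mol S3 srcC R003
EOF

# a2 is read first as the lookup table, a1 second as the driving file.
looked_up=$(awk -v key="filePath" -v want="SUB_ID SOURCE ChBrg_REGID" '
    # Header row of each file: map column names to positions.
    FNR == 1 {
        split("", pos)                      # clear the name-to-position map
        for (i = 1; i <= NF; i++) pos[$i] = i
        k = pos[key]                        # key column position in this file
        if (NR == FNR) {
            n = split(want, names, " ")
            for (j = 1; j <= n; j++) src[j] = pos[names[j]]
        }
    }
    # Lookup file: store the wanted columns under the key value.
    NR == FNR {
        vals = ""
        for (j = 1; j <= n; j++) vals = vals " " $(src[j])
        found[$k] = vals
        next
    }
    # Driving file: rebuild each matching row with the stored values
    # inserted directly after the key column.
    ($k in found) {
        line = ""
        for (i = 1; i <= NF; i++) {
            line = line (i > 1 ? " " : "") $i
            if (i == k) line = line found[$k]
        }
        print line
    }
' "$workdir/a2" "$workdir/a1")

printf '%s\n' "$looked_up"
rm -rf "$workdir"
```

The sketch assumes every name in want actually exists in the a2 header; a misspelled or differently named column would silently pull in the wrong data, so it is worth checking the header rows first.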
Hello,
I have 40 data files where the first three columns are the same (in theory) and the 4th column is different. Here is an example of three files,
file 2: A_f0_r179_pred.txt
Id Group Name E0
1 V N(,)'1 0.2904
2 V N(,)'2 0.3180
3 V N(,)'3 0.3277
4 V N(,)'4 0.3675
5 V N(,)'5 0.3456
... (8 Replies)
Hi, I am trying to selectively merge two files based on keys reported in the 1st column.
File1:
#file1-header1
file1-header2
111 qwe rtz uio
198 asd fgh jkl
165 yxc
789 poi uzt rew
89 lkj
File2:
#file2-header2
file2-header2
165 ghz nko2 ... (2 Replies)
Hi all,
Say I have multiple files x1, x2, x3, x4, all with a common header (date, time, year, age).
How can I merge them into one single file "X" in shell scripting?
Thanks for your suggestions. (2 Replies)
Hi,
I have two files, A (2190 rows) and B (1100 rows). I want to merge the contents of the two files based on a common field; I also need the unmatched rows from file A.
file A:
ABC
XYZ
PQR
file B:
>LMN|chr1:11000-12456:
>ABC|chr15:176578-187678:
>PQR|chr3:14567-15866:
output... (3 Replies)
I have a need to merge two files on the value of an index column.
input file 1
id filePath MDL_NUMBER
1 MFCD00008104.mol MFCD00008104
2 MFCD00012849.mol MFCD00012849
3 MFCD00037597.mol MFCD00037597
4 MFCD00064558.mol MFCD00064558
5 MFCD00064559.mol MFCD00064559
input file 2
... (9 Replies)
Hi,
I have two files that I would like to merge and think that there should be a solution using awk. The files look something like this:
file 1
IDX1 IDY1
IDX2 IDY2
IDX3 IDY3
file 2
IDY1 dataA data1
IDY2 dataB data2
IDY3 dataC data3
Desired output
IDX1 IDY1 dataA data1
IDX2 ... (5 Replies)
I have 100 data files labelled 250.1.txt through 250.100.txt. The second columns of the data files partially match (there is about 90% overlap). Each data file has 4 columns.
I want to merge all these text files by the matching values in the second column. In the output, the first column should... (1 Reply)
I'm running on freebsd -- with a default shell of csh.
I have two files named A and B. Each line of each file contains a file name. How can I write a script that removes all the file names in file B from A.
I tried to use perl to create a huge regular expression with "|" separating the file... (2 Replies)
Hi,
I am facing a problem merging two files using awk.
The problem is as stated below.
file1:
A|B|C|D|E|F|G|H|I|1
M|N|O|P|Q|R|S|T|U|2
AA|BB|CC|DD|EE|FF|GG|HH|II|1
....
....
....
file2 :
1|Mn|op|qr (2 Replies)