I'm trying to use the "join" command on more than one field. Since join only takes a single key field per file, I want to preprocess my input files and concatenate the joining fields into one field (separated by "|"). I wrote two awk scripts to do and undo this (see below). However, I'm new to awk and I'm certain it could be done much more efficiently.
I found various topics around the question but often the syntax proposed is a bit of a mystery to me. For instance someone posted this:
What does the trailing '1' mean? Why are there two separate {} blocks, and what distinguishes them? Finally, where can I find documentation on this kind of question? (Googling "awk trailing digit" didn't help me much!)
Here are my scripts. I don't care much about syntax shortcuts; I only care about speed of execution!
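For reference, here is a minimal sketch of how the trailing '1' is usually understood. It is not the snippet quoted above, just a made-up one-liner with made-up input. An awk program is a list of `pattern { action }` rules that are tried in order for every input line. A rule with no pattern runs on every line, and a rule with no action prints the line. A bare `1` is a pattern that is always true with no action, so it amounts to `{ print }`.

```shell
# Rule 1: no pattern, so the action runs on every line and rewrites field 2.
# Rule 2: the pattern "1" is always true and has no action, so awk applies
#         the default action, which is to print the (modified) line.
printf 'a b c\n' | awk '{ $2 = "X" } 1'

# The same program with the second rule spelled out explicitly.
printf 'a b c\n' | awk '{ $2 = "X" } { print }'
```

Both commands should print `a X c`. The "Patterns and Actions" chapter of the gawk manual covers this.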
First, thanks for responding!
I'm not sure what you mean by "how" I want to get this, but I'll give you a more thorough example:
I have this file for instance (TSV):
And say I want to join fields 2, 3 and 6 with 3 columns of another file. Because join uses only one field, I want to put fields 2, 3 and 6 together separated only by pipes (as opposed to my other fields, which are separated by tabs). The concatene.awk script will then give me the following:
To do so in the current script, I pass "2,3,6" as a parameter and, for each line, create two arrays like:
(example for the first line only)
JFS[0] = b, JFS[1] = c, JFS[2] = f
RJFS[2] = b, RJFS[3] = c, RJFS[6] = f
From there I rebuild my line by first going through JFS with a pipe separator, then adding the other fields with a tab separator by going through the NF fields and skipping the ones for which RJFS[field] exists.
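The steps above could be sketched roughly like this. It is not the actual concatene.awk, only an illustration: the function name `concat_keys`, the array names `jfs` and `iskey`, and the six-field sample line are all assumptions. The one deliberate difference is that the two lookup arrays are built once in BEGIN rather than once per line, since the key list never changes.

```shell
# concat_keys KEYS : read TSV on stdin, write the key fields joined by "|"
# as the first column, followed by the remaining fields in original order.
# (Hypothetical helper; KEYS is a comma-separated list such as 2,3,6.)
concat_keys() {
    awk -F'\t' -v OFS='\t' -v keys="$1" '
    BEGIN {
        n = split(keys, jfs, ",")          # jfs[1..n] = key field numbers, in order
        for (i = 1; i <= n; i++)
            iskey[jfs[i]] = 1              # reverse lookup: is field i a key field?
    }
    {
        out = $(jfs[1])                    # start with the first key field
        for (i = 2; i <= n; i++)
            out = out "|" $(jfs[i])        # append the other key fields with pipes
        for (i = 1; i <= NF; i++)
            if (!(i in iskey))
                out = out OFS $i           # append non-key fields with tabs
        print out
    }'
}

# Made-up sample line with fields a..f; prints b|c|f, a, d, e (tab-separated).
printf 'a\tb\tc\td\te\tf\n' | concat_keys 2,3,6
```

I have not benchmarked this, so whether it is actually faster on large files is a guess. It also assumes the key fields themselves never contain a "|".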
Hope this makes more sense! I bet there is a way to do it in a much more optimized way though..!
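For the reverse direction (the "undo" script mentioned at the top), a similar sketch might look like the following. Again, the function name `split_keys` and the sample line are assumptions, and it only works if the same key list is passed as was used for concatenation and no field contains a "|".

```shell
# split_keys KEYS : read lines whose first column is "k1|k2|..." on stdin and
# put each key piece back at its original field position.
# (Hypothetical helper; KEYS must match the list used when concatenating.)
split_keys() {
    awk -F'\t' -v OFS='\t' -v keys="$1" '
    BEGIN {
        n = split(keys, jfs, ",")
        for (i = 1; i <= n; i++)
            pos[jfs[i]] = i                # original field number -> index in the key
    }
    {
        split($1, part, "|")               # part[1..n] = the individual key values
        total = NF - 1 + n                 # number of fields in the original line
        rest = 2                           # next non-key field to consume
        out = ""
        for (i = 1; i <= total; i++) {
            val = (i in pos) ? part[pos[i]] : $(rest++)
            out = (i == 1) ? val : out OFS val
        }
        print out
    }'
}

# Made-up sample line; prints a, b, c, d, e, f (tab-separated).
printf 'b|c|f\ta\td\te\n' | split_keys 2,3,6
```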