Problem: extract specific content from data and rename its header
Input file 1:
Input file 2:
Desired output:
I have a long list of entries in input file 1 and input file 2. Input file 1 is the raw data, while input file 2 gives the ranges of the input file 1 data that I want to extract to generate the output file. Columns 2 and 3 of input file 2 are the positions that I want to extract from the data in input file 1. In the output file, I will rename the headers to something like "pattern_*_0.0*".
It seems like an awk or perl script should be able to achieve this goal.
Thanks a lot for any advice.
Last edited by patrick87; 03-23-2010 at 06:31 AM..
Reason: further explaining of my question
Hi, patrick87:
While processing the first file (FNR==NR), if a line begins with ">", grab everything that follows it and store it in p, the pattern name. If a line does not begin with a ">", then it is data for the current pattern, p; append the line to a[p], that pattern's entry in array a. Repeat until done with the first file.
For the second file, we use the pattern name in the first field and the index values in the second and third fields to extract the required substring from a[$1], while incrementing a counter for each pattern name seen, in the i array, i[$1].
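For readers who want to follow the logic outside awk, here is a minimal Python sketch of the two passes described above. It is not the awk code from this post. It assumes file 2 is whitespace-separated as "name start end", that positions are 1-based and inclusive, and the function names and sample data are made up for illustration. How the counter is written into the renamed header is left to the caller.

```python
from collections import defaultdict

def read_patterns(lines):
    """First pass: collect the data lines under each '>' header."""
    data = defaultdict(str)
    name = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            name = line[1:]        # everything after '>' is the pattern name
        elif name is not None:
            data[name] += line     # append data to the current pattern
    return dict(data)

def extract_ranges(data, range_lines):
    """Second pass: return (name, count, substring) per range line."""
    seen = defaultdict(int)        # per-pattern counter, like i[$1]
    results = []
    for line in range_lines:
        fields = line.split()
        if len(fields) < 3:
            continue               # skip blank or malformed lines
        name, start, end = fields[0], int(fields[1]), int(fields[2])
        seen[name] += 1
        # Assumption: positions are 1-based and inclusive.
        results.append((name, seen[name], data.get(name, "")[start - 1:end]))
    return results

# Hypothetical sample data, not taken from the original post.
file1 = [">pattern_1", "ABCDEFGHIJ", "KLMNOPQRST", ">pattern_2", "abcdefghij"]
file2 = ["pattern_1 3 6", "pattern_1 9 12", "pattern_2 1 4"]
print(extract_ranges(read_patterns(file1), file2))
# [('pattern_1', 1, 'CDEF'), ('pattern_1', 2, 'IJKL'), ('pattern_2', 1, 'abcd')]
```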
Thanks alister,
I'm trying to apply your awk code to my case now.
Besides that, thanks a lot for the further explanation of your awk code too.
I really appreciate your help and advice.
Thanks again ^^
---------- Post updated at 04:19 AM ---------- Previous update was at 03:43 AM ----------
Hi Alister,
Your awk code worked perfectly in my case. Thanks a lot.
Can I ask what to do if my input file 2 changes like this:
How can I edit the awk code that you suggested to give the same output as above?
Do I need to add an "if" condition to the awk code for this?
Thanks again for your advice.
That tweak is incorrect, if I understand the modification to f2 correctly. If the second field is greater than the third, then instead of being treated as the beginning index of the substring, it should be considered the end index (and the interpretation of the third field should be swapped accordingly). The correct solution requires that the second argument to substr() be modified as well, since in the case of $2 > $3, it should be $3, not $2.
By the way, malcomeex999 and rdcwayx, thank you very much for your bit awards. It's appreciated.
Hi, patrick87:
One solution to handle both cases (even if they appear within the same file2):
It works identically to my earlier solution except that it tests the second and third fields in f2. If the first index is greater than the second, their values are swapped before the substr() call.
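The swap can be illustrated with a short Python sketch (again, not the awk code itself). It assumes 1-based, inclusive positions, and the function name `extract` is hypothetical:

```python
def extract(seq, first, second):
    """Return the substring between two 1-based inclusive positions, given in either order."""
    if first > second:
        # Swap so that 'first' is always the start index.
        first, second = second, first
    return seq[first - 1:second]

# Both orderings select the same characters.
print(extract("ABCDEFGHIJ", 3, 6))  # CDEF
print(extract("ABCDEFGHIJ", 6, 3))  # CDEF
```

Swapping before the slice keeps a single extraction path, so a file 2 that mixes both orderings needs no special handling.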