@Aia, please feel free to make any improvements/suggestions to any code posted by me. I am a scientist learning programming, so this is still new to me. I learn from each post and try to improve each time. Thank you very much. I will try again tomorrow and post back.
Unfortunately, I can tell you what the code is doing, but I can only guess and infer your intentions, and that's the hard part. If you were to post a few representative lines of $input_file and an example of the expected result to be saved in $output_file, I am sure many people would be happy to help. And we, both, might learn something along the way.
---------- Post updated at 09:45 PM ---------- Previous update was at 08:14 PM ----------
This is an example of how you might be able to insert an extra member at the header line and later output with tabs
splice works as well for inserting an array in the middle of another array
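A minimal sketch of both ideas, since the original examples are not shown here: adding an extra member at the front of a header row, inserting a list in the middle of another with splice, and printing the result tab-separated. The helper name insert_fields and the column names are made up for illustration and are not taken from the posted script.

```perl
use strict;
use warnings;

# Return a copy of @$row with @extra inserted starting at index $pos.
# insert_fields is a hypothetical helper, not part of the thread's code.
sub insert_fields {
    my ($row, $pos, @extra) = @_;
    my @copy = @$row;
    splice @copy, $pos, 0, @extra;   # length 0: remove nothing, only insert
    return @copy;
}

my @header = qw(Chr Start End Ref Alt);                  # assumed column names
my @wider  = insert_fields(\@header, 0, 'Index');        # extra member at the front
my @mid    = insert_fields(\@header, 3, qw(Gene Exon));  # a list in the middle

print join("\t", @wider), "\n";
print join("\t", @mid), "\n";
```

Working on a copy keeps the original header intact, which helps if the same header is reused for more than one output.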
Last edited by Aia; 02-04-2016 at 01:32 AM..
Reason: Add example2.pl
The basic idea of my program is that it combines multiple steps/processes into one. A set of data is input that is, for lack of a better term, not useful, so I use a SOAP API to connect to a Python tool that verifies that the input data is found in a database and converts the data into something that is useful. This data is a set of coordinates; 15 25653864 25653864 G C is an example. Those coordinates are saved as a text file that is piped into a Perl program (not the one posted) to apply meaning to the coordinates. That data is then reformatted using the Perl posted and the process is complete.
The science behind this makes a lot of sense to me, but I am learning more and more about the programming aspect. Science, especially in my field of molecular genetics and genomics, is advancing very quickly and using more and more programming. Thank you for all your help.
edit (perlupdate).
Output
All the other $out fields populate from the split. Not sure why the "VUS" isn't populating, but that string is hardcoded and does not come from the split.
Last edited by cmccabe; 02-04-2016 at 11:45 AM..
Reason: added details and perl edit
I wish you had posted at least four lines of the input file and how you expect those four lines to be reformatted as output. Due to the lack of that information, the situation has not changed much.
If you are unable to do that, please take a look at these parts:
What do you think is happening to @colsleft and @colsright at each iteration of the while loop?
Since those are declared outside the while loop, they never get refreshed for each line of the input file. Therefore, they will contain pieces of data from previous iterations if they are not rewritten in the loop. I do not know if that's what you want.
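To illustrate the point about scope, here is a small sketch in which the padding arrays are declared inside the loop, so each input line starts with fresh Null values. The sub name build_rows, the padding widths and the sample lines are assumptions for the example only.

```perl
use strict;
use warnings;

# Build padded, tab-joined rows from a list of tab-separated lines.
# build_rows and the padding widths (2 left, 1 right) are illustrative only.
sub build_rows {
    my @lines = @_;
    my @rows;
    my $n = 0;
    for my $line (@lines) {
        my @vals      = split /\t/, $line;
        my @colsleft  = ('Null') x 2;   # re-created for every line
        my @colsright = ('Null') x 1;   # re-created for every line
        $colsleft[0]  = 'flag' if $vals[0] eq 'a';   # affects this row only
        push @rows, join("\t", ++$n, @colsleft, @vals, @colsright);
    }
    return @rows;
}

print "$_\n" for build_rows("a\tb\tc", "d\te\tf");
```

If @colsleft were declared once before the loop instead, the 'flag' written while processing the first line would still be there when the second line is processed.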
$_ will always hold something inside the loop; when it does not, the loop stops. So I do not know what your intention is, even though you commented it with skip if AB empty.
Remove the \; the : is not special in any way there.
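A quick way to convince yourself, assuming the backslash sits in a regular expression such as a split pattern: both forms below produce the same fields. The sample string is the $vals[9] value quoted later in this thread.

```perl
use strict;
use warnings;

my $field = 'PHOX2B:NM_003924.3:exon3:c.C639G:p.G213G';

my @plain   = split /:/,  $field;   # colon used as-is
my @escaped = split /\:/, $field;   # escaped colon, same meaning

print scalar(@plain), " fields without the backslash\n";
print scalar(@escaped), " fields with the backslash\n";
print "identical results\n" if "@plain" eq "@escaped";
```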
Please, explain this part. What do you think that this part is doing for you?
It provides the additional columns to the left and right. The file that is used has only 24 columns (represented by $_); @colsleft puts 18 columns with values of Null before them and @colsright puts 8 additional columns with values of Null after them. The total is now 50, and that is what is expected. So basically my @out=($.,@colsleft,$_,@colsright); is a unique entry, followed by 18 columns, then 24, then 8.
The split of column 10 [9] provides the arrays that are used in $out. grep {$transcript eq $_} keys %nms or next; is a special case that matches $transcript against the nms defined at the beginning. This actually all seems to work.
The loop doesn't get refreshed because if there are multiple entries then they are on separate rows.
I can not seem to figure out why VUS doesn't populate in $out[45] as that is the only part that doesn't seem to work.
I hope this helps a bit and thank you for all your help .
Earlier, in post #6, I mentioned the following:
Quote:
Originally Posted by Aia
$out[45] = "VUS"; pretty much assigns it, and there is no way you have "NULL" after that unless it gets changed or your understanding of what you are looking at is not correct.
I still believe that what you expect to see is misleading you of what it really is.
At this point, "VUS" is still the 46th element of array out.
At that point, array out has been flattened into a string separated by tabs.
You have lost your ability to know what element it would be in a string that now contains new tab separated elements.
In fact, since $out[45] is the last part, it would, actually, become $out[64] if you were to split again by tab.
You have introduced new parts that have tabs as well.
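The effect can be reproduced with a three-element array, as a sketch rather than the posted script: one element carries tabs of its own, so after joining with tabs and splitting again the last value sits at a higher index. The helper name index_after_rejoin and the sample values are made up for the illustration.

```perl
use strict;
use warnings;

# Index of the first field equal to $want after the array has been joined
# with tabs and split on tabs again. Hypothetical helper for illustration.
sub index_after_rejoin {
    my ($fields, $want) = @_;
    my @again = split /\t/, join("\t", @$fields), -1;   # -1 keeps trailing empty fields
    for my $i (0 .. $#again) {
        return $i if $again[$i] eq $want;
    }
    return -1;
}

# The middle element contains two tabs of its own.
my @out = ('1', "geneA\texon3\tc.C639G", 'VUS');

print "index in the array: ", $#out, "\n";
print "index after join and split: ", index_after_rejoin(\@out, 'VUS'), "\n";
```

In the array "VUS" is element 2, but in the flattened line it is field 4 (counting from 0), because the middle element became three fields.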
You want to test it?
Here's your original code, with some prints to show some details. Run it with just the two lines of input you posted. Run as:
Output:
Please, scroll all the way to the right to see VUS highlighted.
Again, knowing what the code does... it is not a problem; knowing what you expect is the hard part.
If you were to post an example of how those 2 lines are supposed to look after the process, that might help.
Last edited by Aia; 02-05-2016 at 10:17 PM..
Reason: Run only with the first two lines of input
desired output: column [45] or classification is VUS
So if I am understanding (please correct me if I am wrong)
introduced 16 additional tabs, presumably from the @out split [9] (5 additional fields) and the @nms (3 additional fields). Thank you for all your help.
Yes, you have array elements that contain tabs themselves, and when you convert that array to a string, splitting the string on tabs again yields more fields than the array had elements.
Here's the input you posted in #11
Here's the desired output posted in #13
I have to assume you did not correlate them since it is not possible to produce this output from that input, unless other information is missing.
Please, explain where those fields highlighted in red come from, since they are not found anywhere in your posted input or code.
If this were a case of wrong output against input, please, provide a corrected set of input and output, to remove ambiguity.
Also, please explain the extra tabs in your output file; every ^I identifies a tab in the line.
Or if you substitute tab for bar
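One possible way to make the tabs visible and count the fields at the same time is sketched below. The sub name show_tabs and the sample line are assumptions; only the idea of swapping tabs for bars comes from the post.

```perl
use strict;
use warnings;

# Return the line with every tab shown as a bar, plus the number of
# tab-separated fields. show_tabs is a hypothetical helper.
sub show_tabs {
    my ($line) = @_;
    my $tabs = ($line =~ tr/\t//);        # counts tabs without changing $line
    (my $visible = $line) =~ tr/\t/|/;    # work on a copy, swap tabs for bars
    return ($visible, $tabs + 1);
}

my ($visible, $fields) = show_tabs("15\t25653864\t\tG\tC");
print "$visible\n";
print "$fields fields\n";
```

Two bars in a row mark an empty field, which is how the tab-only fields in a line stand out.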
As it stands, your example output has 50 fields. Could you, please, confirm you want an output of 50 fields, all the time?
Here's a breakout of it:
Please, explain what would produce fields number 22, 23 and 27 and 26, 29 to 42.
Would you like those tab-empty fields to be Null?
Another question, concerning your code: $vals[9] contains PHOX2B:NM_003924.3:exon3:c.C639G:p.G213G according to your input. It can not be split by commas.
Can you explain that? Are there any lines that would have something like:
And if so how would you like to handle them?
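To make the question concrete, here is a sketch of how such a field could be handled if comma-separated entries do exist: split on commas first, then on colons. With the value from the posted input there is no comma, so the first split simply returns one entry. The sub name parse_annotation, the hash keys, and the second entry in the multi-entry string are invented for the illustration.

```perl
use strict;
use warnings;

# Split an annotation field into one record per comma-separated entry.
# parse_annotation and the hash key names are hypothetical.
sub parse_annotation {
    my ($field) = @_;
    my @records;
    for my $entry (split /,/, $field) {
        my ($gene, $transcript, $exon, $coding, $protein) = split /:/, $entry;
        push @records, {
            gene       => $gene,
            transcript => $transcript,
            exon       => $exon,
            coding     => $coding,
            protein    => $protein,
        };
    }
    return @records;
}

# Value of $vals[9] as quoted from the posted input: one entry, no commas.
my @one = parse_annotation('PHOX2B:NM_003924.3:exon3:c.C639G:p.G213G');

# Invented two-entry value; the second entry is NOT from the thread.
my @two = parse_annotation(
    'PHOX2B:NM_003924.3:exon3:c.C639G:p.G213G,GENE2:NM_000000.0:exon1:c.A1T:p.M1L'
);

print scalar(@one), " record: $one[0]{transcript}\n";
print scalar(@two), " records: ", join(", ", map { $_->{transcript} } @two), "\n";
```

Whether every entry should become its own output row, or only the one whose transcript matches %nms, is exactly the part that needs confirming.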
Thank you.
Last edited by Aia; 02-06-2016 at 07:48 PM..
Reason: Add more questions.