Post 302993186 by cmccabe on Tuesday 7th of March 2017 01:13:11 PM
Perl to run different parser based on digit

The Perl parser below works as expected, assuming the number at the end of the NC_ (just before the .) is a single digit.

Code:
perl -ne 'next if $. == 1;   # skip the header line
    if (/.*del([A-Z]+)ins([A-Z]+).*NC_0{5}([0-9]+).*g\.([0-9]+)_([0-9]+)/) {   # indel
        print join("\t", $3, $4, $5, $1, $2), "\n";
    }' out_position.txt > out1.txt



Contents of out_position.txt:
Code:
Input Variant	Errors	Chromosomal Variant	Coding Variant(s)
NM_003924.3:c.*18_*19delGCinsAA		NC_000004.11:g.41747805_41747806delinsTT	LRG_513t1:c.*18_*19delinsAA	NM_003924.3:c.*18_*19delinsA

Contents of out1.txt (the output is correct):

Code:
4	41747805	41747806	GC	AA

However, I cannot seem to adjust it to account for the fact that the number at the end of the NC_ (just before the .) may not always be 1 digit as in the case above; it could be 2 digits, as in the case below. In this case I would need to match 4 zeros instead of 5. So my question is: how do I make the NC_0{5}([0-9]+) condition in the perl command adjust based on the NC_ number being 1 or 2 digits? Thank you.

Code:
Input Variant	Errors	Chromosomal Variant	Coding Variant(s)
NM_003924.3:c.*18_*19delGCinsAA		NC_000014.11:g.41747805_41747806delinsTT	LRG_513t1:c.*18_*19delinsAA	NM_003924.3:c.*18_*19delinsA

So in this case the desired output would be:

Code:
14     41747805     41747806     GC     AA

It is also possible for the NC_ to end in a letter rather than a digit, but in that case it is always one letter, e.g. NC_00000X.11:g.41747805_41747806delinsTT. Would that mean changing this:

Code:
.*NC_0{5}([0-9]+).

to this:

Code:
.*NC_0{5}([0-9]+[A-Z]+).


Last edited by cmccabe; 03-07-2017 at 02:14 PM.. Reason: fixed format
 
