Perl to run different parser based on digit
Post 302993225 by durden_tyler on Tuesday 7th of March 2017 07:47:02 PM
Quote:
Originally Posted by cmccabe
...assuming the last digit in the NC_ before the . is a single digit.
...
However, .... the last digit in NC_ before the . in bold, may not always be 1 digit as in the case above, it could be 2 digits, as in the case below. In this case I would need to parse out 4 zeros, instead of 5.
...
...
It is also possible for the NC_ to be a letter, not a digit, but in that case it is always one letter, ...
So the string is one of the following:

(1) NC_ + five zeros + 1 digit + "." character => you want that one digit before "." character
(2) NC_ + four zeros + 2 digits + "." character => you want those two digits before "." character
(3) NC_ + five zeros + 1 character + "." character => you want that one character before "." character

One way to look at it is:
NC_ + a sequence of one or more leading zeros + a sequence of characters that starts with a non-zero character + "." character

And you want to capture that sequence of characters between the leading zeros and the "." character.

Here's a sample regex that does that:

Code:
$ cat input.txt
NC_000004.11
NC_000014.11
NC_00000X.11
$ perl -lne 's/NC_0+(.*?)\..*/$1/; print' input.txt
4
14
X

