Perl to parse a variety of formats
Post 302996746 by Aia on Tuesday 2nd of May 2017 12:14:33 AM
Although Perl lets you run a lot of code straight from the command line, I would recommend keeping one-liners for testing concepts and for quick throw-away jobs.

For example, say I want to check that I can split each line into fields and extract information from the second field. I might do the following:

Code:
cat example.txt
Input Variant    Errors    Chromosomal Variant    Coding Variant(s)
NM_004992.3:c.274G>T        NC_000023.10:g.153297761C>A    XM_005274683.1:c.-6G>T    XM_005274682.1:c.-6G>T    XM_005274681.1:c.274G>T    LRG_764t2:c.274G>T    NM_004992.3:c.274G>T    LRG_764t1:c.310G>T    NM_001110792.1:c.310G>
NM_004992.3:c.274G>T        NC_0000X.10:g.153297761C>A    XM_005274683.1:c.-6G>T    XM_005274682.1:c.-6G>T    XM_005274681.1:c.274G>T    LRG_764t2:c.274G>T    NM_004992.3:c.274G>T    LRG_764t1:c.310G>T    NM_001110792.1:c.310G>T

Code:
perl -nale '@f = $F[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/; print join "\t", @f[0,1],$f[1],@f[2,3]' example.txt

23      153297761       153297761       C       A
X       153297761       153297761       C       A
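
To make the capture groups of that regex easier to follow, the same pattern can be spelled out with named captures and the /x modifier. This is only a sketch: the sub name parse_genomic and the capture names chr, pos, ref and alt are made up for illustration and are not part of the original one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (chromosome, position, ref base, alt base)
# for a value such as NC_000023.10:g.153297761C>A, or an empty list
# when the value does not look like a genomic substitution.
sub parse_genomic {
    my ($field) = @_;
    return unless defined $field;
    return unless $field =~ /
        NC_0+ (?<chr>\w+)          # accession, leading zeros dropped
        \.\d+                      # accession version, ignored
        :g\. (?<pos>\d+)           # genomic position
        (?<ref>\w) > (?<alt>\w)    # reference base and alternate base
    /x;
    return @+{qw(chr pos ref alt)};
}

for my $field ('NC_000023.10:g.153297761C>A', 'NC_0000X.10:g.153297761C>A', 'Variant') {
    my @v = parse_genomic($field);
    if (@v) {
        # Position is printed twice, as start and end, like the one-liner does.
        print join("\t", @v[0,1], $v[1], @v[2,3]), "\n";
    }
    else {
        print "no match: $field\n";
    }
}
```

The last value in the loop stands in for the header line, whose second field does not match; checking the returned list before printing keeps such lines out of the output.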

Now that I know my regex extracts what I want, let's put it into a script worth keeping:
Code:
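#!/usr/bin/perl
# extract.pl
use strict;
use warnings;

{
    # Print fields separated by tabs and end every record with a newline.
    local $, = "\t";
    local $\ = "\n";
    while (<>) {
        my @fields = split /\t+/;
        next unless defined $fields[1];
        # Capture chromosome, position, reference base and alternate base.
        my @u = $fields[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/;
        print @u[0,1], $u[1], @u[2,3] if @u;
    }
}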


Code:
perl extract.pl example.txt
23      153297761       153297761       C       A
X       153297761       153297761       C       A
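
The script relies on Perl's output separator variables. As a minimal sketch of how $, (placed between the items of a print list) and $\ (appended after the whole list) behave when localized inside a bare block, the following writes to an in-memory filehandle so the result can be inspected; the helper name tab_line is an assumption made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: formats its arguments the way print does when
# $, and $\ are set, and returns the resulting line as a string.
sub tab_line {
    my @values = @_;
    my $out = '';
    open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
    {
        local $, = "\t";   # inserted between the items of the print list
        local $\ = "\n";   # appended once, after the whole print list
        print {$fh} @values;
    }
    close $fh;
    return $out;
}

print tab_line(qw(23 153297761 153297761 C A));

# Outside the bare block both variables are undefined again,
# so these two items are printed back to back.
print 'a', 'b';
print "\n";
```

One reason to prefer $\ over appending "\n" by hand: concatenating a string onto an array slice with . evaluates the slice in scalar context, which yields only its last element.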


Last edited by Aia; 05-02-2017 at 01:47 AM.. Reason: remove scalar