Top Forums Shell Programming and Scripting Perl to parse a variety of formats Post 302996746 by Aia on Tuesday 2nd of May 2017 12:14:33 AM
Old 05-02-2017
While Perl lets you run a lot of code straight from the command line, I would recommend keeping one-liners for testing concepts and for quick throw-away tasks.

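For example, say I want to check that I can split each line into fields and extract information from the second one. (The Errors column is empty in this sample, so when splitting on runs of whitespace or tabs the chromosomal variant ends up as the second field.) I might do the following: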

Code:
cat example.txt
Input Variant    Errors    Chromosomal Variant    Coding Variant(s)
NM_004992.3:c.274G>T        NC_000023.10:g.153297761C>A    XM_005274683.1:c.-6G>T    XM_005274682.1:c.-6G>T    XM_005274681.1:c.274G>T    LRG_764t2:c.274G>T    NM_004992.3:c.274G>T    LRG_764t1:c.310G>T    NM_001110792.1:c.310G>
NM_004992.3:c.274G>T        NC_0000X.10:g.153297761C>A    XM_005274683.1:c.-6G>T    XM_005274682.1:c.-6G>T    XM_005274681.1:c.274G>T    LRG_764t2:c.274G>T    NM_004992.3:c.274G>T    LRG_764t1:c.310G>T    NM_001110792.1:c.310G>T

Code:
perl -nale '@f = $F[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/; print join "\t", @f[0,1],$f[1],@f[2,3]' example.txt

23      153297761       153297761       C       A
X       153297761       153297761       C       A
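
As a rough sketch of what the switches in that one-liner do: -n wraps the code in a read loop, -l chomps each input line and adds a newline on print, and -a splits the line on whitespace into @F. The long-hand version below is only an approximation under those assumptions. The sub name process_line and the shortened, embedded sample lines are made up for illustration, and unlike the one-liner it skips lines that do not match instead of printing an empty row.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical long-hand form of: perl -nale '...' example.txt
sub process_line {
    my ($line) = @_;
    chomp $line;                       # -l: strip the trailing newline
    my @F = split ' ', $line;          # -a: autosplit on runs of whitespace
    return unless defined $F[1];
    my @f = $F[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/;
    return unless @f;                  # assumption: skip non-matching lines
    return join "\t", @f[0,1], $f[1], @f[2,3];
}

# Shortened copies of the lines in example.txt (only the first few fields).
my @sample = (
    "Input Variant\tErrors\tChromosomal Variant\tCoding Variant(s)\n",
    "NM_004992.3:c.274G>T\t\tNC_000023.10:g.153297761C>A\tXM_005274683.1:c.-6G>T\n",
    "NM_004992.3:c.274G>T\t\tNC_0000X.10:g.153297761C>A\tXM_005274683.1:c.-6G>T\n",
);

# -n: loop over the input; here the input is the embedded sample.
for my $line (@sample) {
    my $out = process_line($line);
    print "$out\n" if defined $out;    # -l: newline added on output
}
```

To run it against a real file, the for loop could be swapped for while (my $line = <>) { ... }.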

Now that I know my regex extracts what I want, let's turn it into a script worth keeping:
Code:
#!/usr/bin/perl
# extract.pl
use strict;
use warnings;

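{
    local $, = "\t";    # output field separator
    local $\ = "\n";    # output record separator, so print ends each row
    while (<>) {
        # Runs of tabs count as one separator, so the empty column is skipped.
        my @fields = split /\t+/;
        next unless defined $fields[1];
        my @u = $fields[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/;
        # Do not append "\n" with the dot operator here: that would put the
        # slice in scalar context and drop one of the captured fields.
        print @u[0,1], $u[1], @u[2,3] if @u;
    }
}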


Code:
perl extract.pl example.txt
23      153297761       153297761       C       A
X       153297761       153297761       C       A
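
If the script grows, one option is to move the regex into a small sub with named captures (Perl 5.10 or later), so each piece has a label instead of an index. This is only a sketch: parse_variant and the capture names chr, pos, ref and alt are my own labels, based on reading the field as a genomic substitution of the form position, base, ">", base. They are not something the input file states.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse one variant such as NC_000023.10:g.153297761C>A
# and return a hash reference, or nothing if the string does not match.
sub parse_variant {
    my ($variant) = @_;
    return unless $variant =~
        /NC_0+(?<chr>\w+)\.\d+:g\.(?<pos>\d+)(?<ref>\w)>(?<alt>\w)/;
    # Copy the values out of %+ right away; the next successful match
    # overwrites them.
    return { chr => $+{chr}, pos => $+{pos}, ref => $+{ref}, alt => $+{alt} };
}

for my $variant ('NC_000023.10:g.153297761C>A', 'NC_0000X.10:g.153297761C>A') {
    my $v = parse_variant($variant) or next;
    # Same column order as extract.pl: the position is printed twice.
    print join("\t", @{$v}{qw(chr pos pos ref alt)}), "\n";
}
```

The two hard-coded strings are the second fields from example.txt, so the output should match the rows above.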


Last edited by Aia; 05-02-2017 at 01:47 AM.. Reason: remove scalar