Perl to parse a variety of formats: Post 302996746 by Aia on Tuesday 2nd of May 2017 12:14:33 AM
While Perl lets you execute a lot of code at the command line, I would recommend keeping one-liners for testing concepts and for quick throw-away jobs.

For example, say I want to test that I can split each line into fields and extract information from the second field. I might do the following:

Code:
cat example.txt
Input Variant    Errors    Chromosomal Variant    Coding Variant(s)
NM_004992.3:c.274G>T        NC_000023.10:g.153297761C>A    XM_005274683.1:c.-6G>T    XM_005274682.1:c.-6G>T    XM_005274681.1:c.274G>T    LRG_764t2:c.274G>T    NM_004992.3:c.274G>T    LRG_764t1:c.310G>T    NM_001110792.1:c.310G>
NM_004992.3:c.274G>T        NC_0000X.10:g.153297761C>A    XM_005274683.1:c.-6G>T    XM_005274682.1:c.-6G>T    XM_005274681.1:c.274G>T    LRG_764t2:c.274G>T    NM_004992.3:c.274G>T    LRG_764t1:c.310G>T    NM_001110792.1:c.310G>T

Code:
perl -nale '@f = $F[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/; print join "\t", @f[0,1],$f[1],@f[2,3]' example.txt

23      153297761       153297761       C       A
X       153297761       153297761       C       A
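
In case the switches are unfamiliar, here is a rough hand-written approximation of what `-n`, `-a` and `-l` do in that one-liner. It is only a sketch: the two sample rows are inlined (an assumption, so that it runs without example.txt), and unlike the one-liner it skips lines that do not match instead of printing an empty row for the header. Note that `-a` splits on any whitespace, not only tabs; `-F'\t'` would switch to a tab delimiter, in which case an empty Errors column, if it is a real tab-delimited field, would shift the variant to `$F[2]`.

```perl
#!/usr/bin/perl
# Hand-written approximation of: perl -nale '...' example.txt
use strict;
use warnings;

# Two sample rows inlined so the sketch runs without example.txt (assumption).
my $sample = join "\n",
    "NM_004992.3:c.274G>T\t\tNC_000023.10:g.153297761C>A\tXM_005274683.1:c.-6G>T",
    "NM_004992.3:c.274G>T\t\tNC_0000X.10:g.153297761C>A\tXM_005274683.1:c.-6G>T",
    "";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";

my @rows;                                   # collected output lines
while (my $line = <$in>) {                  # -n: loop over every input line
    chomp $line;                            # -l: strip the trailing newline
    my @F = split ' ', $line;               # -a: autosplit on whitespace into @F
    my @f = $F[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/;
    next unless @f;                         # differs from the one-liner: skip non-matching lines
    push @rows, join "\t", @f[0,1], $f[1], @f[2,3];
}
print "$_\n" for @rows;                     # -l: print adds the newline back
```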

Now that I know my regex is extracting what I want, let's put it in a script worth keeping:
Code:
#!/usr/bin/perl
# extract.pl
use strict;
use warnings;

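{
    local $, = "\t";    # print separates list items with a tab
    local $\ = "\n";    # print ends each record with a newline
    while (<>) {
        my @fields = split /\t+/;
        next unless defined $fields[1];    # skip blank or single-field lines
        my @u = $fields[1] =~ /NC_0+(\w+)\.\d+:g\.(\d+)(\w)>(\w)/;
        # Appending . "\n" to @u[2,3] would force scalar context and keep
        # only the last element, so the newline comes from $\ instead.
        print @u[0,1], $u[1], @u[2,3] if @u;
    }
}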


Code:
perl extract.pl example.txt
23      153297761       153297761       C       A
X       153297761       153297761       C       A
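
If the script is going to grow, one option is to move the regex into a subroutine with named captures so it can be checked on its own. This is only a sketch under a few assumptions: `parse_variant` is a hypothetical name I made up for illustration, named captures need Perl 5.10 or later, and the test strings are the two variants from example.txt plus one value that should not match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_variant() is a hypothetical helper name, not part of extract.pl.
# Returns (chromosome, start, end, ref, alt), or an empty list on no match.
sub parse_variant {
    my ($hgvs) = @_;
    return unless defined $hgvs
        && $hgvs =~ /NC_0+(?<chr>\w+)\.\d+:g\.(?<pos>\d+)(?<ref>\w)>(?<alt>\w)/;
    # A single-base substitution starts and ends at the same position.
    return ($+{chr}, $+{pos}, $+{pos}, $+{ref}, $+{alt});
}

for my $hgvs ('NC_000023.10:g.153297761C>A', 'NC_0000X.10:g.153297761C>A', 'Variant') {
    my @u = parse_variant($hgvs);
    my $line = @u ? join("\t", @u) : "no match: $hgvs";
    print "$line\n";
}
```

The loop in extract.pl could then call `parse_variant($fields[1])` and print only when the returned list is non-empty.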


Last edited by Aia; 05-02-2017 at 01:47 AM.. Reason: remove scalar