awk or perl to parse file


 
# 8  
Old 03-11-2015
You cannot simply join the lines.
Try it like this instead:
Code:
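awk '
FNR>1 {                                  # skip the header line
   sub(/\r$/, "")                        # drop a trailing carriage return (CRLF input)
   for(i=1;i<=NF;i++)
     if ($i ~ /^NC_0000/) {              # only chromosomal (NC_) variants
       # e.g. NC_000013.10:g.20763477C>G -> NC 000013 10 g 20763477C G
       n=split($i,a, "[.:>_]")
       # chromosome, start, end, reference base, alternate base
       print a[2]+0, a[5]+0, a[5]+0, substr(a[5],length(a[5])), a[n]
     }
}' OFS='\t' GJB-2.txt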
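To see why the indices 2, 5 and n are the ones that matter, it may help to look at what split() produces for a single token. This is only a sketch: the variable names token and pieces are made up for illustration, and the token is one sample value from the "Chromosomal Variant" column.

```shell
# Hypothetical sample token from the "Chromosomal Variant" column.
token='NC_000013.10:g.20763477C>G'

# Split on any of . : > _ and list each piece with its index.
pieces=$(printf '%s\n' "$token" | awk '{
  n = split($0, a, "[.:>_]")
  for (i = 1; i <= n; i++) printf "a[%d]=%s\n", i, a[i]
}')
printf '%s\n' "$pieces"
```

The fifth piece still carries the reference base (20763477C), which is why the one-liner adds 0 to get the number and uses substr() to pick off the last character.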

This User Gave Thanks to vgersh99 For This Post:
# 9  
Old 03-11-2015
Thank you :)
# 10  
Old 03-13-2015
Quote:
Originally Posted by cmccabe
...
Code:
 perl -lane 'map{$_="NA" if $_<100}@F[5..$#F] if $.>1; print join "\t", "@F"' ${id}.txt > ${id}_parse.txt

${id} = text file attached
...
I couldn't understand the purpose of the map operator. However, the join function accepts two parameters: the separator string and a list (array). "@F" is not an array; it is the string representation of the array @F, in which the elements are separated by a single space (the default list separator). So join receives a one-element list and has nothing to join. Examples:
Code:
$
$ # @x is the array in the one-liner below
$ perl -le '@x = qw(AA BB CC DD); print @x; print join "|", @x'
AABBCCDD
AA|BB|CC|DD

$
$ # "@x" is the string representation of the array @x in the one-liner below
$ perl -le '@x = qw(AA BB CC DD); print "@x"; print join "|", "@x"'
AA BB CC DD
AA BB CC DD

$

I added some dummy data to your file, and it now looks like this (^I is the tab character and ^M$ marks the Windows end-of-line, i.e. a carriage return before the newline):

Code:
$
$ cat -nvET GJB-2.txt
     1  Input Variant^IErrors^IChromosomal Variant^ICoding Variant(s)^M$
     2  NM_004004.5:c.244G>C^I^INC_000013.10:g.20763477C>G^INM_004004.5:c.244G>C^IXM_005266354.1:c.244G>C^IXM_005266355.1:c.244G>C^IXM_005266356.1:c.244G>C^M$
     3  NM_004004.5:c.244G>C^I^INC_00001.10:g.20763477C>G^INM_004004.5:c.244G>C^INC_00005.10:g.77888999G>C^IXM_005266355.1:c.244G>C^IXM_005266356.1:c.244G>C^M$

$
$
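Because every line ends in ^M, the last field on each line carries a trailing carriage return. If that gets in the way of later processing, one option is to strip it first. A minimal sketch with tr, using a made-up two-line sample rather than the real file:

```shell
# Hypothetical two-line sample with Windows (CRLF) line endings.
sample=$(printf 'col1\tcol2\r\nfoo\tbar\r\n')

# tr -d '\r' deletes every carriage return, leaving plain LF line endings.
clean=$(printf '%s\n' "$sample" | tr -d '\r')
printf '%s\n' "$clean"
```

On the real file the same idea would be tr -d '\r' < GJB-2.txt, piped into whichever parser is used.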

The following Perl one-liner uses a regular expression to parse each token in each line as per the parse rules and prints the desired information:

Code:
$
$ perl -ne 'next if $.==1; while(/\t*NC_0000(\d+)\.\S+g\.(\d+)([A-Z])>([A-Z])/g){printf("%d\t%d\t%d\t%s\t%s\n",$1,$2,$2,$3,$4)}' GJB-2.txt
13      20763477        20763477        C       G
1       20763477        20763477        C       G
5       77888999        77888999        G       C

$
$

This User Gave Thanks to durden_tyler For This Post:
# 11  
Old 03-16-2015
Thank you :)