Rewrite sed to perl or run sed in perl


 
# 8  
Old 04-09-2015
Well, I don't.

This exact program:

Code:
next if $. == 1;

if(/.*del([A-Z]+)ins([A-Z]+).*NC_0{4}([0-9]+).*g\.([0-9]+)_([0-9]+)/)
{
        print join("\t", $3, $4, $5, $1, $2), "\n";
}

running on this exact input:

Code:
Input Variant   Errors  Chromosomal Variant     Coding Variant(s)
NM_004004.5:c.592_600delGTGTCTGGAinsCAGTGTTCATGACATTC           NC_000013.10:g.20763121_20763129delinsGAATGTCATGAACACTG NM_004004.5:c.592_600delinsCAGTGTTCATGACATTC    XM_005266354.1:c.592_600delinsCAGTGTTCATGACATTC XM_005266355.1:c.592_600delinsCAGTGTTCATGACATTC XM_005266356.1:c.592_600delinsCAGTGTTCATGACATTC

Gives this output:

Code:
13      20763121        20763129        GTGTCTGGA       CAGTGTTCATGACATTC

...so you're either running very different data or very different code.
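If you want to check this without the data file, here is a minimal self-contained sketch. It wraps the same regex in a helper I have called `parse_indel` (a made-up name, not part of the one-liner) and feeds it the first four fields of the data line above, joined with tabs on the assumption that the real file is tab-separated. It should print the same row shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread):
# returns the five output fields for an indel line,
# or an empty list when the line does not match.
sub parse_indel {
    my ($line) = @_;
    if ($line =~ /.*del([A-Z]+)ins([A-Z]+).*NC_0{4}([0-9]+).*g\.([0-9]+)_([0-9]+)/) {
        return ($3, $4, $5, $1, $2);
    }
    return;
}

# First four fields of the sample data line; the empty string is the
# blank "Errors" column. Tab separation is assumed.
my $sample = join "\t",
    'NM_004004.5:c.592_600delGTGTCTGGAinsCAGTGTTCATGACATTC',
    '',
    'NC_000013.10:g.20763121_20763129delinsGAATGTCATGAACACTG',
    'NM_004004.5:c.592_600delinsCAGTGTTCATGACATTC';

my @fields = parse_indel($sample);
print join("\t", @fields), "\n" if @fields;
```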
# 9  
Old 04-09-2015
Let's say there are two inputs combined into one file (out_position2); the output with the while statement left in and with it removed is attached. Leaving the while statement in gives output that is close, but its third line, 13 20763121 20763121 C -, is not needed and I'm not sure why it is there. Thank you.

It looks like the third line is the parsed output of the first if statement. Is there a way to prevent this?

Last edited by cmccabe; 04-09-2015 at 03:04 PM.. Reason: added a comment
# 10  
Old 04-09-2015
Running the code I showed on out_position2.txt gives this exact output:

Code:
13      20763121        20763129        GTGTCTGGA       CAGTGTTCATGACATTC

...and nothing else.

So please show exactly what you are doing, word for word, letter for letter, keystroke for keystroke, to get the output you are getting.
# 11  
Old 04-09-2015
Two variants are entered by the user and written to one file (out_position.txt), and the parser below is then run on that file. Depending on the input line, one of the parse conditions is used: for the "del" on line 1 of the input file the # deletion branch is used, and for the "delins" on line 2 the # indel branch is used. So the parsed output should be two lines (it is three now). I am not sure how to fix it; it looks like the while may be the issue, but removing it doesn't give the desired output. Thank you.

Code:
perl -ne 'next if $. == 1;
    if (/.*del([A-Z]+)ins([A-Z]+).*NC_0{4}([0-9]+).*g\.([0-9]+)_([0-9]+)/) {    # indel
        print join("\t", $3, $4, $5, $1, $2), "\n";
    }
    while (/\t*NC_(\d+)\.\S+g\.(\d+)(\S+)/g) {                                            # conditional parse
        ($num1, $num2, $common) = ($1, $2, $3);
        $num3 = $num2;
        if    ($common =~ /^([A-Z])>([A-Z])$/)   { ($ch1, $ch2) = ($1, $2) }              # SNP
        elsif ($common =~ /^del([A-Z])$/)        { ($ch1, $ch2) = ($1, "-") }             # deletion
        elsif ($common =~ /^ins([A-Z])$/)        { ($ch1, $ch2) = ("-", $1) }             # insertion
        elsif ($common =~ /^_(\d+)del([A-Z]+)$/) { ($num3, $ch1, $ch2) = ($1, $2, "-") }  # multi deletion
        elsif ($common =~ /^_(\d+)ins([A-Z]+)$/) { ($num3, $ch1, $ch2) = ("-", $1, $2) }  # multi insertion
        printf ("%d\t%d\t%d\t%s\t%s\n", $num1, $num2, $num3, $ch1, $ch2);                 # output
        map {undef} ($num1, $num2, $num3, $common, $ch1, $ch2);
    }
' C:/Users/cmccabe/Desktop/annovar/out_position.txt > C:/Users/cmccabe/Desktop/annovar/out_parse.txt

# 12  
Old 04-09-2015
So, that 'while' loop should only happen when the data doesn't match the regex from sed?

Code:
next if $. == 1;

if (/.*del([A-Z]+)ins([A-Z]+).*NC_0{4}([0-9]+).*g\.([0-9]+)_([0-9]+)/) {    # indel
    print join("\t", $3, $4, $5, $1, $2), "\n";
}
else {
    # Only lines that are not indels reach the generic parse.
    while (/\t*NC_(\d+)\.\S+g\.(\d+)(\S+)/g) {                                            # conditional parse
        ($num1, $num2, $common) = ($1, $2, $3);
        $num3 = $num2;
        if    ($common =~ /^([A-Z])>([A-Z])$/)   { ($ch1, $ch2) = ($1, $2) }              # SNP
        elsif ($common =~ /^del([A-Z])$/)        { ($ch1, $ch2) = ($1, "-") }             # deletion
        elsif ($common =~ /^ins([A-Z])$/)        { ($ch1, $ch2) = ("-", $1) }             # insertion
        elsif ($common =~ /^_(\d+)del([A-Z]+)$/) { ($num3, $ch1, $ch2) = ($1, $2, "-") }  # multi deletion
        elsif ($common =~ /^_(\d+)ins([A-Z]+)$/) { ($num3, $ch1, $ch2) = ($1, "-", $2) }  # multi insertion: end position, no reference allele
        printf ("%d\t%d\t%d\t%s\t%s\n", $num1, $num2, $num3, $ch1, $ch2);                 # output
        ($num1, $num2, $num3, $common, $ch1, $ch2) = ();                                  # clear everything for the next pass
    }
}
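As far as I can tell, the stray third line came from the while loop also matching the delins line: none of its branches recognise "delins", and `map {undef} (...)` in the posted one-liner evaluates to a list of undefs without clearing the variables, so the previous line's $ch1 and $ch2 were printed again. Below is a rough sketch of the same if/else dispatch as a function with lexical variables, so nothing can leak between lines. The name `parse_variant`, the choice to skip unrecognised forms, and the deletion test line are my assumptions, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread).
# Returns a list of array refs [chrom, start, end, ref, alt] for one line.
sub parse_variant {
    my ($line) = @_;

    # indel: handled on its own, so the generic loop never sees it
    if ($line =~ /.*del([A-Z]+)ins([A-Z]+).*NC_0{4}([0-9]+).*g\.([0-9]+)_([0-9]+)/) {
        return [$3, $4, $5, $1, $2];
    }

    my @rows;
    while ($line =~ /\t*NC_(\d+)\.\S+g\.(\d+)(\S+)/g) {
        my ($chrom, $start, $rest) = ($1, $2, $3);
        my ($end, $ref, $alt) = ($start);    # fresh variables on every pass
        if    ($rest =~ /^([A-Z])>([A-Z])$/)   { ($ref, $alt) = ($1, $2) }              # SNP
        elsif ($rest =~ /^del([A-Z])$/)        { ($ref, $alt) = ($1, '-') }             # deletion
        elsif ($rest =~ /^ins([A-Z])$/)        { ($ref, $alt) = ('-', $1) }             # insertion
        elsif ($rest =~ /^_(\d+)del([A-Z]+)$/) { ($end, $ref, $alt) = ($1, $2, '-') }   # multi deletion
        elsif ($rest =~ /^_(\d+)ins([A-Z]+)$/) { ($end, $ref, $alt) = ($1, '-', $2) }   # multi insertion
        else  { next }                       # unrecognised form: skip instead of printing stale values
        push @rows, [$chrom + 0, $start, $end, $ref, $alt];    # + 0 drops the leading zeros
    }
    return @rows;
}

my @lines = (
    # made-up deletion line, for illustration only
    "NC_000013.10:g.20763121delC",
    # first and third fields of the delins line from post #8, tab-separated
    "NM_004004.5:c.592_600delGTGTCTGGAinsCAGTGTTCATGACATTC\t\t"
        . "NC_000013.10:g.20763121_20763129delinsGAATGTCATGAACACTG",
);
print join("\t", @$_), "\n" for map { parse_variant($_) } @lines;
```

Each input line should come back as exactly one row here, because the indel branch returns before the generic loop runs.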

This User Gave Thanks to Corona688 For This Post:
# 13  
Old 04-10-2015
That's it, works perfectly.