Perl to identify specific runs in input and print only lines identified


 
12-17-2016

In the perl one-liner below I am trying to identify runs of six consecutive a or A (6a or 6A) in the sequence under each header line starting with >. The code seems close, but it prints every > line whether or not its sequence contains 6a or 6A. Only the > lines whose sequence contains 6a or 6A need to be printed.

So, using the input file, only the >hg19_refGene_NM_001918_3 line would be printed (followed by the run that was found), because it is the only record with 6a or 6A in it. The other lines are just skipped (not printed). Thank you :)

input
Code:
>hg19_refGene_NM_001918_2 range=chr1:100700982-100701077 5'pad=10 3'pad=10 strand=- repeatMasking=none
gtctttgaagCTCTCCGTGGACAGGTTGTTCAGTTCAAGCTCTCAGACAT
TGGAGAAGGGATTAGAGAAGTAACTGTTAAAGAATGgtaagtgaat
>hg19_refGene_NM_001918_3 range=chr1:100696279-100696480 5'pad=10 3'pad=10 strand=- repeatMasking=none
tttcttttagGTATGTAAAAGAAGGAGATACAGTGTCTCAGTTTGATAGC
ATCTGTGAAGTTCAAAGTGATAAAGCTTCTGTTACCATCACTAGTCGTTA
TGATGGAGTCATTAAAAAACTCTATTATAATCTAGACGATATTGCCTATG
TGGGGAAGCCATTAGTAGACATAGAAACGGAAGCTTTAAAAGgtattgta
ag
>hg19_refGene_NM_001918_4 range=chr1:100684172-100684313 5'pad=10 3'pad=10 strand=- repeatMasking=none
ttgttaccagATTCAGAAGAAGATGTTGTTGAAACTCCTGCAGTGTCTCA
TGATGAACATACACACCAAGAGATAAAGGGCCGAAAAACACTGGCAACTC
CTGCAGTTCGCCGTCTGGCAATGGAAAACAATgtaagttctc
>hg19_refGene_NM_001918_5 range=chr1:100681529-100681765 5'pad=10 3'pad=10 strand=- repeatMasking=none
cattttttagATTAAGCTGAGTGAAGTTGTTGGCTCAGGAAAAGATGGCA
GAATACTTAAAGAAGATATCCTCAACTATTTGGAAAAGCAGACAGGAGCT
ATATTGCCTCCTTCACCCAAAGTTGAAATTATGCCACCTCCACCAAAGCC
AAAAGACATGACTGTTCCTATACTAGTATCAAAACCTCCGGTATTCACAG
GCAAAGACAAAACAGAACCCATAAAAGgtaatgataa

current output
Code:
>hg19_refGene_NM_001918_2 range=chr1:100700982-100701077 5'pad=10 3'pad=10 strand=- repeatMasking=none
>hg19_refGene_NM_001918_3 range=chr1:100696279-100696480 5'pad=10 3'pad=10 strand=- repeatMasking=none
AAAAAA
>hg19_refGene_NM_001918_4 range=chr1:100684172-100684313 5'pad=10 3'pad=10 strand=- repeatMasking=none
>hg19_refGene_NM_001918_5 range=chr1:100681529-100681765 5'pad=10 3'pad=10 strand=- repeatMasking=none

desired output
Code:
>hg19_refGene_NM_001918_3 range=chr1:100696279-100696480 5'pad=10 3'pad=10 strand=- repeatMasking=none
AAAAAA

perl
Code:
perl -076 -nE 'chomp; s/(.+)// && say qq{>$1}; s/\s//g; say $1 while /(a{6})/gi' input
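
One possible fix, as a minimal sketch rather than a definitive answer: the one-liner above prints the header as soon as it is captured, before the sequence has been checked. The version below keeps the same `-076` record splitting but holds the header back until at least one run of six a/A has been found in the joined sequence. The wrapper name `a6_headers` is hypothetical and only there so the command can be reused; it assumes the input looks like the sample above (each record starts with > and its header is the first line).

```shell
# Hypothetical wrapper (not from the original post): print a header only
# when its sequence contains a run of six a/A, followed by each run found.
a6_headers() {
    perl -076 -nE '
        chomp;                          # drop the trailing > record separator
        next unless s/(.+)//;           # remove the header line, keep it in $1
        my $header = $1;
        s/\s//g;                        # join the wrapped sequence lines
        my @runs = /(a{6})/gi or next;  # no run of six: skip this record
        say ">$header";
        say for @runs;                  # one line per non-overlapping run
    ' "$@"
}
```

Called as `a6_headers input`, this should print only the header of a record with a matching run and then the run(s) themselves, which is the layout shown under desired output.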

 