Searching and extracting records


 
# 1  
Old 03-12-2011

Hello,

I have a file of DNA sequences and want to extract certain records by searching for a word in them, then write each whole matching record to another file. I am new to Perl and am having trouble extracting the whole record; so far I can only write the line that contains the word. Can anyone help me extract the whole record?

Here is a sample input file:
Code:
>cel-let-7 MI0000001 Caenorhabditis elegans let-7 stem-loop
TACACTGTGGATCCGGTGAGGTAGTAGGTTGTATAGTTTGGAATATTACCACCGGTGAACTATGCAATTTTCTACCTTACCGGAGACAGAACTCTTCGA
>cel-lin-4 MI0000002 Caenorhabditis elegans lin-4 stem-loop
ATGCTTCCGGCCTGTTCCCTGAGACCTCAAGTGTGAGTGTACTATTGATGCTTCACACCTGGGCTCTCCGGGTACCAGGACGGTTTGAGCAGAT
>dps-mir-317 MI0001358 Drosophila pseudoobscura miR-317 stem-loop
TGCAACTGCCGTTGGGATACACCCTGTGCTCGCTTTGAATATGGTGCAAGCAAGTGAACACAGCTGGTGGTATCCAATGGCCGTTCTGCA
>dps-mir-318 MI0001359 Drosophila pseudoobscura miR-318 stem-loop
TTTATGGGATGCACCAAGTTCAGTTTTGTCACATTTCGAGCATCACTGGGCTTTGTTTATCTCATAAG
>dre-mir-7b MI0001360 Danio rerio miR-7b stem-loop
TGAACGCTGGCTTGCTTCTGTGTGGAAGACTTGTGATTTTGTTGTTGTTAGTTAGATGAAGTGACAACAAATCACGGTCTGCCCTACAGCACAGGCCCAGCATC

I want to extract the records that contain "dre". Using grep, I get only the header line of each record, not the sequence.

Here is what I tried:
Code:
$ grep 'dre' hairpin.fa > dre.fa

What I have in dre.fa is:
Code:
>dre-mir-7b MI0001360 Danio rerio miR-7b stem-loop
>dre-mir-7a-1 MI0001361 Danio rerio miR-7a-1 stem-loop
>dre-mir-7a-2 MI0001362 Danio rerio miR-7a-2 stem-loop

How can I get the sequences associated with these headers? Any help is appreciated.

Thank you.

Last edited by radoulov; 03-12-2011 at 05:15 PM.. Reason: Code tags, please!
# 2  
Old 03-12-2011
If you're on Solaris, use nawk or /usr/xpg4/bin/awk.

Code:
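awk '
# On each header line, print the previous record if it matched, then reset.
/^>/ {
  if (r ~ p)
    print r
  r = ""
}
# Append every line (header or sequence) to the current record.
{
  r = r ? r RS $0 : $0
}
# Flush the last record, but only if it matches the pattern.
END {
  if (r ~ p)
    print r
}' p=dre infile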
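
The same record-by-record idea can also be sketched without buffering the record, by setting a flag on each header line and printing lines while the flag is on. This is only a sketch under assumptions: the function name extract_records and the variable keep are illustrative and not from this thread, and the -v option needs nawk or a POSIX awk (on Solaris, the note above about nawk still applies). Unlike the buffered version, it tests the pattern against the header line only, so a match inside the sequence itself is ignored.

```shell
# extract_records PATTERN FILE   (hypothetical helper name)
# Print every record whose header line (starting with ">") matches PATTERN.
# Works for sequences that span one line or several lines.
extract_records() {
  awk -v p="$1" '
    /^>/ { keep = ($0 ~ p) }   # new header: decide whether to keep this record
    keep                       # print header and sequence lines while keep is set
  ' "$2"
}

# Example: extract_records dre hairpin.fa > dre.fa
```

To match only the species prefix rather than "dre" anywhere in the header, an anchored pattern such as '^>dre-' could be passed instead.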

# 3  
Old 03-12-2011
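Assuming your input file has the same format as the sample you provided (each header line followed by exactly one sequence line), the following works; note that it joins the header and its sequence onto a single line: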

Code:
sed '/^>/N;s/\n/ /;/dps/!d' infile

---------- Post updated at 11:19 PM ---------- Previous update was at 11:18 PM ----------

... oops, I meant:

Code:
sed '/^>/N;s/\n/ /;/dre/!d' infile
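
If the header and sequence should stay on separate lines, as in the input, a variant along these lines might do. It is a sketch only, under the same assumption that every record is exactly one header line followed by one sequence line; the function name extract_pairs is made up for illustration, and the pattern is pasted straight into the sed script, so it must not contain a slash.

```shell
# extract_pairs PATTERN FILE   (hypothetical helper name)
# Assumes each record is one header line followed by one sequence line.
extract_pairs() {
  # N appends the sequence line to the header line in the pattern space;
  # p prints both lines unchanged when the two-line record matches PATTERN.
  # -n suppresses the default output, so non-matching records are dropped.
  sed -n "/^>/N;/$1/p" "$2"
}

# Example: extract_pairs dre hairpin.fa > dre.fa
```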
