UNIX for Dummies Questions & Answers
#1
read regex from ID file, print regex and line below from source file
I have a file of protein sequences with headers (my source file). Based on a list of IDs (which are included in some of the headers), I'd like to print out only the specified sequences, with only the ID as header. In other words, I'd like to search source.txt for the terms in IDs.txt, and print the ID as well as the sequence. Ideally the process would continue even if an ID is not found in the source file. All headers in source.txt are of similar format.
source.txt
Quote:
Quote:
Quote:
Code:
awk '/comp51893_c0_seq1/ { getline; print $0 }' source.txt
I also tried extracting the entire header and the sequence by modifying a script I had for a sequence file with a different header type, but again it handles only one ID at a time, and it prints only the header.
Code:
awk '{lines[NR] = $0} /comp47911_c0_seq1/ {print lines [NR]; print lines [NR+1]}' source.txt
As is probably clear, I'm still pretty low on the learning curve. Any help would be really appreciated!
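A likely reason the second attempt prints only the header: awk runs every rule on the current line before it reads the next one, so at the moment the header matches, lines[NR+1] has not been stored yet and prints as an empty string. One way around this, sketched below, is to set a flag on the matching header and print the following line when it arrives. The sample data is hypothetical (the thread does not show the real header format), and the sketch assumes each sequence sits on a single line directly below its header.

```shell
# Hypothetical sample data; the real layout of source.txt is not shown in the thread.
tmp=$(mktemp -d)
printf '%s\n' '>comp47911_c0_seq1 len=9' 'MKTAYIAKQ' \
              '>comp00001_c0_seq1 len=5' 'MSLLT' > "$tmp/source.txt"

# Rule 1 prints the line that follows a matched header and clears the flag.
# Rule 2 prints the matched header itself and sets the flag.
out=$(awk 'want { print; want = 0 }
/comp47911_c0_seq1/ { print; want = 1 }' "$tmp/source.txt")

printf '%s\n' "$out"
rm -rf "$tmp"
```

The flag rule comes first so that it fires on the line after the header rather than on the header itself.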
#2
Code:
awk ' FILENAME=="ID.txt" {arr[$0]++}
FILENAME=="source.txt"
{for(i in arr) {if (i ~ $0)
{print ">", i; getline; print $0; getline; print $0 }
}
} ' ID.txt source.txt > newfile
Try that for starters.
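One detail of the layout above is worth checking when testing it: in awk, a pattern that stands alone on a line is a complete rule whose default action is to print the record, and a brace block that starts on the next line is a separate rule that runs for every record of every input file. The sketch below shows the difference with two tiny hypothetical files, one.txt and two.txt, which are not from the thread.

```shell
# Hypothetical one-line input files, used only to show how awk parses the rules.
tmp=$(mktemp -d)
printf '%s\n' 'a' > "$tmp/one.txt"
printf '%s\n' 'b' > "$tmp/two.txt"

# Pattern and action on separate lines: awk sees two independent rules.
split=$(cd "$tmp" && awk 'FILENAME == "two.txt"
{ print "action:", $0 }' one.txt two.txt)

# Pattern and opening brace on the same line: one rule, limited to two.txt.
joined=$(cd "$tmp" && awk 'FILENAME == "two.txt" { print "action:", $0 }' one.txt two.txt)

printf 'split:\n%s\njoined:\n%s\n' "$split" "$joined"
rm -rf "$tmp"
```

In the split form the action also runs for the line from one.txt, and the line from two.txt is printed a second time by the bare pattern. In the joined form only the line from two.txt reaches the action.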
#3
Jim, thanks for taking a look.
Using the code you provided, I get the following in the terminal:
Quote:
Quote:
Quote:
#4
Jim is correct (cool way of performing the task), but you need to swap the operands of the match operator in the if statement, so that the current line is tested against the ID rather than the other way around.
Code:
awk 'FILENAME=="ID.txt" {arr[$0]++}
FILENAME=="source.txt" { for(i in arr) {if ($0 ~ i) {print ">", i; getline; print $0; getline; print $0 } } }' ID.txt source.txt
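For comparison, here is a minimal sketch of the same idea written with the common NR == FNR idiom, which avoids hard-coding the file names and does not rely on getline. It is not the code posted in this thread, and it rests on assumptions the thread does not confirm: each sequence occupies exactly one line below its header, and an ID appears in the header as a literal substring. The sample files and their contents are hypothetical. IDs that never occur in source.txt simply produce no output, so processing continues past them.

```shell
# Hypothetical sample files; the second ID is deliberately absent from source.txt.
tmp=$(mktemp -d)
printf '%s\n' 'comp2_c0_seq1' 'comp9_c0_seq1' > "$tmp/ID.txt"
printf '%s\n' '>x|comp1_c0_seq1 len=4' 'MKTA' \
              '>x|comp2_c0_seq1 len=5' 'MSLLT' > "$tmp/source.txt"

out=$(cd "$tmp" && awk '
NR == FNR { ids[$0]; next }          # first file: remember each ID
want      { print; want = 0; next }  # line after a matched header: the sequence
{
    for (id in ids)
        if (index($0, id)) {         # literal substring test, no regex involved
            print ">" id             # the ID alone becomes the new header
            want = 1
            break
        }
}' ID.txt source.txt)

printf '%s\n' "$out"
rm -rf "$tmp"
```

Two caveats: unlike the command above, which prints two lines after each header, this sketch prints only one, so it would need adjusting if sequences span several lines. Also, because index() does a plain substring test, a short ID would match inside a longer one (comp1 inside comp10), so IDs that are prefixes of each other need a stricter comparison.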