Fgrep or grep or awk help - scanning for delimiters.


 
Posted 02-18-2010

Hi,

I'm struggling a little here, so I figured it's time to ask for help.

I have a newline-delimited file listing several hundred IDs (the hit file, "hitfile.txt") and a much bigger (~500 MB) text file, "FASTA.txt", with several thousand entries, each starting with ">". It's the FASTA format, for those interested.

The ">" line of each entry carries several IDs, delimited by "/". One of them is an internal ID ("internalID", which is not much use) and the other is an external ID ("externalID", which is much more useful). The file therefore looks like this:


Code:
>internalID1 / externalID1

GATTACA

>internalID2 / externalID2

GATTACA


I have been able to extract the lines containing the identifiers, and from those the more useful external ID.

I used:
Code:
fgrep -f hitfile.txt FASTA.txt > outfile.txt

With a hitfile of:

Code:
internalID1
internalID2

This outputs the lines as:

Code:
>internalID1 / externalID1
>internalID2 / externalID2
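
One caveat with a fixed-string search like this: a hit such as internalID1 also matches as a substring of a longer ID such as internalID10. The sketch below shows the difference on made-up sample data. It assumes a grep that supports -w (GNU and BSD grep do, but it is not in POSIX), and it uses grep -F, which is the standard spelling of fgrep. The temporary directory and the internalID10 entry are hypothetical, added only for illustration.

```shell
# Sample data (hypothetical IDs, just for illustration)
tmp=$(mktemp -d)
printf '%s\n' '>internalID1 / externalID1' '' 'GATTACA' '' \
              '>internalID10 / externalID10' '' 'GATTACA' > "$tmp/FASTA.txt"
printf '%s\n' 'internalID1' > "$tmp/hitfile.txt"

# Plain fixed-string match: internalID1 also hits the internalID10 header
grep -F -f "$tmp/hitfile.txt" "$tmp/FASTA.txt" > "$tmp/loose.txt"

# -w only accepts whole-word matches, so internalID10 is no longer picked up
grep -F -w -f "$tmp/hitfile.txt" "$tmp/FASTA.txt" > "$tmp/strict.txt"

cat "$tmp/strict.txt"
```

Whether -w is appropriate depends on what characters the real IDs contain, since it treats anything other than letters, digits and underscore as a word boundary.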

From which it is trivial to further extract the externalIDs.
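
For completeness, a minimal sketch of that extraction step, assuming every header has exactly the ">internalID / externalID" layout shown above, with " / " (space, slash, space) as the separator. The file name external_ids.txt and the temporary directory are made up for the example.

```shell
# Recreate the grep output from above in a scratch directory
tmp=$(mktemp -d)
printf '%s\n' '>internalID1 / externalID1' \
              '>internalID2 / externalID2' > "$tmp/outfile.txt"

# Split each header on " / " and print the second field (the external ID)
awk -F' / ' '{ print $2 }' "$tmp/outfile.txt" > "$tmp/external_ids.txt"

cat "$tmp/external_ids.txt"
```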

Now I would like to pull out not just the single header line, but every line from the matching ID (which is always the first item after the >) up to the next >, which starts the next entry. That would give me a file containing the sequences as well as the IDs. So with a hitfile of:
Code:
internalID1

The output would be:
Code:
>internalID1 / externalID1

GATTACA




This is where my complete n00bism and lack of bash-fu get me stuck. I have tried a couple of promising-looking awk scripts, to no avail...

Any help in this matter will be much, much appreciated.
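
One possible approach, offered as a sketch rather than a tested solution for the real data: let awk load the hit file into an array first, then walk the FASTA file and print lines only while the current entry's internal ID is in that array. It assumes the internal ID is the first whitespace-separated token on each ">" line, and that the hit file has one ID per line with no trailing spaces or Windows line endings. The variable names (hits, id, keep) and the sample data are made up for the example.

```shell
# Small sample in a scratch directory (stand-ins for the real files)
tmp=$(mktemp -d)
printf '%s\n' '>internalID1 / externalID1' '' 'GATTACA' '' \
              '>internalID2 / externalID2' '' 'GATTACA' > "$tmp/FASTA.txt"
printf '%s\n' 'internalID1' > "$tmp/hitfile.txt"

awk '
    # First file (the hit list): remember every non-empty ID, print nothing
    NR == FNR { if ($0 != "") hits[$0] = 1; next }

    # Header line: strip the leading ">" from the first token and
    # decide whether this entry is wanted
    /^>/ { id = substr($1, 2); keep = (id in hits) }

    # Print the header and everything after it while the entry is wanted
    keep
' "$tmp/hitfile.txt" "$tmp/FASTA.txt" > "$tmp/outfile.txt"

cat "$tmp/outfile.txt"
```

Because awk reads FASTA.txt one line at a time and only the hit list is held in memory, the size of the big file should not be a problem. Unlike the fgrep approach, the ID is compared as a whole token, so internalID1 would not pull in internalID10.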

Last edited by radoulov; 02-18-2010 at 07:12 AM.. Reason: Added code tags.
 