Fgrep or grep or awk help - scanning for delimiters.
Hi,
I'm struggling a little here, so I figured it's time to ask for help.
I have a file with a list of several hundred IDs (the hit file, "hitfile.txt"), one per line, and a much bigger (~500 MB) text file, "FASTA.txt", with several thousand entries, each starting with ">". It's the FASTA format, for those interested.
The line that starts with ">" contains several different IDs, delimited by "/". One of them is an internal ID ("internalID", which is not much use) and the other is an external ID ("externalID", which is much more useful). The file therefore looks like this:
I have been able to extract the lines containing the identifiers and also extract the more useful external ID.
I used:
With a hitfile of:
This outputs the lines as:
From which it is trivial to further extract the externalIDs.
Now, I would like to not only pull out single lines, but pull out all lines from the ID (which is always the first item after the >) until the next >, which is the next entry. This will mean I have a file not only of the IDs but also the sequences therein. So with a hitfile of:
The output is:
This is where my complete n00bism and lack of bash-fu get me stuck. I have tried a couple of promising-looking awk scripts, to no avail...
Any help in this matter will be much, much appreciated.
Last edited by radoulov; 02-18-2010 at 07:12 AM..
Reason: Added code tags.
Hi.
The AT&T cgrep command was designed to extract sections of text within a window:
producing:
See the URL noted in the script. You would need to download and compile the code, but I have done that in 32-bit and 64-bit environments without trouble.
If you are not comfortable with that, then someone may stop by shortly to offer an awk or perl solution.
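For the awk route, here is a minimal sketch of one way it could be done. It is not tested against the real data, and it assumes each header line looks roughly like ">ID/otherID ..." with the ID ending at the first "/" or blank, as described in the question. The function name extract_hits is made up for illustration.

```shell
#!/bin/sh
# Sketch: print every FASTA record whose leading ID is listed in the hit file.
# Assumption (not confirmed by the thread): the ID is the text right after ">"
# and ends at the first "/", space or tab.
extract_hits() {
    # $1 = hit file (one ID per line), $2 = FASTA file
    awk -v hits="$1" '
        BEGIN {
            # Load the wanted IDs into an array before reading the FASTA file.
            while ((getline line < hits) > 0)
                if (line != "") wanted[line] = 1
            close(hits)
        }
        # Header line: isolate the ID and decide whether to keep this record.
        /^>/ {
            id = substr($0, 2)
            sub("[/ \t].*", "", id)
            keep = (id in wanted)
        }
        # Print the header and every following line while the record is kept.
        keep
    ' "$2"
}
```

It would be called as `extract_hits hitfile.txt FASTA.txt > hits.fasta`. The big file is read once and only the hit list is held in memory, so the 500 MB size should not be a problem. The ID comparison is exact, so a hit of "ID2" does not pull in "ID21".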