06-24-2010
Removing specific sequences from file
My file looks like this:
Quote:
>GHL8OVD01BNNCA Freq 4
TTGATGTGCCCGTGGGTTTCCCGTCAACACCGGCAAATAGTAGCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01CMQVT Freq 15
TTGATGTCGTGGGTTTCCCGTCAACACCGGCAAATAGTAGCAGCACTACCAGGACCTTCGCCCA
>Reference1 Freq 1
TTGATGTGCCAGCTGCACTTCCCCCGGTGACGTGGGTTTCCCGTCTAGCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01CMQVW Freq 11
TTGATGTGTCCCGTCGACACCGGCAAATAGCAGCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01A45V3 Freq 9
TTGATTCCCGTCGACACCGGCAAATAGCAGCAGCACTACAGGACCTTCGCCCA
>GHL8OVD01B9PRR Freq 1
TTGATGTGCCAGCTTTCGCGTCGACACCGGCAAATAGTAGCAGCGCTACCAGGACCTTCGCCCA
>GHL8OVD01BL8BD Freq 1
TTGATGAGTACTTCCCCCGGTGACGTGGGTCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01AV2U9 Freq 17
I need to remove the entry with the identifier >Reference1, together with its entire sequence, so that I end up with the following file:
Quote:
>GHL8OVD01BNNCA Freq 4
TTGATGTGCCCGTGGGTTTCCCGTCAACACCGGCAAATAGTAGCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01CMQVT Freq 15
TTGATGTCGTGGGTTTCCCGTCAACACCGGCAAATAGTAGCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01CMQVW Freq 11
TTGATGTGTCCCGTCGACACCGGCAAATAGCAGCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01A45V3 Freq 9
TTGATTCCCGTCGACACCGGCAAATAGCAGCAGCACTACAGGACCTTCGCCCA
>GHL8OVD01B9PRR Freq 1
TTGATGTGCCAGCTTTCGCGTCGACACCGGCAAATAGTAGCAGCGCTACCAGGACCTTCGCCCA
>GHL8OVD01BL8BD Freq 1
TTGATGAGTACTTCCCCCGGTGACGTGGGTCAGCACTACCAGGACCTTCGCCCA
>GHL8OVD01AV2U9 Freq 17
Thanks in advance!
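One possible approach, sketched here in Python rather than taken from any reply in this thread: read the file line by line, and whenever a header line (one starting with ">") appears, decide whether the record it opens should be skipped. The function name `drop_records` and the sample list are hypothetical, and the sketch assumes the identifier is the first whitespace-delimited word after ">".

```python
def drop_records(lines, unwanted_id):
    """Yield FASTA lines, skipping the record whose identifier is unwanted_id.

    Assumption: every record starts with a '>' header line and the
    identifier is the first whitespace-delimited word after '>'.
    """
    skipping = False
    for line in lines:
        if line.startswith(">"):
            # A new record begins: re-evaluate whether to skip it.
            fields = line[1:].split()
            skipping = bool(fields) and fields[0] == unwanted_id
        if not skipping:
            yield line


# Three records copied from the post above.
sample = [
    ">GHL8OVD01CMQVT Freq 15",
    "TTGATGTCGTGGGTTTCCCGTCAACACCGGCAAATAGTAGCAGCACTACCAGGACCTTCGCCCA",
    ">Reference1 Freq 1",
    "TTGATGTGCCAGCTGCACTTCCCCCGGTGACGTGGGTTTCCCGTCTAGCAGCACTACCAGGACCTTCGCCCA",
    ">GHL8OVD01CMQVW Freq 11",
    "TTGATGTGTCCCGTCGACACCGGCAAATAGCAGCAGCACTACCAGGACCTTCGCCCA",
]

kept = list(drop_records(sample, "Reference1"))
for line in kept:
    print(line)
```

Because the skip flag is reset at every header, sequences that wrap over several lines are dropped along with their header, and a header with no sequence line after it is handled as well. To run this on a real file, pass the open file object as `lines`; the trailing newlines are left untouched.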