My files look like this
And I need to cut the sequences at the last "A" found in the following 'pattern' (highlighted here for easier identification; the pattern in the actual file is not highlighted).
The expected result should look like this
Thus, all the sequences would end with AGCCCTA... (2 Replies)
This is what I would like to accomplish: I have an input file (file A) that consists of thousands of sequence elements with the same number of characters (length), each headed by a free-text header starting with the chevron '>' character followed by the ID (all different IDs with different lengths)... (9 Replies)
My file looks something like this
What I need is to look for the Reference sequence (">Reference1") and, based on the length of that sequence, trim all the entries in that file. So, the resulting file will contain all sequences with the same length, like this
Thus, all sequences will keep... (5 Replies)
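For the trimming question above, one possible approach is sketched below in Python. It is only a rough illustration under several assumptions that the truncated post does not confirm: each record is a header line followed by a single sequence line, trimming keeps the leftmost characters, and the sample data and the function names (parse_two_line_fasta, trim_to_reference) are hypothetical.

```python
def parse_two_line_fasta(lines):
    """Pair each header line with the sequence line that follows it.

    Assumes strictly two lines per record (no wrapped sequences).
    """
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    return list(zip(cleaned[0::2], cleaned[1::2]))


def trim_to_reference(records, reference_id="Reference1"):
    """Cut every sequence down to the length of the reference sequence."""
    ref_len = None
    for header, seq in records:
        # Compare the first word of the header (without '>') to the ID.
        if header.lstrip(">").split()[:1] == [reference_id]:
            ref_len = len(seq)
            break
    if ref_len is None:
        raise ValueError("reference %r not found" % reference_id)
    # Assumption: keep the leftmost ref_len characters of each sequence.
    return [(header, seq[:ref_len]) for header, seq in records]


# Hypothetical sample data, not taken from the original post.
demo = [
    ">Reference1", "ACGTACGT",
    ">Sample1", "ACGTACGTTTGA",
    ">Sample2", "ACGTAC",
]

for header, seq in trim_to_reference(parse_two_line_fasta(demo)):
    print(header)
    print(seq)
```

Note that sequences shorter than the reference are left as they are; if they should be padded or dropped instead, that would need an extra step.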
I have a fastq file from small RNA sequencing with sequence lengths between 15 and 30. I want to keep only the sequences with lengths between 21 and 25 and write them to another fastq file. How can I do that? (4 Replies)
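A minimal sketch of such a length filter in Python follows. It assumes the common four-line FASTQ layout (header, sequence, "+" line, qualities) with no wrapped sequences, and treats the 21-25 range as inclusive; the function name filter_fastq_by_length and the demo reads are hypothetical.

```python
def filter_fastq_by_length(lines, min_len=21, max_len=25):
    """Yield the lines of each 4-line FASTQ record whose sequence
    length lies in [min_len, max_len] (inclusive)."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # record[1] is the sequence line of the current record.
            if min_len <= len(record[1]) <= max_len:
                yield from record
            record = []


# Hypothetical demo reads: one of 15 nt and one of 22 nt.
demo = [
    "@read1", "ACGT" * 3 + "ACG", "+", "I" * 15,
    "@read2", "ACGT" * 5 + "AC", "+", "I" * 22,
]

for out_line in filter_fastq_by_length(demo):
    print(out_line)
```

For a real file, the same generator can be fed an open file handle and its output written line by line to the second fastq file.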
I have two files with thousands of sequences of different lengths. infile1 contains the actual sequences and infile2 the scores for each A, T, G and C in infile1. Something like this:
infile1:
>HZVJKYI01ECH5R
TTGATGTGCCAGCTGCCGTTGGTGTGCCAA
>HZVJKYI01AQWJ8
GGATATGATGATGAACTGGTTTGGCACACC... (4 Replies)
I have to remove sequences from a file based on the distance value. I am attaching the file containing the distances (Distance.xls)
The second file looks something like this:
Sequences.txt
>Sample1 Freq 59
ggatatgatgatgaactggt
>Sample1 Freq 54
ggatatgatgttgaactggt
>Sample1 Freq 44... (2 Replies)
Hello to all,
I would like to search sequences of bytes inside big binary file.
The bin file contains blocks of information; each block is structured as follows:
1- Each block begins with the hex 32 (1 byte) and ends with FF. After the FF of the last block, a 33 follows.
2- Next... (59 Replies)
I have a text file, input.fasta contains some protein sequences. input.fasta is shown below.
>P02649
MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQT
LSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQA
RLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVY... (8 Replies)
I have two files. File1 is shown below.
>153L:B|PDBID|CHAIN|SEQUENCE
RTDCYGNVNRIDTTGASCKTAKPEGLSYCGVSASKKIAERDLQAMDRYKTIIKKVGEKLCVEPAVIAGIISRESHAGKVL
KNGWGDRGNGFGLMQVDKRSHKPQGTWNGEVHITQGTTILINFIKTIQKKFPSWTKDQQLKGGISAYNAGAGNVRSYARM
DIGTTHDDYANDVVARAQYYKQHGY
>16VP:A|PDBID|CHAIN|SEQUENCE... (7 Replies)
I have this file:
>ID1
AA
>ID2
TTTTTT
>ID-3
AAAAAAAAA
>ID4
TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC
And I am trying to use this script to output sequences longer than 15 characters:
sed -r '/^>/N;{/^.{,15}$/d}'
The desired output would be this:
>ID4... (8 Replies)
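As an alternative to the sed attempt, the same filter can be sketched in Python. This is only an illustration, not the answer given in the thread: it assumes gap characters ("-") count toward the length and that wrapped sequence lines should be joined; read_fasta and longer_than are hypothetical helper names.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longer_than(records, threshold=15):
    """Keep only records whose sequence has more than `threshold` characters."""
    return [(h, s) for h, s in records if len(s) > threshold]


# The sample file from the post.
demo = """>ID1
AA
>ID2
TTTTTT
>ID-3
AAAAAAAAA
>ID4
TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC""".splitlines()

for header, seq in longer_than(read_fasta(demo)):
    print(header)
    print(seq)
```

On the sample above, only the ID4 record is printed.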
Discussion started by: Xterra
paps
PAPS(1) General Commands Manual PAPS(1)
NAME
paps - UTF-8 to PostScript converter using Pango
SYNOPSIS
paps [options] files...
DESCRIPTION
paps reads a UTF-8 encoded file and generates a PostScript language rendering of the file. The rendering is done by creating outline curves
through the pango ft2 backend.
OPTIONS
These programs follow the usual GNU command line syntax, with long options starting with two dashes (`--'). A summary of options is
included below.
--landscape
Landscape output. Default is portrait.
--columns=cl
Number of columns output. Default is 1.
Please note that this option isn't related to the terminal width, as in an "80 columns terminal".
--font=desc
Set the font description. Default is Monospace 12.
--rtl Do right to left (RTL) layout.
--paper ps
Choose paper size. Known paper sizes are legal, letter and A4. Default is A4.
Postscript points
Each postscript point equals 1/72 of an inch, so 36 points are 1/2 of an inch.
--bottom-margin=bm
Set bottom margin. Default is 36 postscript points.
--top-margin=tm
Set top margin. Default is 36 postscript points.
--left-margin=lm
Set left margin. Default is 36 postscript points.
--right-margin=rm
Set right margin. Default is 36 postscript points.
--gutter-width=gw
Set gutter width. Default is 40 postscript points.
--help Show summary of options.
--header
Draw page header for each page.
--markup
Interpret the text as pango markup.
--lpi Set the lines per inch. This determines the line spacing.
--cpi Set the characters per inch. This is an alternative method of specifying the font size.
--stretch-chars
Indicates that characters should be stretched in the y-direction to fill up their vertical space. This is similar to the texttops
behaviour.
AUTHOR
paps was written by Dov Grobgeld <dov.grobgeld@gmail.com>.
This manual page was written by Lior Kaplan <kaplan@debian.org>, for the Debian project (but may be used by others).
April 17, 2006 PAPS(1)