Parsing a fasta sequence with start and end coordinates Post: 302514160

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Help Parsing Sequence File

Hi Everyone, I am new in the world of UNIX and Shell scripting. I am working with a sequence file that looks like this: >contig00001 length=128 numreads=2 aTGTGCTGGgTGGGTGCCTGTTgCCccATGCTCCAGTtCAGGATTtCAGGCAttCTCATG TCCAGCATTTCTATTTAATCCTGCTGCTGGACTTGGGTGGtCTCAGTCtGGGAAGTGAGC tGTCTGTG...

2. UNIX for Dummies Questions & Answers

How to change sequence name in along fasta file?

Hi I have an alignment file (.fasta) with ~80 sequences. They look like this- >JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0 GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT...

3. Shell Programming and Scripting

Remove lines between the start string and end string including start and end string Python

Hi, I am trying to remove lines once a string is found till another string is found including the start string and end string. I want to basically grab all the lines starting with color (closing bracket). PS: The line after the closing bracket for color could be anything (currently 'more')....

4. UNIX for Dummies Questions & Answers

Change sequence names in fasta file

I have fasta files with multiple sequences in each. I need to change the sequence name headers from: >accD:_59176-60699 ATGGAAAAGTGGAGGATTTATTCGTTTCAGAAGGAGTTCGAACGCA >atpA_(reverse_strand):_showing_revcomp_of_10525-12048 ATGGTAACCATTCAAGCCGACGAAATTAGTAATCTTATCCGGGAAC...

5. Shell Programming and Scripting

Extract sequence from fasta file

Hi, I want to match the sequence id (sub-string of line starting with '>' and extract the information upto next '>' line ). Please help . input > fefrwefrwef X900 AGAGGGAATTGG AGGGGCCTGGAG GGTTCTCTTC > fefrwefrwef X932 AGAGGGAATTGG AGGAGGTGGAG GGTTCTCTTC > fefrwefrwef X937...

6. Shell Programming and Scripting

Count and search by sequence in multiple fasta file

Hello, I have 10 fasta files with sequenced reads information with read sizes from 15 - 35 . I have combined the reads and collapsed in to unique reads and filtered for sizes 18 - 26 bp long unique reads. Now i wanted to count each unique read appearance in all the fasta files and make a table...

7. Shell Programming and Scripting

Parsing and masking regions from a single fasta file with subsequence

Hi, I have a complete genome fasta file and a list of subsequence regions in the format: 4353..5633 6795..9354 1034..14456. I want a script that can mask these regions in the single complete genome fasta file with the letter N. Kindly help.

8. Shell Programming and Scripting

Command Line Perl for parsing fasta file

I would like to take a fasta file formated like >0001 agttcgaggtcagaatt >0002 agttcgag >0003 ggtaacctga and use command line perl to move the all sample gt 8 in length to a new file. the result would be >0001 agttcgaggtcagaatt >0003 ggtaacctga cat ${sample}.fasta | perl -lane...

9. UNIX for Beginners Questions & Answers

How to find a specific sequence pattern in a fasta file?

I have to mine the following sequence pattern from a large fasta file namely gene.fasta (contains multiple fasta sequences) along with the flanking sequences of 5 bases at starting position and ending position, AAGCZ-N16-AAGCZ Z represents A, C or G (Except T) N16 represents any of the four...

10. UNIX for Beginners Questions & Answers

Splitting week start date and end date based on custom period start dates

Below are my custom period start and end dates based on a calender, these dates are placed in a file, for each period i need to split into three weeks for each period row, example is given below. Could you please help out to achieve solution through shell script.. File content: ...

LEARN ABOUT ULTRIX

cut

cut(1) General Commands Manual cut(1)

Name
cut - cut out selected fields of each line of a file

Syntax
cut -clist [file1 file2...]
cut -flist [-dchar] [-s] [file1 file2...]

Description
Use the cut command to cut out columns from a table or fields from each line of a file. The fields specified by list can be fixed length,
that is, character positions as on a punched card (-c option), or the length can vary from line to line and be marked with a field
delimiter character such as tab (-f option). The cut command can be used as a filter. If no files are given, the standard input is used.

Use grep(1) to make horizontal ``cuts'' (by context) through a file, or paste(1) to put files together in columns. To reorder columns in a table, use cut and paste.
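The reordering mentioned above can be sketched as follows. This is a minimal illustration, not part of the original manual page: the table contents and the temporary file names are hypothetical. The two-step approach is needed because cut writes the selected fields in the order they occur in the input, so asking for -f2,3,1 would not move the first column to the end.

```shell
# Hypothetical tab-separated table: name, start, end.
tmpdir=$(mktemp -d)
printf 'contig1\t10\t20\ncontig2\t5\t15\n' > "$tmpdir/table.tsv"

# Cut each column group into its own file, then paste them back in a new order.
cut -f1 "$tmpdir/table.tsv" > "$tmpdir/names"
cut -f2,3 "$tmpdir/table.tsv" > "$tmpdir/coords"
reordered=$(paste "$tmpdir/coords" "$tmpdir/names")

# Print the table with the coordinate columns first and the name last.
printf '%s\n' "$reordered"
rm -rf "$tmpdir"
```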

Options
list A comma-separated list of integer field numbers in increasing order, with an optional - to indicate ranges, as in
the -o option of nroff/troff for page ranges; for example, 1,4,7; 1-3,8; -5,10 (short for 1-5,10); or 3- (short
for third through last field).

-clist Specifies character positions to be cut out. For example, -c1-72 would pass the first 72 characters of each line.

-flist Specifies the fields to be cut out. For example, -f1,7 copies the first and seventh field only. Lines with no field
delimiters are passed through intact (useful for table subheadings), unless -s is specified.

-dchar Uses the specified character as the field delimiter. Default is tab. Space or other characters with special meaning to the
shell must be quoted. The -d option is used only in combination with the -f option, according to XPG3 and SVID2/SVID3.

-s Suppresses lines with no delimiter characters. Unless specified, lines with no delimiters are passed through untouched.
Either the -c or -f option must be specified.
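As a rough sketch of how these options behave, the example below uses -c to pull a subsequence by start and end coordinates, and -f with -d and -s on delimited lines. The sequence, the coordinates, and the colon-separated record are hypothetical, and the -c part assumes the whole sequence sits on one line (a wrapped fasta record would first need its lines joined).

```shell
# Hypothetical sequence and 1-based, inclusive coordinates.
sequence='ATGGAAAAGTGGAGGATTTATTCG'
start=4
end=9

# -c counts character positions from 1, so start-end selects the subsequence.
subseq=$(printf '%s\n' "$sequence" | cut -c"${start}-${end}")
printf '%s\n' "$subseq"

# Hypothetical input: one line without the delimiter, one colon-separated record.
input=$(printf 'header line\nchr1:100:200\n')

# Without -s the delimiter-free line passes through unchanged.
with_header=$(printf '%s\n' "$input" | cut -d: -f2,3)

# With -s the delimiter-free line is suppressed.
without_header=$(printf '%s\n' "$input" | cut -d: -f2,3 -s)

printf '%s\n' "$with_header"
printf '%s\n' "$without_header"
```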

Examples
Mapping of user IDs to names:
cut -d: -f1,5 /etc/passwd
To set name to the current login name for the csh shell:
set name=`who am i | cut -f1 -d" "`
To set name to the current login name for the sh, sh5, and ksh shells:
name=`who am i | cut -f1 -d" "`
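For POSIX shells, the same idea can be written with the $(...) form of command substitution. The sketch below is only an illustration: the sample line is a hypothetical stand-in for the output of who am i, whose exact layout varies between systems, so that the result is reproducible. It does not apply to csh.

```shell
# Hypothetical line in the shape that `who am i` prints; real output varies by system.
who_line='alice pts/0 2024-01-01 10:00'

# $(...) is the POSIX equivalent of the backquoted command substitution.
# The first space-delimited field is the login name.
name=$(printf '%s\n' "$who_line" | cut -f1 -d" ")
printf '%s\n' "$name"
```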

Diagnostics
"line too long" A line can have no more than 511 characters or fields.

"bad list for c/f option"
Missing -c or -f option or incorrectly specified list. No error occurs if a line has fewer fields than the list calls
for.

"no fields" The list is empty.

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Help Parsing Sequence File

Discussion started by: Fahmida

2. UNIX for Dummies Questions & Answers

How to change sequence name in along fasta file?

Discussion started by: baika

3. Shell Programming and Scripting

Remove lines between the start string and end string including start and end string Python

Discussion started by: Dabheeruz

4. UNIX for Dummies Questions & Answers

Change sequence names in fasta file

Discussion started by: tyrianthinae

5. Shell Programming and Scripting

Extract sequence from fasta file

Discussion started by: ritakadm

6. Shell Programming and Scripting

Count and search by sequence in multiple fasta file

Discussion started by: empyrean

7. Shell Programming and Scripting

Parsing and masking regions from a single fasta file with subsequence

Discussion started by: margarita