Parsing and masking regions from a single fasta file with subsequence Post: 302918684

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Parsing a fasta sequence with start and end coordinates

Hi, I have separate chromosome sequences and I want to parse some regions of a chromosome based on start site and end site. How can I achieve this? For example, Chr 1 is in the following format. I need regions from 2-10, which should give me AATTCCAAA, and in a similar way 15-25 should give...

2. Shell Programming and Scripting

Masking data for different file format

Hi, I have 3 kinds of files that contain date data that needs to be masked. The files look like this: File 1 (all contents in 1 line): input:DTM+7:201103281411:203'LOC+175+SGSIN:139:6+TERMINATOR......'DTM+132:201103281413:203'LOC.... output:...

3. Shell Programming and Scripting

[SED] Parsing to get a single value

Hello guys, I guess you are fed up with sed and parsing questions, but after a while researching the forum, I could not find an answer to my question. I know it must be easy to do with sed, but unfortunately I never get the syntax of this command right. OK, this is what I have in my...

4. UNIX for Dummies Questions & Answers

How to change sequence name in a long fasta file?

Hi, I have an alignment file (.fasta) with ~80 sequences. They look like this: >JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0 GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT...

5. UNIX for Dummies Questions & Answers

extract regions of file based on start and end position

Hi, I have a file1 containing many long sequences, each preceded by a unique header line. file2 is a 3-column list: header name, start position, end position. I'd like to extract the sequence regions of file1 specified in file2. Based on a post elsewhere, I found the code: awk...

6. Shell Programming and Scripting

Extract sequence from fasta file

Hi, I want to match the sequence ID (a substring of the line starting with '>') and extract the information up to the next '>' line. Please help. input > fefrwefrwef X900 AGAGGGAATTGG AGGGGCCTGGAG GGTTCTCTTC > fefrwefrwef X932 AGAGGGAATTGG AGGAGGTGGAG GGTTCTCTTC > fefrwefrwef X937...

7. Shell Programming and Scripting

Command Line Perl for parsing fasta file

I would like to take a fasta file formatted like >0001 agttcgaggtcagaatt >0002 agttcgag >0003 ggtaacctga and use command line perl to move all samples greater than 8 in length to a new file. The result would be >0001 agttcgaggtcagaatt >0003 ggtaacctga cat ${sample}.fasta | perl -lane...

8. Shell Programming and Scripting

Extraction of upstream and downstream regions from long sequence file

Hello, I am posting my query again with modified input data files. My query is: I have two input files, file1 and file2. file1 is smalldata.fasta >gi|546671471|gb|AWWX01449637.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449636, whole genome shotgun sequence...

9. UNIX for Dummies Questions & Answers

Round up -FASTA file

I have the following script: awk 'FNR==NR{s+=$3;next;} { print $1 , $2, 100*$3/s }' and the following file: >P39PT-1224 Freq 900 cccctacgacggcattggtaatggctcagctgctccggatcccgcaagccatcttggatatgagggttcgtcggcctcttcagccaagg-cccccagcagaacatccagctgatcg >P39PT-784 Freq 2...

10. Shell Programming and Scripting

Help with reformat single-line multi-fasta into multi-line multi-fasta

Input File: >Seq1 ASDADAFASFASFADGSDGFSDFSDFSDFSDFSDFSDFSDFSDFSDFSDFSD >Seq2 SDASDAQEQWEQeqAdfaasd >Seq3 ASDSALGHIUDFJANCAGPATHLACJHPAUTYNJKG ...... Desired Output File >Seq1 ASDADAFASF ASFADGSDGF SDFSDFSDFS DFSDFSDFSD FSDFSDFSDF SD >Seq2
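Several of the threads listed above ask for the same basic operations: pulling a region out of a FASTA sequence by start and end position, masking a region, and re-wrapping a single-line sequence. A minimal Python sketch of these is shown below, assuming 1-based inclusive coordinates (which matches the "2 - 10 should give me AATTCCAAA" example, since that output is 9 characters long). The function names, the `N` mask character, and the sample sequence `GAATTCCAAAGGCT` are assumptions made up for illustration; none of them come from the original posts.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping header (without '>') to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {header: "".join(parts) for header, parts in records.items()}


def extract_region(seq, start, end):
    """Return positions start..end of seq (1-based, inclusive)."""
    return seq[start - 1:end]


def mask_region(seq, start, end, mask_char="N"):
    """Replace positions start..end of seq (1-based, inclusive) with mask_char."""
    masked_len = len(seq[start - 1:end])
    return seq[:start - 1] + mask_char * masked_len + seq[end:]


def wrap(seq, width=10):
    """Split a single-line sequence into lines of at most `width` characters."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


# Hypothetical input, invented for this sketch.
fasta_text = ">Chr1\nGAATTCC\nAAAGGCT\n"
chr1 = read_fasta(fasta_text)["Chr1"]

print(extract_region(chr1, 2, 10))   # region 2-10
print(mask_region(chr1, 2, 10))      # same region replaced by N
print("\n".join(wrap(chr1, 10)))     # sequence re-wrapped at 10 characters
```

For large genomes, reading the whole file into memory like this may not be practical; an indexed tool would be the usual choice there, but that goes beyond this sketch.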
