Sponsored Content
Top Forums Shell Programming and Scripting Extract sequence from fasta file Post 302864055 by ctsgnb on Tuesday 15th of October 2013 06:51:29 PM
Old 10-15-2013
Very simple one:
Code:
awk '/X937/' RS='> ' input

...using variable:
Code:
awk -v P="$s" '$0~P' RS='> ' input

... adding leading RS:
Code:
awk -v P="$s" '!c++{printf RS}$0~P' RS='> ' input

Code:
$ s=X937
$ awk -v P="$s" '$0~P' RS='> ' input
fefrwefrwef X937
AGAGGGAATTGG
AGGAGACTTTAG
GGTTCTCTTC


Last edited by ctsgnb; 10-15-2013 at 08:00 PM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to extract a sequence of n lines from a file

Hi I want to be able to extract a sequence of n lines from a file. ideas, commands and suggestions would be highly appreciated. Thanks (4 Replies)
Discussion started by: 0ktalmagik
4 Replies

2. Shell Programming and Scripting

Extract Pattern Sequence

Dear Collegues I have to extract Some pattern from raw text file using perl The input will be raw text. Pattern to get - Sequence of Capital Letter Words ( e.g. he is working in Center for Perl Studies. He will come tomorrow...) from thos I have to extract sequences like "Center for Perl... (5 Replies)
Discussion started by: jaganadh
5 Replies

3. Shell Programming and Scripting

Extract sequence blocks

Hi, I have an one-line file consisting of a sequence of 660 letters. I would like to extract 9-letter blocks iteratively: ASDFGHJKLQWERTYUIOPZXCVBNM first block: ASDFGHJKL 1nd block: SDFGHJKLQ What I have so far only gives me the first block, can anyone please explain why? cat... (7 Replies)
Discussion started by: solli
7 Replies

4. Shell Programming and Scripting

Parsing a fasta sequence with start and end coordinates

Hi.. I have a seperate chromosome sequences and i wanted to parse some regions of chromosome based on start site and end site.. how can i achieve this? For Example Chr 1 is in following format I need regions from 2 - 10 should give me AATTCCAAA and in a similar way 15- 25 should give... (8 Replies)
Discussion started by: empyrean
8 Replies

5. UNIX for Dummies Questions & Answers

How to change sequence name in along fasta file?

Hi I have an alignment file (.fasta) with ~80 sequences. They look like this- >JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0 GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT... (2 Replies)
Discussion started by: baika
2 Replies

6. UNIX for Dummies Questions & Answers

Change sequence names in fasta file

I have fasta files with multiple sequences in each. I need to change the sequence name headers from: >accD:_59176-60699 ATGGAAAAGTGGAGGATTTATTCGTTTCAGAAGGAGTTCGAACGCA >atpA_(reverse_strand):_showing_revcomp_of_10525-12048 ATGGTAACCATTCAAGCCGACGAAATTAGTAATCTTATCCGGGAAC... (2 Replies)
Discussion started by: tyrianthinae
2 Replies

7. Shell Programming and Scripting

Extract sequences from a FASTA file based on another file

I have two files. File1 is shown below. >153L:B|PDBID|CHAIN|SEQUENCE RTDCYGNVNRIDTTGASCKTAKPEGLSYCGVSASKKIAERDLQAMDRYKTIIKKVGEKLCVEPAVIAGIISRESHAGKVL KNGWGDRGNGFGLMQVDKRSHKPQGTWNGEVHITQGTTILINFIKTIQKKFPSWTKDQQLKGGISAYNAGAGNVRSYARM DIGTTHDDYANDVVARAQYYKQHGY >16VP:A|PDBID|CHAIN|SEQUENCE... (7 Replies)
Discussion started by: nelsonfrans
7 Replies

8. Shell Programming and Scripting

Count and search by sequence in multiple fasta file

Hello, I have 10 fasta files with sequenced reads information with read sizes from 15 - 35 . I have combined the reads and collapsed in to unique reads and filtered for sizes 18 - 26 bp long unique reads. Now i wanted to count each unique read appearance in all the fasta files and make a table... (5 Replies)
Discussion started by: empyrean
5 Replies

9. Shell Programming and Scripting

Extract distinc sequence of letters

Hallo, I need to extract distinct sequence of letters for example from 136 to 193 Files are quite big, so I would prefer not to use "fold -w1" Thank you very much Input file look like this: 1 cttttacctt catgtgtttt tgcagatatt tgttcataat aacatcttct ttttaagtta 61 ttaaaatctt... (4 Replies)
Discussion started by: kamcamonty
4 Replies

10. UNIX for Beginners Questions & Answers

How to find a specific sequence pattern in a fasta file?

I have to mine the following sequence pattern from a large fasta file namely gene.fasta (contains multiple fasta sequences) along with the flanking sequences of 5 bases at starting position and ending position, AAGCZ-N16-AAGCZ Z represents A, C or G (Except T) N16 represents any of the four... (3 Replies)
Discussion started by: dineshkumarsrk
3 Replies
Locale::Codes::LangVar(3)				User Contributed Perl Documentation				 Locale::Codes::LangVar(3)

NAME
Locale::Codes::LangVar - standard codes for language variation identification SYNOPSIS
use Locale::Codes::LangVar; $lvar = code2langvar('acm'); # $lvar gets 'Mesopotamian Arabic' $code = langvar2code('Mesopotamian Arabic'); # $code gets 'acm' @codes = all_langvar_codes(); @names = all_langvar_names(); DESCRIPTION
The "Locale::Codes::LangVar" module provides access to standard codes used for identifying language variations, such as those as defined in the IANA language registry. Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default IANA language registry codes will be used. SUPPORTED CODE SETS
There are several different code sets you can use for identifying language variations. A code set may be specified using either a name, or a constant that is automatically exported by this module. For example, the two are equivalent: $lvar = code2langvar('arevela','alpha'); $lvar = code2langvar('arevela',LOCALE_LANGVAR_ALPHA); The codesets currently supported are: alpha This is the set of alphanumeric codes from the IANA language registry, such as 'arevela' for Eastern Armenian. This code set is identified with the symbol "LOCALE_LANGVAR_ALPHA". This is the default code set. ROUTINES
code2langvar ( CODE [,CODESET] ) langvar2code ( NAME [,CODESET] ) langvar_code2code ( CODE ,CODESET ,CODESET2 ) all_langvar_codes ( [CODESET] ) all_langvar_names ( [CODESET] ) Locale::Codes::LangVar::rename_langvar ( CODE ,NEW_NAME [,CODESET] ) Locale::Codes::LangVar::add_langvar ( CODE ,NAME [,CODESET] ) Locale::Codes::LangVar::delete_langvar ( CODE [,CODESET] ) Locale::Codes::LangVar::add_langvar_alias ( NAME ,NEW_NAME ) Locale::Codes::LangVar::delete_langvar_alias ( NAME ) Locale::Codes::LangVar::rename_langvar_code ( CODE ,NEW_CODE [,CODESET] ) Locale::Codes::LangVar::add_langvar_code_alias ( CODE ,NEW_CODE [,CODESET] ) Locale::Codes::LangVar::delete_langvar_code_alias ( CODE [,CODESET] ) These routines are all documented in the Locale::Codes::API man page. SEE ALSO
Locale::Codes The Locale-Codes distribution. Locale::Codes::API The list of functions supported by this module. http://www.iana.org/assignments/language-subtag-registry The IANA language subtag registry. AUTHOR
See Locale::Codes for full author history. Currently maintained by Sullivan Beck (sbeck@cpan.org). COPYRIGHT
Copyright (c) 2011-2013 Sullivan Beck This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. perl v5.16.3 2013-04-12 Locale::Codes::LangVar(3)
All times are GMT -4. The time now is 10:24 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy