Split file into multiple files Post: 302447611


Posted by jdhahbi in Shell Programming and Scripting on Monday 23rd of August 2010 08:21:49 PM


Hi

I have a file that contains multiple sequences; each sequence name is on a line starting with '>'. The file, infile.txt, looks like this:

Code:

>HE_ER
tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc
cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca
>M7B_Ho_sap
tgagaactgcagaactctcggcacagaacaactccatccaaacccctgcactaagagacttgaccaaact
aactagtgtccggctttgtttatctttgaca
>LT_H_ss
gtgagacaaagtaacaaatgtaagaagccatgtctgctcatttctgcttgccaacataatttcacaaagc
ccctgactctgtgatgacatgcagctctcnagaaagatgctttgaagacaaarcaggatrgagcacacag
ccccccayrtctcttgcctgagtcactayattccttaaaagataaatgaccctagtccttgccttttcct
>L_5_Et
ttaaaaacaaagcgggagacttccgcttccgggaagatggagtagacgtacttttccctattcctcccgc
taagtacaactaaaaaccctggacattatatataaaacaaacataagaagactctgaaaggtggagagaa

I need to extract the sequences into individual files, with each sequence name used as the file name. The output files should look like this:

HE_ER.fa:

Code:

>HE_ER
tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc
cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca

M7B_Ho_sap.fa:

Code:

>M7B_Ho_sap
tgagaactgcagaactctcggcacagaacaactccatccaaacccctgcactaagagacttgaccaaact
aactagtgtccggctttgtttatctttgaca

LT_H_ss.fa:

Code:

>LT_H_ss
gtgagacaaagtaacaaatgtaagaagccatgtctgctcatttctgcttgccaacataatttcacaaagc
ccctgactctgtgatgacatgcagctctcnagaaagatgctttgaagacaaarcaggatrgagcacacag
ccccccayrtctcttgcctgagtcactayattccttaaaagataaatgaccctagtccttgccttttcct

L_5_Et.fa:

Code:

>L_5_Et
ttaaaaacaaagcgggagacttccgcttccgggaagatggagtagacgtacttttccctattcctcccgc
taagtacaactaaaaaccctggacattatatataaaacaaacataagaagactctgaaaggtggagagaa

I searched for some examples, and so far I have tried:

Code:

awk -v a=">" '{print $0 >> ($1a".fa")}' infile.txt

but it is not working for me. Please help.

Joseph
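
One likely reason the attempt above misbehaves: in awk, `$1a` is the concatenation of `$1` and the variable `a`, so every line is appended to a file named after its own first field followed by `>.fa` (for example `>HE_ER>.fa`). Below is a minimal sketch of one possible approach, not necessarily the only or best one. It assumes each header consists of `>` followed by a name that is safe to use as a file name, and that names are unique within the file. The variable name `out` and the sample input (the first two records from the post) are only for illustration.

```shell
# Sample input: the first two records from the post.
cat > infile.txt <<'EOF'
>HE_ER
tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc
cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca
>M7B_Ho_sap
tgagaactgcagaactctcggcacagaacaactccatccaaacccctgcactaagagacttgaccaaact
aactagtgtccggctttgtttatctttgaca
EOF

awk '
/^>/ {
    # New record: close the previous output file, if any, so that
    # large inputs do not exhaust the open-file limit.
    if (out != "") close(out)
    # Drop the leading ">" from the first field and add the extension,
    # e.g. ">HE_ER" becomes "HE_ER.fa".
    out = substr($1, 2) ".fa"
}
# Write the header and every following sequence line to the current file.
# Lines before the first header (out still empty) are skipped.
out != "" { print > out }
' infile.txt
```

With the sample input this writes HE_ER.fa and M7B_Ho_sap.fa, each holding its header line followed by its sequence lines. If a header name could repeat, `print >> out` would append to the earlier file instead of overwriting it after the `close`.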



