Split file into multiple files


 
# 1  
Old 08-23-2010

Hi

I have a file that contains multiple sequences; each sequence name is on a line starting with '>'. It looks like this:
infile.txt:
Code:
>HE_ER
tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc
cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca
>M7B_Ho_sap
tgagaactgcagaactctcggcacagaacaactccatccaaacccctgcactaagagacttgaccaaact
aactagtgtccggctttgtttatctttgaca
>LT_H_ss
gtgagacaaagtaacaaatgtaagaagccatgtctgctcatttctgcttgccaacataatttcacaaagc
ccctgactctgtgatgacatgcagctctcnagaaagatgctttgaagacaaarcaggatrgagcacacag
ccccccayrtctcttgcctgagtcactayattccttaaaagataaatgaccctagtccttgccttttcct
>L_5_Et
ttaaaaacaaagcgggagacttccgcttccgggaagatggagtagacgtacttttccctattcctcccgc
taagtacaactaaaaaccctggacattatatataaaacaaacataagaagactctgaaaggtggagagaa

I need to extract the sequences into individual files, using each sequence name as the file name. The output files should look like this:

HE_ER.fa:
Code:
>HE_ER
tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc
cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca

M7B_Ho_sap.fa:
Code:
>M7B_Ho_sap
tgagaactgcagaactctcggcacagaacaactccatccaaacccctgcactaagagacttgaccaaact
aactagtgtccggctttgtttatctttgaca

LT_H_ss.fa:
Code:
>LT_H_ss
gtgagacaaagtaacaaatgtaagaagccatgtctgctcatttctgcttgccaacataatttcacaaagc
ccctgactctgtgatgacatgcagctctcnagaaagatgctttgaagacaaarcaggatrgagcacacag
ccccccayrtctcttgcctgagtcactayattccttaaaagataaatgaccctagtccttgccttttcct

L_5_Et.fa:
Code:
>L_5_Et
ttaaaaacaaagcgggagacttccgcttccgggaagatggagtagacgtacttttccctattcctcccgc
taagtacaactaaaaaccctggacattatatataaaacaaacataagaagactctgaaaggtggagagaa

I searched for some examples and so far I tried:

Code:
awk -v a=">" '{print $0 >> ($1a".fa")}' infile.txt

but it is not working for me. Please help.

Joseph
# 2  
Old 08-23-2010
Code:
awk ' /^>/ { file=substr($0,2)} { print $0 > file} ' inputfile
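
A note on why the attempt from post #1 misfires, as far as I can tell: in `$1a".fa"` awk concatenates the first field, the variable `a` (">") and ".fa". The lines in this file contain no whitespace, so `$1` is the whole line, and each line is appended to a file named after itself, e.g. `>HE_ER>.fa`. The sketch below only prints the file names that expression builds; the two-record input and the name `demo_infile.txt` are made up for illustration.

```shell
# Work in a scratch directory so the demo file does not clutter anything.
cd "$(mktemp -d)"

# Made-up two-record input (not the poster's data).
printf '>A\nacgt\n>B\nttga\n' > demo_infile.txt

# Print, for each line, the file name the original expression builds:
# first field, then a (">"), then ".fa".
names=$(awk -v a=">" '{ print $1 a ".fa" }' demo_infile.txt)
printf '%s\n' "$names"
```

Every line gets a different target name, which is why nothing ends up grouped per sequence. Remembering the current file name in a variable on each '>' line, as the one-liner above does, avoids that.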

# 3  
Old 08-23-2010
Quote:
Originally Posted by jim mcnamara
Code:
awk ' /^>/ { file=substr($0,2)} { print $0 > file} ' inputfile

Code:
awk -F \> ' /^>/ { file=$2 ".fa"} { print $0 > file }' inputfile
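
One thing to watch with larger inputs: the one-liners above keep every output file open until awk exits, and some awk implementations stop with a "too many open files" error once there are many sequences. A possible variant, sketched under the assumption that each sequence name appears only once and contains no '/', closes the previous file before starting the next. It also uses only the first word of the header as the name, in case headers carry a description. The demo input, the file name `demo_multi.txt` and the variable names `out` and `name` are mine.

```shell
# Work in a scratch directory so the demo files do not clutter anything.
cd "$(mktemp -d)"

# Made-up input; the first header carries a description after the name.
printf '>A some description\nacgt\n>B\nttga\ncc\n' > demo_multi.txt

awk '
  /^>/ {
    if (out != "") close(out)       # done with the previous sequence
    name = substr($1, 2)            # first word of the header without ">"
    out = name ".fa"
  }
  out != "" { print > out }         # lines before the first header are skipped
' demo_multi.txt
```

This writes `A.fa` and `B.fa` into the current directory. If a name can occur more than once, `>>` instead of `>` would append to the existing file rather than overwrite it, at the cost of output accumulating across repeated runs.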


Last edited by rdcwayx; 08-23-2010 at 10:46 PM..