Help with reformatting single-line multi-FASTA into multi-line multi-FASTA


 
# 1  
Old 06-13-2018

Input File:
Code:
>Seq1
ASDADAFASFASFADGSDGFSDFSDFSDFSDFSDFSDFSDFSDFSDFSDFSD
>Seq2
SDASDAQEQWEQeqAdfaasd
>Seq3
ASDSALGHIUDFJANCAGPATHLACJHPAUTYNJKG
......

Desired Output File
Code:
>Seq1
ASDADAFASF
ASFADGSDGF
SDFSDFSDFS
DFSDFSDFSD
FSDFSDFSDF
SD
>Seq2
SDASDAQEQW
EQeqAdfaas
d
>Seq3
ASDSALGHIU
DFJANCAGPA
THLACJHPAU
TYNJKG
......

Any idea how to reformat a single-line multi-FASTA file into a multi-FASTA file with 10 characters per sequence line?

Thanks for any advice.
# 2  
Old 06-13-2018
Did you consider
Code:
fold -b10 file
>Seq1
ASDADAFASF
ASFADGSDGF
SDFSDFSDFS
DFSDFSDFSD
FSDFSDFSDF
SD
>Seq2
SDASDAQEQW
EQeqAdfaas
d
>Seq3
ASDSALGHIU
DFJANCAGPA
THLACJHPAU
TYNJKG

# 3  
Old 06-14-2018
Hi RudiC,

It doesn't seem to work when my data is huge, e.g.:
Input file
Code:
cat input_file
>scaffold1|size134247
TCTCTGTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG

Output file
Code:
fold -b50 input_file
>scaffold1|size134247
TCTCTGTCTCTCTCTCTCTCTCTCTCTC
TCTCTCTCTCTCTCTCTCTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG

Desired Output file
Code:
>scaffold1|size134247
TCTCTGTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTG
TGTGTGTGTGTGTGTGTGTGTGTGTGTG

I believe the fold command takes the header into account too?
I want to wrap only the sequence lines of a single-line multi-FASTA file (excluding the > headers) at a fixed width, e.g. 10 characters per line.

Thanks for any advice.
# 4  
Old 06-14-2018
Try (GNU sed):
Code:
sed '/^>/!s/.\{10\}/&\n/g' infile

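Regular (POSIX) sed, where the newline in the replacement is written as a backslash followed by a literal newline: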
Code:
sed '/^>/!s/.\{10\}/&\
/g' infile
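
One caveat with both sed versions: when a sequence length is an exact multiple of 10, a newline is also added after the last chunk, which leaves an empty line in the output. A minimal awk sketch that avoids this and still skips the > headers might look like the following (wrap_fasta is just a name made up for this example, and the width is passed as its first argument):

```shell
# wrap_fasta WIDTH [FILE...]
# Hypothetical helper: wraps sequence lines to WIDTH characters and
# leaves header lines (those starting with ">") untouched.
wrap_fasta() {
    width=$1
    shift
    awk -v w="$width" '
        /^>/ { print; next }                 # header: print unchanged
        {
            # print the sequence in chunks of w characters;
            # the last chunk may be shorter, and no empty line is added
            for (i = 1; i <= length($0); i += w)
                print substr($0, i, w)
        }
    ' "$@"
}

# Example with part of the sample data from post #1
printf '%s\n' '>Seq1' 'ASDADAFASFASFADGSDGF' '>Seq2' 'SDASDAQEQWEQeqAdfaasd' |
    wrap_fasta 10
```

Empty lines in the input are dropped by this sketch, since the loop prints nothing for them.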

# 5  
Old 06-15-2018
Also
Code:
fold -b -w 50 input_file

works on many *nix systems and should not have a problem with large files.
If fold does hit a 2 GB limit (which happens when it was compiled as 32-bit without large-file support), it can help to let the shell open the file and pass it on standard input (assuming the shell itself is 64-bit or built with large-file support):
Code:
fold -b -w 50 < input_file
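
To check that wrapping a large file did not change the sequences, one option is to join the wrapped lines again and compare the result against the original. A rough sketch, assuming the input really has one sequence line per header and no blank lines (unwrap_fasta is a hypothetical helper name, and the temporary file only stands in for input_file):

```shell
# unwrap_fasta [FILE...]
# Hypothetical helper: joins wrapped sequence lines back into a single
# line per record, printing header lines (">...") unchanged.
unwrap_fasta() {
    awk '
        /^>/ { if (seq != "") print seq; seq = ""; print; next }
        { seq = seq $0 }                     # append this chunk to the record
        END { if (seq != "") print seq }     # flush the last record
    ' "$@"
}

# Small stand-in for input_file, using the example from post #3
tmp=$(mktemp)
printf '%s\n' '>scaffold1|size134247' \
    'TCTCTGTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG' > "$tmp"

# Wrap, unwrap again, and compare byte-for-byte with the original
if fold -b -w 50 < "$tmp" | unwrap_fasta | cmp -s - "$tmp"; then
    echo "sequences unchanged"
else
    echo "sequences differ"
fi
rm -f "$tmp"
```

A header longer than the chosen width would be folded as well, and this comparison would then report a difference.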
