How to append two fasta files?
Post 303036081 by Scrutinizer on Thursday 13th of June 2019 05:03:02 PM
Post #7 probably did not work for you because the file samples contain excess trailing spaces that need to be removed. Those spaces will not be present in the actual FASTA files; see the note underneath...

--
Here they are without the spaces:
File1:
Code:
>Contig_1:90600-91187
GACCGTCATCAATTCCTGTTCCTTGCCCTTGACGACCTCATCCACGTCCTTGATGGCCTT
>Contig_24:26615-28387
TTCGCCGCGCTCCAAACGGGCGATCTCCTCGGCGCGGGCCGCCAGGATCAGCGCCG
>Contig_98:35323-35886
GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGA

File2:
Code:
>Contig_1:90600-91187
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGGAATTGATGACGGTC
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGG
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGA
>Contig_24:26615-28387
GCTGCGGCGCTGATCCTGGCGGCCCGCGCCGAGGAGATCGCCCGTTTGGAGCGCGGCGAA
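
If your own copies of the samples still carry trailing spaces, one way to strip them is a small sed wrapper. This is only a sketch: `strip_trailing` is a hypothetical helper name, and it assumes that whitespace at the end of a line is never meaningful in your files.

```shell
# Hypothetical helper: remove trailing whitespace from every line.
# Reads the files given as arguments, or standard input if none are given.
strip_trailing() {
  sed 's/[[:space:]]*$//' "$@"
}
```

For example, `strip_trailing File1 > File1.clean` would write a cleaned copy and leave the original untouched.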

Of course, another thing is that the output order gets mixed up, because the order in which awk traverses an array is undefined. That could easily be fixed if need be.
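
One possible way to keep the original order is to record each header in a second, numerically indexed array the first time it is seen, and to loop over that index when printing. The sketch below is not the code from post #7. It assumes the goal is to print, for each header, the sequence lines from the first file followed by those from the second, and `merge_fasta` is a hypothetical name.

```shell
# Hypothetical helper: merge FASTA records that share a header,
# printing the records in the order their headers were first seen.
merge_fasta() {
  awk '
    { sub(/[ \t\r]+$/, "") }            # drop trailing spaces, tabs and CRs
    /^>/ {                              # header line
      hdr = $0
      if (!(hdr in seq)) {              # first occurrence: remember its position
        order[++n] = hdr
        seq[hdr] = ""
      }
      next
    }
    hdr != "" && length($0) {           # sequence line belonging to a header
      seq[hdr] = seq[hdr] $0 "\n"
    }
    END {
      for (i = 1; i <= n; i++)          # numeric loop gives a defined order
        printf "%s\n%s", order[i], seq[order[i]]
    }
  ' "$@"
}
```

It could be called as `merge_fasta File1 File2 > merged.fasta`. Headers that occur only in one file (such as Contig_98 in the samples) are printed with just the lines they have.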

Last edited by Scrutinizer; 06-13-2019 at 06:12 PM..
