Sponsored Content
Top Forums UNIX for Beginners Questions & Answers How to append two fasta files? Post 303036022 by RudiC on Wednesday 12th of June 2019 06:10:31 PM
Old 06-12-2019
May I paraphrase your request: You want file2's sequences to prevail and file1's sequences inserted only if there's no equivalent in file2. If so, try
Code:
tr $'\n>' $' \n' <file1|  grep -vf <(sed -n 's/>//p' file2) | cat  <(tr $'\n>' $' \n' <file2) - |  tr  -s $' \n' $'\n>'
>Contig_1:90600-91187
GACCGTCATCAATTCCTGTTCCTTGCCCTTGACGACCTCATCCACGTCCTTGATGGCCTT
>Contig_24:26615-28387
TTCGCCGCGCTCCAAACGGGCGATCTCCTCGGCGCGGGCCGCCAGGATCAGCGCCG
>Contig_98:35323-35886
GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG
>

with any length fasta sequences. The traiing ">" can be taken care of piping through | sed '$d' if need be. Please be aware that your desired output's "Contig_98" line is missing an "AG" at the end.

Last edited by RudiC; 06-13-2019 at 05:00 AM..
These 3 Users Gave Thanks to RudiC For This Post:
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

append two files

Hi, I have two files where 1 contains data and the other contains strings eg file 1 -0.00000 0.00000 0.00000 0.00000 0.00000 0.80000 0.50000 0.50000 0.60000 0.50000 0.50000 0.20000 -0.00000 0.00000 0.40000 file 2 F F F F F F T T T T T T T T T How to I append file2 to file 1 to... (1 Reply)
Discussion started by: princessotes
1 Replies

2. UNIX for Dummies Questions & Answers

grep FASTA files

I would like to extract the sequences larger than 10 bases but shorter than 18 along with the identifier from a FASTA file that looks like this: > Seq I ACGACTAGACGATAGACGATAGA > Seq 2 ACGATGACGTAGCAGT > Seq 3 ACGATACGAT I know I can extract the IDs alone with the following code grep... (3 Replies)
Discussion started by: Xterra
3 Replies

3. UNIX for Dummies Questions & Answers

renaming (renumbering) fasta files

I have a fasta file that looks like this: >Noname ACCAAAATAATTCATGATATACTCAGATCCATCTGAGGGTTTCACCACTTGTAGAGCTAT CAGAAGAATGTCAATCAACTGTCCGAGAAAAAAGAATCCCAGG >Noname ACTATAAACCCTATTTCTCTTTCTAAAAATTGAAATATTAAAGAAACTAGCACTAGCCTG ACCTTTAGCCAGACTTCTCACTCTTAATGCTGCGGACAAACAGA ... I want to... (2 Replies)
Discussion started by: Oyster
2 Replies

4. Shell Programming and Scripting

append to two files

I tried to write a script ( not working) to append first value from mylist to a file called my myfirstResult and to another called mysecondResult awk ' {print $1} >> myfirsResult ' < mylist awk ' {print $1} >> mysecondResult ' < mylist $ cat mylist A 02/16/2012 B 02/19/2012 C... (3 Replies)
Discussion started by: Sara_84
3 Replies

5. UNIX for Dummies Questions & Answers

Breaking a fasta formatted file into multiple files containing each gene separately

Hey, I've been trying to break a massive fasta formatted file into files containing each gene separately. Could anyone help me? I've tried to use the following code but i've recieved errors every time: for i in *.rtf.out do awk '/^>/{f=++d".fasta"} {print > $i.out}' $i done (1 Reply)
Discussion started by: Ann Mc Cartney
1 Replies

6. UNIX for Dummies Questions & Answers

Append Files

Hi All, I have to append 2 lines at the end of a text file. If those 2 lines are already there then do not append else append the 2 lines to the text file. Eg: I have a text file, file.txt This text file might look like this, /home/kp/make.jsp /home/pk/model.jsp I have to append... (1 Reply)
Discussion started by: pavan_test
1 Replies

7. UNIX for Dummies Questions & Answers

Append file name to fasta file headers in Linux

How do we append the file name to fasta file headers in multiple fasta-files in Linux? (10 Replies)
Discussion started by: Mauve
10 Replies

8. Shell Programming and Scripting

Unzip all the files with subdirectories present and append a part of string from the main .zip files

Hi frnds, My requirement is I have a zip file with name say eg: test_ABC_UH_ccde2a_awdeaea_20150422.zip within that there are subdirectories on each directory we again have .zip files and in that we have files like mama20150422.gz and so on. Iam in need of a bash script so that it unzips... (0 Replies)
Discussion started by: Ravi Kishore
0 Replies

9. Shell Programming and Scripting

Append string to all the files inside a directory excluding subdirectories and .zip files

Hii, Could someone help me to append string to the starting of all the filenames inside a directory but it should exclude .zip files and subdirectories. Eg. file1: test1.log file2: test2.log file3 test.zip After running the script file1: string_test1.log file2: string_test2.log file3:... (4 Replies)
Discussion started by: Ravi Kishore
4 Replies

10. Shell Programming and Scripting

Help with reformat single-line multi-fasta into multi-line multi-fasta

Input File: >Seq1 ASDADAFASFASFADGSDGFSDFSDFSDFSDFSDFSDFSDFSDFSDFSDFSD >Seq2 SDASDAQEQWEQeqAdfaasd >Seq3 ASDSALGHIUDFJANCAGPATHLACJHPAUTYNJKG ...... Desired Output File >Seq1 ASDADAFASF ASFADGSDGF SDFSDFSDFS DFSDFSDFSD FSDFSDFSDF SD >Seq2 (4 Replies)
Discussion started by: patrick87
4 Replies
diff3(1)						      General Commands Manual							  diff3(1)

Name
       diff3 - 3-way differential file comparison

Syntax
       diff3 [-ex3] file1 file2 file3

Description
       The command compares three versions of a file, and publishes the ranges of text that disagree, flagged with the following codes:

	  ====	      all three files differ

	  ====1       file1 is different

	  ====2       file2 is different

	  ====3       file3 is different

       The type of change needed to convert a given range of a given file to some other is indicated in one of these ways:

	  f : n1 a    Text is to be appended after line number n1 in file f, where f = 1, 2, or 3.

	  f : n1 , n2 c
		      Text is to be changed in the range line n1 to line n2.  If n1 = n2, the range may be abbreviated to n1.

       The original contents of the range follows immediately after a c indication.  When the contents of two files are identical, the contents of
       the lower-numbered file is suppressed.

Options
       -3   Produces an editor script containing the changes between file1 and file2 that are to be incorporated into file3.

       -e	   Produces an editor script containing the changes between file2 and file3 that are to be incorporated into file1.

       -x	   Produces an editor script containing the changes among all three files.

Examples
       Under the -e option, publishes a script for the editor that incorporates into file1 all changes between file2 and  file3  -  that  is,  the
       changes	that would normally be flagged ==== and ====3.	Option -x (-3) produces a script to incorporate only changes flagged ==== (====3).
       The following command applies the resulting script to `file1':
       (cat script; echo '1,$p') | ed - file1

Restrictions
       Text lines that consist of a single `.'	defeat -e.

Files
       /tmp/d3?????
       /usr/lib/diff3

See Also
       cmp(1), comm(1), diff(1), dffmk(1), join(1), sccsdiff(1), uniq(1)

																	  diff3(1)
All times are GMT -4. The time now is 04:50 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy