How to append two fasta files? Post 303036080 by MadeInGermany on Thursday 13th of June 2019 04:57:26 PM
Post #5 is interesting.
I found it can be improved with a ^ anchor (match at the beginning of the line only): the sed command replaces the leading > of each file2 header with a ^ character, which then acts as an anchor in the patterns for the following grep.
Code:
tr $'\n>' $' \n' <file1 | grep -vf <(sed -n 's/^>/^/p' file2) | cat <(tr $'\n>' $' \n' <file2) - | tr -s $' \n' $'\n>'
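
To see why the anchor matters, here is a minimal sketch on made-up data. The records id1 and xid1 and the scratch file names are assumptions for illustration only, not taken from the thread. It flattens a small file the same way as the pipeline above and compares an unanchored pattern with an anchored one. It uses the plain '\n' escapes that tr understands itself, so it does not need bash's $'...' quoting.

```shell
# Hypothetical sample data in a scratch directory (names are assumptions).
dir=$(mktemp -d)
printf '>id1\nACGT\n>xid1\nGGCC\n' > "$dir/demo.fa"

# Flatten: newline -> space, '>' -> newline, so each record becomes
# one line that starts with its header.
tr '\n>' ' \n' < "$dir/demo.fa" > "$dir/demo.flat"

# Unanchored pattern: "id1" also matches the record "xid1".
unanchored=$(grep -v 'id1' "$dir/demo.flat" | awk 'NF { n++ } END { print n + 0 }')

# Anchored pattern: only records whose header starts with "id1" are dropped.
anchored=$(grep -v '^id1' "$dir/demo.flat" | awk 'NF { n++ } END { print n + 0 }')

echo "records kept, unanchored: $unanchored"
echo "records kept, anchored:   $anchored"
```

The unanchored pattern removes both records, the anchored one keeps xid1. Note that the anchored pattern is still a prefix match, so ^id1 would also match a header such as id10.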

Post #7 did not work for me; I did not analyze it further.
Here is a multi-line awk script, embedded in a shell script, that works for me:
Code:
#!/bin/sh
awk '
{
  ishead=($1~/^>/)
}
FNR==NR {
  # file1: store everything in F[head]
  if (ishead) {
     head=$1
     sep=""
  } else {
     F[head]=(F[head] sep $1)
     sep=ORS
  }
  next
}
{
  # file2: print everything, delete corresponding F[head] from file1
  if (ishead) {
    delete F[$1]
  }
  print
}
END {
  # print the remaining F[head] from file1
  for (head in F) {
    print head
    print F[head]
  }
}
' file1 file2

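I chose $1 rather than $0 because I found some trailing spaces in the given examples; using $1 strips them. Note that $1 is only the first whitespace-separated field, so anything after the first space on a file1 line (for example a description after the header ID) is dropped as well.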
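
A small worked run may help. This is only a sketch on made-up input: the two sample files are assumptions, and the awk program is a condensed copy of the script above. The >a header in file1 carries two trailing spaces to show that comparing on $1 still matches it against >a in file2.

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf '>a  \nAAAA\n>b\nCC\nCC\n' > "$dir/file1"   # ">a" has trailing spaces
printf '>a\nTTTT\n' > "$dir/file2"

merged=$(awk '
{ ishead = ($1 ~ /^>/) }
FNR == NR {                      # file1: collect sequence lines per header
  if (ishead) { head = $1; sep = "" }
  else        { F[head] = (F[head] sep $1); sep = ORS }
  next
}
{                                # file2: print as is, drop the file1 copy
  if (ishead) delete F[$1]
  print
}
END {                            # what is left exists only in file1
  for (head in F) { print head; print F[head] }
}
' "$dir/file1" "$dir/file2")

printf '%s\n' "$merged"
```

This prints the >a record from file2 (TTTT) followed by the >b record from file1; the >a record from file1 is discarded despite its trailing spaces. With more than one leftover record the order of the for (head in F) loop is unspecified in awk, so the file1 records may come out in a different order than they went in.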
 
