Full Discussion: Fasta header modification
Post 302922947 by Akshay Hegde on Wednesday 29th of October 2014 11:15:50 AM
Code:
akshay@nio:/tmp$ cat file1
>contig0001 length=11115 numreads=10777
agatgtagatctct
>contig0002 length=23412 numreads=2345
atcgtcat

Code:
akshay@nio:/tmp$ cat file2
1 contig0001 11115 20.5
2 contig0002 23412 13.5

Code:
akshay@nio:/tmp$ awk -F'[ =]' 'FNR==NR{A[$3]=$4;next}/^>/ && $3 in A{$0 = $1 OFS $3 OFS "[cov="A[$3]"]"}1' OFS="_" file2 file1
>contig0001_11115_[cov=20.5]
agatgtagatctct
>contig0002_23412_[cov=13.5]
atcgtcat
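
The one-liner above splits each line on spaces and `=`, stores the coverage from file2 keyed by its third column (the contig length), and then looks that length up when it meets a header line in file1. That works for the sample data, but two contigs with the same length would overwrite each other in the array. A possible variant, assuming file2 always has the columns index, contig name, length, coverage, is to key on the contig name instead. The function name `annotate_headers` and the array name `cov` are only illustrative:

```shell
# Sketch only: join on the contig name rather than on the length.
# Assumes file2 columns are: index, contig name, length, coverage.
annotate_headers() {
    # $1 = coverage table (file2), $2 = FASTA file (file1)
    awk -F'[ =]' '
        FNR == NR { cov[$2] = $4; next }   # first file: contig name -> coverage
        /^>/ {
            name = substr($1, 2)           # header field 1 without the leading ">"
            if (name in cov)
                $0 = $1 "_" $3 "_[cov=" cov[name] "]"
        }
        { print }                          # print headers and sequence lines
    ' "$1" "$2"
}
```

Called as `annotate_headers file2 file1`, this should print the same headers as the one-liner for the sample files. Headers whose contig name is missing from file2 are passed through unchanged.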

 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.