Post 88632 by mskcc on Monday 7th of November 2005 01:04:40 PM
Can this be solved with awk and sed?

Hi Masters,

Code:
___________________________________________________________________________________
Group of orthologs #1. Best score 3010 bits
Score difference with first non-orthologous sequence - yeast:3010   human:2754
YHR165C             	100.00%		PRP8_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #2. Best score 2100 bits
Score difference with first non-orthologous sequence - yeast:2033   human:1978
YLR106C             	100.00%		MDN1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #3. Best score 2082 bits
Score difference with first non-orthologous sequence - yeast:997   human:593
YJL130C             	100.00%		PYR1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #4. Best score 1959 bits
Score difference with first non-orthologous sequence - yeast:1959   human:1007
YKR054C             	100.00%		DYHC_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #5. Best score 1855 bits
Score difference with first non-orthologous sequence - yeast:1855   human:1022
YNR016C             	100.00%		Q6KE87_HUMAN        	100.00%
YMR207C             	19.86%		COA2_HUMAN          	90.52%
                    	       		COA1_HUMAN          	53.30%
___________________________________________________________________________________
Group of orthologs #6. Best score 1838 bits
Score difference with first non-orthologous sequence - yeast:1748   human:1767
YDL140C             	100.00%		RPB1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #7. Best score 1768 bits
Score difference with first non-orthologous sequence - yeast:1768   human:1636
YJR066W             	100.00%		Q4LE76_HUMAN        	100.00%
YKL203C             	49.22%

The records above are part of a file. I need to extract the information from this file and put it into a spreadsheet-like format, like this (examples from groups #5 and #7 above):

Group_number; Best_Score; S_one; P_one; S_two; P_two
5;1855;YNR016C;100.00%;Q6KE87_HUMAN;100.00%
5;1855;YMR207C;19.86%;COA2_HUMAN;90.52%
5;1855;;;COA1_HUMAN;53.30%
7;1768;YJR066W;100.00%;Q4LE76_HUMAN;100.00%
7;1768;YKL203C;49.22%;;

Thanks in advance!

Last edited by Perderabo; 11-08-2005 at 11:41 AM.. Reason: Add code tags and disable smilies for readability
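
One possible approach, sketched in Python rather than awk or sed: remember the group number and best score from each "Group of orthologs" line, skip the separator and "Score difference" lines, and emit one semicolon-separated row per data line. This is only a sketch under stated assumptions. The names `orthologs_to_rows`, `GROUP_RE`, `HEADER` and `SAMPLE` are made up for illustration, sequence names are assumed to contain no whitespace, and a data line that starts with whitespace is assumed to carry only the human columns. Percentages are copied from the input unchanged.

```python
import re

# Matches e.g. "Group of orthologs #5. Best score 1855 bits"
GROUP_RE = re.compile(r"Group of orthologs #(\d+)\. Best score (\d+) bits")
HEADER = "Group_number; Best_Score; S_one; P_one; S_two; P_two"


def orthologs_to_rows(text):
    """Return one 'group;score;S_one;P_one;S_two;P_two' string per data line."""
    rows = []
    group = score = None
    for line in text.splitlines():
        match = GROUP_RE.search(line)
        if match:
            # Remember the current group so following data lines can reuse it.
            group, score = match.groups()
            continue
        stripped = line.strip()
        # Skip blank lines, underscore separators, score-difference lines,
        # and anything that appears before the first group header.
        if (group is None or not stripped
                or stripped.startswith("Score difference")
                or set(stripped) == {"_"}):
            continue
        tokens = line.split()
        if len(tokens) == 4:
            s_one, p_one, s_two, p_two = tokens
        elif len(tokens) == 2 and line[0].isspace():
            # Assumption: an indented line holds only the human columns.
            s_one, p_one = "", ""
            s_two, p_two = tokens
        elif len(tokens) == 2:
            # Only the yeast columns are present.
            s_one, p_one = tokens
            s_two, p_two = "", ""
        else:
            continue  # unrecognised layout, ignored in this sketch
        rows.append(";".join([group, score, s_one, p_one, s_two, p_two]))
    return rows


# Groups #5 and #7 from the post, with tabs written out explicitly.
SAMPLE = (
    "________\n"
    "Group of orthologs #5. Best score 1855 bits\n"
    "Score difference with first non-orthologous sequence - yeast:1855   human:1022\n"
    "YNR016C             \t100.00%\t\tQ6KE87_HUMAN        \t100.00%\n"
    "YMR207C             \t19.86%\t\tCOA2_HUMAN          \t90.52%\n"
    "                    \t       \t\tCOA1_HUMAN          \t53.30%\n"
    "________\n"
    "Group of orthologs #7. Best score 1768 bits\n"
    "Score difference with first non-orthologous sequence - yeast:1768   human:1636\n"
    "YJR066W             \t100.00%\t\tQ4LE76_HUMAN        \t100.00%\n"
    "YKL203C             \t49.22%\n"
)

if __name__ == "__main__":
    print(HEADER)
    for row in orthologs_to_rows(SAMPLE):
        print(row)
```

To run it on a real file, the text could be read with `open(path).read()` and passed to `orthologs_to_rows`. Splitting on any whitespace instead of on tabs keeps the sketch working whether the columns are padded with tabs or spaces, at the cost of relying on the leading-whitespace rule for rows with an empty yeast side.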
 
