Can this be solved with awk and sed? Post: 88632

Post 88632 by mskcc on Monday 7th of November 2005 01:04:40 PM

Hi Masters,

Code:

___________________________________________________________________________________
Group of orthologs #1. Best score 3010 bits
Score difference with first non-orthologous sequence - yeast:3010   human:2754
YHR165C             	100.00%		PRP8_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #2. Best score 2100 bits
Score difference with first non-orthologous sequence - yeast:2033   human:1978
YLR106C             	100.00%		MDN1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #3. Best score 2082 bits
Score difference with first non-orthologous sequence - yeast:997   human:593
YJL130C             	100.00%		PYR1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #4. Best score 1959 bits
Score difference with first non-orthologous sequence - yeast:1959   human:1007
YKR054C             	100.00%		DYHC_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #5. Best score 1855 bits
Score difference with first non-orthologous sequence - yeast:1855   human:1022
YNR016C             	100.00%		Q6KE87_HUMAN        	100.00%
YMR207C             	19.86%		COA2_HUMAN          	90.52%
                    	       		COA1_HUMAN          	53.30%
___________________________________________________________________________________
Group of orthologs #6. Best score 1838 bits
Score difference with first non-orthologous sequence - yeast:1748   human:1767
YDL140C             	100.00%		RPB1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #7. Best score 1768 bits
Score difference with first non-orthologous sequence - yeast:1768   human:1636
YJR066W             	100.00%		Q4LE76_HUMAN        	100.00%
YKL203C             	49.22%

The records above are part of a file. I need to extract the information from this file and put it into a spreadsheet format, like this (examples from #5 and #7 above):

Group_number; Best_Score; S_one; P_one; S_two; P_two
5;1855;YNR016C;100.00%;Q6KE87_HUMAN;100.00%
5;1855;YMR207C;19.86%;COA2_HUMAN;90.52%
5;1855;;;COA1_HUMAN;53.30%
7;1768;YJR066W;100.00%;Q4LE76_HUMAN;100.00%
7;1768;YKL203C;49.22%;;

Thanks in Advance!
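
One possible way to do this with awk alone is sketched below. It is not a tested solution for the full file. It assumes the member lines are tab-separated exactly as in the sample (name, percentage, an empty column, name, percentage, with space padding after the names), and the function name orthologs_to_csv is made up for illustration. The group number and best score are remembered from each "Group of orthologs" line and repeated on every member line that follows.

```shell
#!/bin/sh
# Sketch only: orthologs_to_csv is a hypothetical helper name, and the
# member lines are assumed to be tab-separated with space padding.
orthologs_to_csv() {
    awk -F'\t' '
    BEGIN { print "Group_number; Best_Score; S_one; P_one; S_two; P_two" }

    # Header line, e.g. "Group of orthologs #5. Best score 1855 bits"
    /^Group of orthologs #/ {
        split($0, w, " ")          # split on runs of whitespace
        grp = w[4]
        gsub(/[#.]/, "", grp)      # "#5." -> "5"
        score = w[7]
        next
    }

    # Skip separator lines, score-difference lines and blank lines
    /^_+$/ || /^Score difference/ || /^[ \t]*$/ { next }

    # Member line: columns 1 and 2 are the first species, 4 and 5 the second
    {
        for (i = 1; i <= 5; i++) {
            f[i] = $i
            gsub(/^ +| +$/, "", f[i])   # trim the space padding
        }
        print grp ";" score ";" f[1] ";" f[2] ";" f[4] ";" f[5]
    }
    ' "$@"
}
```

It could be called as orthologs_to_csv input.txt > output.csv, where both file names are placeholders. If the real file pads the columns with spaces instead of tabs, a line with an empty left-hand pair (like the COA1_HUMAN line in group #5) can no longer be told apart by field splitting, and the columns would have to be cut by character position instead.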

Last edited by Perderabo; 11-08-2005 at 11:41 AM.. Reason: Add code tags and disable smilies for readability

mskcc
