The UNIX and Linux Forums  


Go Back   The UNIX and Linux Forums > Top Forums > Shell Programming and Scripting
.
google unix.com




View Single Post in the UNIX and Linux Forums - Click on the Thread or Permalink to View Entire Thread -->
  #1 (permalink)  
Old 11-07-2005
mskcc mskcc is offline
Registered User
  
 

Join Date: Jul 2005
Posts: 37
can this been solved with awk and sed?

Hi Masters,


Code:
___________________________________________________________________________________
Group of orthologs #1. Best score 3010 bits
Score difference with first non-orthologous sequence - yeast:3010   human:2754
YHR165C             	100.00%		PRP8_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #2. Best score 2100 bits
Score difference with first non-orthologous sequence - yeast:2033   human:1978
YLR106C             	100.00%		MDN1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #3. Best score 2082 bits
Score difference with first non-orthologous sequence - yeast:997   human:593
YJL130C             	100.00%		PYR1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #4. Best score 1959 bits
Score difference with first non-orthologous sequence - yeast:1959   human:1007
YKR054C             	100.00%		DYHC_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #5. Best score 1855 bits
Score difference with first non-orthologous sequence - yeast:1855   human:1022
YNR016C             	100.00%		Q6KE87_HUMAN        	100.00%
YMR207C             	19.86%		COA2_HUMAN          	90.52%
                    	       		COA1_HUMAN          	53.30%
___________________________________________________________________________________
Group of orthologs #6. Best score 1838 bits
Score difference with first non-orthologous sequence - yeast:1748   human:1767
YDL140C             	100.00%		RPB1_HUMAN          	100.00%
___________________________________________________________________________________
Group of orthologs #7. Best score 1768 bits
Score difference with first non-orthologous sequence - yeast:1768   human:1636
YJR066W             	100.00%		Q4LE76_HUMAN        	100.00%
YKL203C             	49.22%

Above records are part of a file. What I need to do is to extract the information from this file and put them into a speadsheet format, Like this:(examples from #5 and #7 above)

Group_number; Best_Score; S_one; P_one; S_two; P_two
5;1855;YNR016C;100.00%;Q6KE87_HUMAN;100.00%
5;1855;YMR207C;19.86%;COA2_HUMAN;90.52%
5;1855;;;COA1_HUMAN;53.30%
7;1768;YJR066W;100.00%;Q4LE76_HUMAN;100.00%
7;1768;YKL203C;49%;;

Thanks in Advance!

Last edited by Perderabo; 11-08-2005 at 11:41 AM.. Reason: Add code tags and disable smilies for readability