Help making simple perl or bash script to create a simple matrix

#1  04-25-2012  torchij
Registered User
 
Join Date: Apr 2012
Last Activity: 5 February 2017, 4:01 PM EST
Posts: 71
Thanks: 23
Thanked 1 Time in 1 Post

Hello all!
This is my first post and I'm very new to programming. I would like help creating a simple perl or bash script that I will be using in my work as a junior bioinformatician.

Essentially, I would like to take a tab-delimited or .csv text file with 3 columns and write them out as a "3D" matrix:

Input:



Code:
gene   sample   identifier
a         1        @
b         2        #
c         3,4      @
d         5        %  
d         5        *

Output:

Code:
          
        1         2          3         4         5

a       @        

b                 #

c                            @         @

d                                                %,*

Is this easy to do?
Thanks in advance

Jonathon

Last edited by torchij; 04-25-2012 at 01:06 PM..
#2  04-25-2012  neutronscott (Forum Advisor)
script kiddie
 
Join Date: Jun 2011
Last Activity: 19 September 2017, 8:22 PM EDT
Location: South Carolina, USA
Posts: 941
Thanks: 31
Thanked 303 Times in 281 Posts
Quote:
Originally Posted by torchij
Is this easy to do?
Not really, no. Is this the actual input? Are there any bounds on these rows/columns? Are they always a single letter / single number?
#3  04-25-2012  torchij
Sorry, I just realized the formatting of my message got messed up when I posted. I'm trying to put it in a table now. Or is there a quick way to attach an example Excel file?
Jon
#4  04-25-2012  neutronscott
Not sure about Excel. Just select the area and click "CODE".

Is the input file actually an .xls? You'll need this conversion to work with standard Unix utilities, right? I've already got it mocked up in awk, assuming space or tab delimiters.
#5  04-25-2012  torchij
I updated my post; it should be clearer now.
The input will be a tab-delimited text file saved from Excel.
Thank you so much for replying so fast. I'm loving this community already.
#6  04-25-2012  neutronscott

Code:
$ ./gene  input
gene      1         2         3         4         5
a         @
b                   #
c                             @         @
d                                                 %,*

It's an awk script.

Code:
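#!/usr/bin/awk -f

# skip the header row, if present (assumes it starts with the literal word "gene")
NR == 1 && $1 == "gene" { next }

# keep list of unique genes, in order of first appearance
!($1 in genes_uniq) { genes_uniq[$1]; genes[gene_idx++] = $1 }

{
        # the sample field may hold several comma-separated numbers, e.g. "3,4"
        n = split($2, cols, /,/)
        for (k = 1; k <= n; k++) {
                col = cols[k] + 0       # force a numeric value for the comparison
                if (col > max_col) max_col = col
                # append the identifier; the leading comma is stripped when printing
                matrix[$1, col] = matrix[$1, col] "," $3
        }
}

END {
        # print header
        printf("%-9s ", "gene")
        for (col = 1; col <= max_col; col++)
                printf("%-9d ", col)
        printf("\n")

        # one row per gene; substr(..., 2) drops the leading comma
        for (i = 0; i < gene_idx; i++) {
                printf("%-9s ", genes[i])
                for (col = 1; col <= max_col; col++)
                        printf("%-9s ", substr(matrix[genes[i], col], 2))
                printf("\n")
        }
}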
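
One practical note on the input: the file is described above as tab-delimited text saved from Excel, and such exports may carry Windows-style CRLF line endings, depending on the platform. In that case a stray carriage return would end up glued to the identifier in the last column. A minimal sketch of a cleanup step, assuming CRLF endings and using hypothetical file names (`excel_export.txt`, `input.txt`):

```shell
# Hypothetical file names; the two sample rows mirror the input from post #1,
# written here with CRLF line endings to imitate an Excel export.
printf 'a\t1\t@\r\nb\t2\t#\r\n' > excel_export.txt

# Delete every carriage return so each line ends in a plain newline.
tr -d '\r' < excel_export.txt > input.txt
```

If the file already has plain Unix line endings, this step changes nothing.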

#7  04-25-2012  torchij
I'm assuming it can be run as


Code:
$ awk -f code.awk input.txt > output.txt

?
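
For what it's worth, both invocations seen in this thread should work for an awk script that starts with a `#!/usr/bin/awk -f` line. A minimal sketch of the two options follows; the one-line `code.awk` here is only a stand-in for illustration (substitute the real script from post #6), and the file names are assumptions:

```shell
# Stand-in script for illustration only; it just prints the first field.
cat > code.awk <<'EOF'
#!/usr/bin/awk -f
{ print $1 }
EOF
printf 'a\t1\t@\nb\t2\t#\n' > input.txt

# Option 1: hand the script to awk explicitly; no execute permission needed.
awk -f code.awk input.txt > output.txt

# Option 2: make the script executable and let its "#!" line pick the interpreter.
# This assumes awk really is installed at /usr/bin/awk on your system.
chmod +x code.awk
./code.awk input.txt > output2.txt
```

Option 1 is the safer choice when you are unsure where awk is installed.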