Creating a loop for multiplying columns


 
# 1  
Old 10-04-2013

I have two files that look like this:

Code:
ID SNP1 SNP2 SNP3 SNP4
A1 1 2 0 2 
A2 2 0 1 1
A3 0 2 NA 1
A4 1 1 0 2

and this:
Code:
SNP score
SNP1 0.5
SNP2 0.7
SNP3 0.8
SNP4 0.2

All of the SNP values are 0, 1, 2 or NA, and each SNP has a score, listed in the second file. There are approx. 70 SNPs in total, so file 1 has 71 columns and approx. 1000 rows, and file 2 has 2 columns and 71 rows.
I would like a script that, for each row in file 1, takes the value of SNP1 and multiplies it by the score of SNP1, creating a new column for this product (and likewise for every other SNP), like this:

Code:
ID SNP1 SNP2 SNP3 SNP4 SNP1xscoreSNP1 SNP2xscoreSNP2 etc
A1 1 2 0 2 0.5 1.4 0 0.4
A2 2 0 1 1 1 0 0.8 0.2
A3 0 2 NA 1 0 1.4 NA 0.2

The NA is an indicator of missing data and can be changed into a numeric value if needed, but it needs to be clearly distinct from 0, so maybe -9 would work.

I've tried creating a new file which looks like this:
Code:
ID SNP1 scoreSNP1 SNP2 ScoreSNP2 etc
A1 1 0.5 2 0.7 0 0.8 2 0.2
A2 2 0.5 0 0.7 1 0.8 1 0.2

and running this awk script:
Code:
awk '{for(a=2;a<=5;a+2)
print $0,$(6+a)=$a*(a+1)}' test > test1

but it doesn't work: it only gives one output column, which is usually $2*$6 (I think).

Thank you for any help!
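
For reference, three things stop the loop above from working as intended: `a+2` computes a value but never stores it, so `a` does not advance (it needs `a+=2`); `$a*(a+1)` multiplies the field by the number `a+1` rather than by the field `$(a+1)`; and `print $0` sits inside the loop, so the whole line is printed on every pass. Below is a minimal sketch of a corrected loop for the interleaved file, assuming the value/score pairs start in column 2 and the file has a one-line header. The sample rows and the `out` and `val` variable names are illustrative only.

```shell
# Sample interleaved file in the layout described above (illustrative data).
cat > test <<'EOF'
ID SNP1 scoreSNP1 SNP2 scoreSNP2 SNP3 scoreSNP3 SNP4 scoreSNP4
A1 1 0.5 2 0.7 0 0.8 2 0.2
A2 2 0.5 0 0.7 1 0.8 1 0.2
A3 0 0.5 2 0.7 NA 0.8 1 0.2
EOF

awk '
NR == 1 { print; next }                  # pass the header line through unchanged
{
    out = $0
    for (a = 2; a < NF; a += 2) {        # a += 2 actually advances the counter
        # multiply the value field by the score field that follows it
        val = ($a == "NA") ? "NA" : $a * $(a + 1)
        out = out OFS val
    }
    print out                            # print once per row, after the loop
}' test > test1

cat test1
```

Building the row in a string and printing it once avoids the repeated output, and the explicit `NA` test keeps missing values distinct from a genuine 0.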

Moderator's Comments:
Mod Comment Use code tags, see PM.
# 2  
Old 10-04-2013
Try

Code:
$ cat snp
ID SNP1 SNP2 SNP3 SNP4
A1 1 2 0 2
A2 2 0 1 1
A3 0 2 NA 1
A4 1 1 0 2

Code:
$ cat score
SNP score
SNP1 0.5
SNP2 0.7
SNP3 0.8
SNP4 0.2

Code:
$ cat score.sh
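  awk 'FNR==NR && NR>1{
             SNP[FNR-1]=$0          # score file: keep each "name score" line
             next
                      }
  FNR==1{hdr=$0;split(hdr,H,FS)}    # remember the column names of the header
  FNR>1{
        printf "%s%s", $0, OFS      # explicit format, so "%" in the data is safe
        for(i=2;i<=NF;i++)
  if($i~/[0-9]/){
        for(j in SNP){
                split(SNP[j],A,FS)
                if(H[i]==A[1])      # exact match, so SNP1 does not also match SNP10
                printf "%s%s", $i*A[2], OFS
                      }
                }
  else
            {
            printf "NA%s", OFS
            }
  printf "\n"
                        }' OFS=\\t score snp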

Code:
$ sh score.sh 
A1 1 2 0 2    0.5    1.4    0    0.4    
A2 2 0 1 1    1    0    0.8    0.2    
A3 0 2 NA 1    0    1.4    NA    0.2    
A4 1 1 0 2    0.5    0.7    0    0.4
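
With roughly 70 SNPs, it matters how column names are compared: exact comparison with `==` is safer than a regular-expression match with `~`, because the pattern `SNP1` also matches `SNP10`, `SNP11` and so on, and a column could then pick up more than one score. A small check of the difference, using made-up names rather than the real data:

```shell
# Compare regex matching (~) with exact string comparison (==) on column names.
# The names here are illustrative only.
result=$(awk 'BEGIN {
    col = "SNP10"; name = "SNP1"
    if (col ~ name)  print "regex: " col " matches " name
    if (col == name) print "exact: " col " equals " name
    else             print "exact: " col " differs from " name
}')
printf '%s\n' "$result"
```

Keying the scores by name in an array (as in the later replies) sidesteps the question entirely, since an array lookup is always an exact match.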


Last edited by Akshay Hegde; 10-04-2013 at 09:23 AM..
# 3  
Old 10-04-2013
You may also want to try
Code:
awk     'FNR==NR        {SCR[$1]=$2;next}
         FNR==1         {for (i=2; i<=NF;i++) COL[i]=$i; print; next}
                        {printf "%s", $0
                         for (i=2; i<=NF; i++)
                           if ($i == "NA")
                                printf "  NA " 
                           else
                                printf " %4.2f", $i * SCR[COL[i]]
                         printf "\n"
                        }
        ' file2 file1
ID SNP1 SNP2 SNP3 SNP4
A1 1 2 0 2  0.50 1.40 0.00 0.40
A2 2 0 1 1  1.00 0.00 0.80 0.20
A3 0 2 NA 1 0.00 1.40  NA  0.20
A4 1 1 0 2  0.50 0.70 0.00 0.40

EDIT: or, try
Code:
awk     'FNR==NR        {SCR[$1]=$2;next}
         FNR==1         {for (i=2; i<=NF;i++) COL[i]=$i; print; next}
                        {printf "%s", $0
                         for (i=2; i<=NF; i++)
                            printf "%s", $i=="NA"?" NA ":sprintf("%4.1f",$i * SCR[COL[i]])
                         printf "\n"
                        }
        ' file2 file1


Last edited by RudiC; 10-04-2013 at 02:27 PM..
# 4  
Old 10-04-2013
Try:
Code:
awk 'NR==FNR{A[NR]=$2; next} FNR==1{n=NF}{for(i=2; i<=n; i++) $(i+n-1)=$i=="NA"?$i:$i*A[i]}1' file2 file1

# 5  
Old 10-04-2013
Short, terse, brilliant!

But it depends on file2 being in the right order and being complete, which may not always be the case.
# 6  
Old 10-04-2013
Thanks RudiC, that is right: the assumption would be that file2 is in the right order.

Otherwise we would need something like this:
Code:
awk 'NR==FNR{A[$1]=$2; next} FNR==1{n=split($0,H)}{for(i=2; i<=n; i++) $(i+n-1)=$i=="NA"?$i:$i*A[H[i]]}1' file2 file1

That is getting a bit long for one line, so:

Code:
awk '
NR==FNR {
  A[$1]=$2
  next
} 
FNR==1 {
  n=split($0,H)
}
{
  for(i=2; i<=n; i++) $(i+n-1)=$i=="NA"?$i:$i*A[H[i]]
}
1
' file2 file1
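
Two details are worth knowing about the versions above: the main block also runs on the header line, so the header gets numeric zeros appended rather than column labels, and an SNP with no entry in file2 silently yields 0. Here is a sketch of a variant that labels the new columns in the style asked for in the first post and writes NA when a score is missing. The sample files are the ones from the thread; writing NA for a missing score and the output name `scored.txt` are assumptions, not something the thread specifies.

```shell
# Sample input files, as shown earlier in the thread.
cat > file1 <<'EOF'
ID SNP1 SNP2 SNP3 SNP4
A1 1 2 0 2
A2 2 0 1 1
A3 0 2 NA 1
A4 1 1 0 2
EOF
cat > file2 <<'EOF'
SNP score
SNP1 0.5
SNP2 0.7
SNP3 0.8
SNP4 0.2
EOF

awk '
NR == FNR { A[$1] = $2; next }            # file2: score keyed by SNP name
FNR == 1 {                                # file1 header: keep names, add labels
    n = split($0, H)
    for (i = 2; i <= n; i++) $(i + n - 1) = H[i] "xscore" H[i]
    print
    next
}
{
    for (i = 2; i <= n; i++) {
        if ($i == "NA" || !(H[i] in A))   # missing value, or no score for this SNP
            $(i + n - 1) = "NA"
        else
            $(i + n - 1) = $i * A[H[i]]
    }
    print
}' file2 file1 > scored.txt

cat scored.txt
```

Handling the header in its own block with `next` keeps the arithmetic away from the column names, and the `in` test distinguishes a missing score from a score of 0.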
