[Solved] Replace NA with column mean


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers [Solved] Replace NA with column mean
# 1  
Old 07-09-2013
[Solved] Replace NA with column mean

Hi experts, I'm looking for a unix solution to replacing "NA" within a matrix with the mean of the column:

Code:
$cat file
ID      a       b       c       d
day     10      5       100     50
cat     20      6       200     50
dog     NA      8       NA      50
car     15      NA      NA      NA

Code:
$cat output
ID      a       b       c       d
day     10      5       100     50
cat     20      6       200     50
dog     15      8       150     50
car     15      6.33    150     50

Many thanks for your help
# 2  
Old 07-09-2013
An approach using awk:
Code:
awk '
        BEGIN {
                CONVFMT = "%.2f"
        }
        NR == 1 {
                print $0
        }
        NR > 1 {
                for ( i = 1; i <= NF; i++ )
                {
                        A[NR,i] = $i
                        if ( $i != "NA" && i != 1 )
                        {
                                T[i] += $i
                                C[i]++
                        }
                }
        }
        END {
                j = 2
                while ( j <= NR )
                {
                        for ( i = 1; i <= NF; i++ )
                        {
                                if ( i == 1 )
                                        s = A[j,i]
                                else
                                        s = s ? s OFS ( A[j,i] == "NA" ? T[i] / C[i] : A[j,i] ) : A[j,i]
                        }
                        print s
                        ++j
                }
        }
' OFS='\t' file

This User Gave Thanks to Yoda For This Post:
# 3  
Old 07-09-2013
Another approach, reading the input file twice:
Code:
awk '
  NR==FNR {
    if(NR>1) for(i=2; i<=NF; i++) if( $i!="NA" ) {
      S[i]+=$i
      N[i]++
    }
    next 
  } 
  {
    if(NR>1) for(i=2; i<=NF; i++) if( $i=="NA" ) $i=S[i]/N[i]
  }
  1
' OFS='\t' CONVFMT="%.2f" file file

This User Gave Thanks to Scrutinizer For This Post:
# 4  
Old 07-09-2013
Thanks to both. Each solution appears to work fine.

Many thanks.
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

awk script to extract a column, replace one of the header and replace year(from ddmmyy to yyyy)

I have a csv which has lot of columns . I was looking for an awk script which would extract a column twice. for the first occurance the header and data needs to be intact but for the second occurance i want to replace the header name since it a duplicate and extract year value which is in ddmmyy... (10 Replies)
Discussion started by: Kunalcurious
10 Replies

2. Shell Programming and Scripting

Get extract and replace column with link in a column where it exists

hi i have sample data a,b,c,d,e,g h http://mysite.xyx z,b,d,f,e,s t http://123124# a,b,c,i,m,nothing d,i,j,e,w,nothing output expected is a,b,c,d,e,http://mysite.xyx z,b,d,f,e,http://123124# a,b,c,i,m,nothing d,i,j,e,w,nothing i can get only links using grep -o 'http.*' i... (8 Replies)
Discussion started by: zozoo
8 Replies

3. Shell Programming and Scripting

[Solved] Replace character in 3rd column and leave 1rst and last

Hello to all, I have the following text where columns are separated by spaces. I want to have the 3rd column separating 3 strings with 2 "_" in the format below: LeftSring_CentralString_RightString So, in 3rd column I want to replace all "_" with "-", except the first and last "_" The... (5 Replies)
Discussion started by: Ophiuchus
5 Replies

4. Shell Programming and Scripting

[Solved] Sorting a column based on another column

hello, I have a file as follows: F0100010 A C F0100040 A G BTA-28763-no-rs 77.2692 F0100020 A G F0100030 A T BTA-29334-no-rs 11.4989 F0100030 A T F0100020 A G BTA-29515-no-rs 127.006 F0100040 A G F0100010 A C BTA-29644-no-rs 7.29827 F0100050 A... (9 Replies)
Discussion started by: Homa
9 Replies

5. Shell Programming and Scripting

[Solved] Sorting a column in a file based on a column in a second file

Hello, I have two files as the following: File1: F0100020 A G F0100030 A T F0100040 A G File2: F0100040 A G BTA-28763-no-rs 77.2692 F0100030 A T BTA-29334-no-rs 11.4989 F0100020 A G BTA-29515-no-rs 127.006 I want to sort the second file based on the... (6 Replies)
Discussion started by: Homa
6 Replies

6. UNIX for Dummies Questions & Answers

[Solved] Deleting all rows where the first column equals the second column

Hi, I have a tab delimited text file where the first two columns equal numbers. I want to delete all rows where the value in the first column equals the second column. How do I go about doing that? Thanks! Input: 1 1 ABC DEF 2 2 IJK LMN 1 2 ZYX OPW Output: 1 2 ZYX OPW (2 Replies)
Discussion started by: evelibertine
2 Replies

7. Shell Programming and Scripting

Replace column that matches specific pattern, with column data from another file

Can anyone please help with this? I have 2 files as given below. If 2nd column of file1 has pattern foo1@a, find the matching 1st column in file2 & replace 2nd column of file1 with file2's value. file1 abc_1 foo1@a .... abc_1 soo2@a ... def_2 soo2@a .... def_2 foo1@a ........ (7 Replies)
Discussion started by: prashali
7 Replies

8. Shell Programming and Scripting

Match column 3 in file1 to column 1 in file 2 and replace with column 2 from file2

Match column 3 in file1 to column 1 in file 2 and replace with column 2 from file2 file 1 sample SNDK 80004C101 AT XLNX 983919101 BB NETL 64118B100 BS AMD 007903107 CC KLAC 482480100 DC TER 880770102 KATS ATHR 04743P108 KATS... (7 Replies)
Discussion started by: rydz00
7 Replies

9. Shell Programming and Scripting

awk/sed column replace using column header - help

$ cat log.txt Name Age Sex Lcation nfld alias xsd CC 25 M XYZ asx KK Y BB 21 F XAS awe SS N SD 21 M AQW rty SD A How can I replace the column with header "Lcation" with the column with header "alias" and delete the "alias" column? so that the final output will become: Name Age Sex... (10 Replies)
Discussion started by: jkl_jkl
10 Replies

10. Shell Programming and Scripting

Replace 10th column with a new column--- Terriblly hurry

Hi Can any one tell me how to replace the 10th column in a file(comma delimted) with a new file with a single column. Can any one Help me out of the please as soon as possible as i am in a terribley hurry!!!!!! Many THanks, (2 Replies)
Discussion started by: ahmedwaseem2000
2 Replies
Login or Register to Ask a Question