Mean of the specific columns


 
# 1  
Old 09-28-2009

I have an input file in which some rows share the same values in the 1st, 2nd and 3rd columns, while the 4th and 5th columns differ. For each group of rows with identical values in columns 1-3, I would like to print the mean of the 4th column, followed by all of that group's values from the 5th column.

input
Code:
NM_0    1.22    CR5    0.4    n_21663
NM_0    1.22    CR5    0.1    n_1664
NM_0    1.22    CR5    0.6    n_21665
NM_11    1.36    AK09   0.9    n_19168
NM_11    1.36    AK09    -0.02    n_19169

output
Code:
NM_0    1.22    CR5    0.366  n_21663  n_1664  n_21665  
NM_11    1.36    AK09   0.44  n_19168  n_19169

Thanks in advance.
# 2  
Old 09-28-2009
Quote:
Originally Posted by repinementer
output
Code:
NM_0    1.22    CR5    0.366  n_21663  n_1664  n_21665  
NM_11    1.36    AK09   0.44  n_19168  n_19169

Can you explain how you get 0.366 and 0.44 based on your input data?
Quote:
Originally Posted by repinementer
Code:
NM_0    1.22    CR5    0.4    n_21663
NM_0    1.22    CR5    0.1    n_1664
NM_0    1.22    CR5    0.6    n_21665
NM_11    1.36    AK09   0.9    n_19168
NM_11    1.36    AK09    -0.02    n_19169

# 3  
Old 09-28-2009
He has calculated the average of the 4th column for each group:
Code:
(0.4+0.1+0.6)/3=0.3666
(0.9-0.02)/2=0.44
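
As a quick check of the arithmetic above, here is a minimal sketch that lets awk do the same two divisions. The variable name `check` is only illustrative. Note that the first mean rounds to 0.367 at three decimals, so the 0.366 in the requested output looks like a truncated value rather than a rounded one.

```shell
# Recompute both group means; printf rounds to the requested precision.
check=$(awk 'BEGIN { printf "%.4f %.2f\n", (0.4+0.1+0.6)/3, (0.9-0.02)/2 }')
echo "$check"   # prints: 0.3667 0.44
```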

# 4  
Old 09-28-2009
Yes, Mr. Vidya is right.
# 5  
Old 09-28-2009
Hi,

Something like this?

Code:
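awk '{
    ind=sprintf("%s %s %s",$1,$2,$3)   # group key: columns 1-3
    t[ind]+=$4                         # running sum of column 4
    n[ind]++                           # number of rows in the group
    s[ind]=s[ind] " " $5               # collect the column 5 values
}
END{
    # print the key, the mean to 3 decimals, and the collected column 5 values
    for(i in t) printf "%s %.3f %s\n",i,t[i]/n[i],s[i]
}' file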
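
One caveat with the script above: awk does not define the order in which `for (i in t)` visits the keys, so the groups may not be printed in input order. The following is a minimal sketch of one way to keep first-seen order, assuming the sample data from the first post is written to a temporary file. The `order` and `groups` names and the temporary file are illustrative choices, not part of the original answer.

```shell
# Sample input from the first post, written to a temporary file (assumption).
input=$(mktemp)
printf '%s\n' \
    'NM_0    1.22    CR5    0.4    n_21663' \
    'NM_0    1.22    CR5    0.1    n_1664' \
    'NM_0    1.22    CR5    0.6    n_21665' \
    'NM_11    1.36    AK09   0.9    n_19168' \
    'NM_11    1.36    AK09    -0.02    n_19169' > "$input"

ordered=$(awk '{
    ind = $1 " " $2 " " $3
    if (!(ind in n)) order[++groups] = ind   # remember each key the first time it appears
    t[ind] += $4                             # running sum of column 4
    n[ind]++                                 # rows seen for this key
    s[ind] = s[ind] " " $5                   # collected column 5 values (leading space included)
}
END {
    # walk the keys in first-seen order instead of using for (i in t)
    for (g = 1; g <= groups; g++) {
        ind = order[g]
        printf "%s %.3f%s\n", ind, t[ind] / n[ind], s[ind]
    }
}' "$input")

printf '%s\n' "$ordered"
rm -f "$input"
```

With the sample data this should print the NM_0 group first and the NM_11 group second, with means 0.367 and 0.440.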

# 6  
Old 09-28-2009
Thanks, ripat. That is exactly what I want.

---------- Post updated at 10:01 PM ---------- Previous update was at 09:42 PM ----------

Is it possible to change the output like this, so that each 5th-column value is followed by its 4th-column value?
Thanks

output
Code:
NM_0    1.22    CR5    0.366  n_21663  0.4  n_1664  0.1  n_21665  0.6
NM_11    1.36    AK09   0.44  n_19168  0.9 n_19169 -0.02

# 7  
Old 09-28-2009
Simply change the line that builds s[ind] to this:

Code:
    s[ind]=s[ind] " " $5 " " $4
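
For reference, here is a sketch of the whole script with that one line changed, fed the sample data from the first post through a pipe. The `result` variable and the final `sort` are additions for illustration only; the sort is there because the `for (i in t)` order is not defined. The format string also drops one space compared with the original, since `s[i]` already starts with a space.

```shell
result=$(printf '%s\n' \
    'NM_0    1.22    CR5    0.4    n_21663' \
    'NM_0    1.22    CR5    0.1    n_1664' \
    'NM_0    1.22    CR5    0.6    n_21665' \
    'NM_11    1.36    AK09   0.9    n_19168' \
    'NM_11    1.36    AK09    -0.02    n_19169' |
awk '{
    ind=sprintf("%s %s %s",$1,$2,$3)
    t[ind]+=$4
    n[ind]++
    s[ind]=s[ind] " " $5 " " $4   # the changed line: each id is followed by its own value
}
END{
    for(i in t) printf "%s %.3f%s\n",i,t[i]/n[i],s[i]
}' | LC_ALL=C sort)

printf '%s\n' "$result"
```

Each output line then carries the group key, the mean, and the id/value pairs, for example `n_21663 0.4 n_1664 0.1 n_21665 0.6` for the NM_0 group.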
