awk variation question


 
# 1  
Old 10-05-2012

I have a file like this:
HTML Code:
ASSPASVFETQY,hTRBV12-4,hTRBJ2-5,2
ASSPASTGGDYGYT,hTRBV18,hTRBJ1-2,2
ASSPASGDGYT,hTRBV5-1,hTRBJ1-2,2
ASSPASFPEDTQY,hTRBV27,hTRBJ2-3,2
ASSPARVNYGYT,hTRBV5-1,hTRBJ1-2,2
ASSPARTSGGLNEQF,hTRBV6-4,hTRBJ2-1,2
ASSPARQSYNEQF,hTRBV11-1,hTRBJ2-1,2
ASSPARQGARVDQPQH,hTRBV18,hTRBJ1-5,2
ASSPAQPRTNEKLF,hTRBV18,hTRBJ1-4,2
ASSPAQGYHTEAF,hTRBV18,hTRBJ1-1,2
The first field is a unique sequence; the 2nd and 3rd fields are its properties. There are 45 different values in the 2nd field and 13 different values in the 3rd field. The 4th field is each sequence's frequency. The file may have more than 100,000 records.

What I want is the total frequency of the records that belong to each combination of 2nd and 3rd field values, not the total for a single 2nd or 3rd value on its own.

I can get any one combination with a one-liner, but running it once per combination takes far too many steps. How can I do this in a script using variables? And how should I print the result to a file?
Code:
awk -F"," '$2 == "hTRBV6-3" && $3 == "hTRBJ2-2"{sum += $4} END {print sum}' file

Thanks !
# 2  
Old 10-05-2012
Thanks for the last example, it finally made it 'click' what you wanted.

awk lets you use strings as an array index, so:

Code:
awk -F, '{ A[$2" "$3]+=$4 } END { for(X in A) print X, A[X] }' filename
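
To get the totals into a file, as the first post asked, redirecting the output should be enough. A minimal sketch, assuming the comma-separated layout shown in the first post; the file names sample.csv and totals.txt are made up for illustration, and the sort is only there because for (X in A) returns the keys in no particular order:

```shell
# Small sample in the same layout as the first post (hypothetical file name).
cat > sample.csv <<'EOF'
ASSPASTGGDYGYT,hTRBV18,hTRBJ1-2,2
ASSPASGDGYT,hTRBV5-1,hTRBJ1-2,2
ASSPARVNYGYT,hTRBV5-1,hTRBJ1-2,2
ASSPARQGARVDQPQH,hTRBV18,hTRBJ1-5,2
EOF

# -F, splits each line on commas.
# The array key is "field2 field3"; the value is the running sum of field 4.
awk -F, '{ A[$2" "$3] += $4 } END { for (X in A) print X, A[X] }' sample.csv |
    LC_ALL=C sort > totals.txt

# Show what was written.
cat totals.txt
```

On this sample, hTRBV5-1 with hTRBJ1-2 appears twice, so its line in totals.txt ends in 4 while the other two pairs end in 2.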


Last edited by Corona688; 10-05-2012 at 02:38 PM.. Reason: removed extra '
# 3  
Old 10-05-2012
NICE code! Thank you!

If there is no 4th field, how can I get the number of records for each combination?
# 4  
Old 10-05-2012
If there's no 4th field, what's it adding? 1?

Code:
awk -F, '{ A[$2" "$3]++ } END { for(X in A) print X, A[X] }' filename
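
If both numbers are wanted at once, the count and the sum can be kept in two arrays in the same pass. This is only a sketch under the same assumptions as before (comma-separated input, frequency in field 4); sample2.csv, counts.txt and the array names N and S are invented for the example:

```shell
# Hypothetical sample; the frequencies are varied so count and sum differ.
cat > sample2.csv <<'EOF'
ASSPASTGGDYGYT,hTRBV18,hTRBJ1-2,2
ASSPASGDGYT,hTRBV5-1,hTRBJ1-2,3
ASSPARVNYGYT,hTRBV5-1,hTRBJ1-2,2
EOF

# N[K] counts the records for each pair, S[K] adds up field 4 for each pair.
awk -F, '{ K = $2" "$3; N[K]++; S[K] += $4 }
    END { for (K in N) print K, N[K], S[K] }' sample2.csv |
    LC_ALL=C sort > counts.txt

# Columns: field 2, field 3, number of records, total frequency.
cat counts.txt
```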

# 5  
Old 10-06-2012
The code works well! Thank you!

Can the contents of array A be printed as a matrix?

I want to load the result into R, and I don't know whether I should build the matrix in awk or in R.

Thank you!
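
One possible awk approach, as a sketch rather than an answer from the thread: remember the distinct 2nd-field and 3rd-field values as row and column labels, then print one tab-separated row per 2nd-field value. The file names sample3.csv and matrix.tsv, the corner label V_gene, and the helper arrays R, C, rows and cols are all assumptions made for the example.

```shell
# Hypothetical sample in the same layout as the first post.
cat > sample3.csv <<'EOF'
ASSPASTGGDYGYT,hTRBV18,hTRBJ1-2,2
ASSPASGDGYT,hTRBV5-1,hTRBJ1-2,2
ASSPARVNYGYT,hTRBV5-1,hTRBJ1-2,2
ASSPARQGARVDQPQH,hTRBV18,hTRBJ1-5,2
EOF

awk -F, '
{
    A[$2, $3]++                                      # count per pair; use += $4 for summed frequency
    if (!($2 in R)) { R[$2] = 1; rows[++nr] = $2 }   # row labels in first-seen order
    if (!($3 in C)) { C[$3] = 1; cols[++nc] = $3 }   # column labels in first-seen order
}
END {
    # Header line: a corner label, then one column per 3rd-field value.
    line = "V_gene"
    for (j = 1; j <= nc; j++) line = line "\t" cols[j]
    print line

    # One line per 2nd-field value; "+ 0" turns a missing pair into 0.
    for (i = 1; i <= nr; i++) {
        line = rows[i]
        for (j = 1; j <= nc; j++) line = line "\t" (A[rows[i], cols[j]] + 0)
        print line
    }
}' sample3.csv > matrix.tsv

cat matrix.tsv
```

In R, something like read.delim("matrix.tsv", row.names = 1, check.names = FALSE) should then read this with the first column as row names. Building the table inside R from the long three-column output would work as well, so which side does the reshaping is mostly a matter of taste.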