Counting specific column and add result in output


 
# 1  
Old 10-31-2013

Hi all,
I have a quick question:
I have a 4 column tab-separated file.
I want to count the number of times each unique value in column 2 appears and add that number in a 5th column.

I have the following input file:

Code:
waterline-n    below-sheath-v    14.8097    A
dock-n    below-sheath-v     14.5095    B
waterline-n    below-steel-n    11.0330    A
picnic-n    below-steel-n    12.2277    C
wavefront-n    at-part-of-variance-n    18.4888    L
wavefront-n    between-part-of-variance-n    17.0656    A
audience-b    between-part-of-variance-n    17.6346    B
game-n    between-part-of-variance-n    14.9652    C
whereabouts-n    become-rediscovery-n    11.3556    L
whereabouts-n    get-tee-n    10.9091    L


and the following is the desired output

Code:
waterline-n    below-sheath-v    14.8097    A   2
dock-n    below-sheath-v     14.5095    B   2
waterline-n    below-steel-n    11.0330    A   2
picnic-n    below-steel-n    12.2277    C   2
wavefront-n    at-part-of-variance-n    18.4888    L   1
wavefront-n    between-part-of-variance-n    17.0656    A   3
audience-b    between-part-of-variance-n    17.6346    B   3
game-n    between-part-of-variance-n    14.9652    C   3
whereabouts-n    become-rediscovery-n    11.3556    L   1
whereabouts-n    get-tee-n    10.9091    L   1

How can I combine sort and grep for the desired output?
Thank you.
# 2  
Old 10-31-2013
Try this:
Code:
awk 'NR==FNR{a[$2]++;next}{print $0 "\t" a[$2]}' file file
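
For anyone unfamiliar with the idiom: the one-liner above names the file twice so that awk reads it in two passes. While NR==FNR holds (first pass) it tallies column 2; on the second pass it appends the tally to each line. Below is a minimal annotated sketch of the same idea, assuming tab-separated input. The function name add_col2_count and the temporary sample file are made up for illustration and are not part of the original post.

```shell
#!/bin/sh
# Hypothetical wrapper around the two-pass idiom; the name is illustrative only.
add_col2_count() {
    # "$1" is passed twice so awk reads the same file in two passes.
    awk -F '\t' '
        NR == FNR { seen[$2]++; next }   # pass 1: tally each column-2 value
        { print $0 "\t" seen[$2] }       # pass 2: append the tally as column 5
    ' "$1" "$1"
}

# Small tab-separated sample built from rows in the question.
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\n' \
    waterline-n below-sheath-v 14.8097 A \
    dock-n      below-sheath-v 14.5095 B \
    wavefront-n at-part-of-variance-n 18.4888 L > "$sample"

add_col2_count "$sample"
rm -f "$sample"
```

On this sample the two below-sheath-v rows get 2 appended and the at-part-of-variance-n row gets 1. Because the file is read twice, this form needs a regular file rather than a pipe.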

# 3  
Old 10-31-2013
Thanks, Franklin52.

I just realized that your script counts every occurrence of each value in Column 2.
However, I need to count the number of different items in Column 4 that each value in Column 2 occurs with; in this case, the raw frequency does not matter.

I will provide more sample files, in case my question was not clear:

For instance this input file:

Code:
waterline-n    below-sheath-v    14.8097    A 
dock-n    below-sheath-v     14.5095    B 
waterline-n    below-steel-n    11.0330    A 
picnic-n    below-steel-n    12.2277    C 
game-n    below-steel-n    12.2277    D 
dock-n    below-steel-n    12.2277    D 
wavefront-n    at-part-of-variance-n    18.4888    L 
wavefront-n    between-part-of-variance-n    17.0656    A 
audience-b    between-part-of-variance-n    17.6346    B 
game-n    between-part-of-variance-n    14.9652    C 
whereabouts-n    become-rediscovery-n    11.3556    L 
whereabouts-n    get-tee-n    10.9091    L

should yield this output file:

Code:
waterline-n    below-sheath-v    14.8097    A    2
dock-n    below-sheath-v    14.5095    B    2
waterline-n    below-steel-n    11.0330    A    3
picnic-n    below-steel-n    12.2277    C    3
game-n    below-steel-n    12.2277    D    3
dock-n    below-steel-n    12.2277    D    3
wavefront-n    at-part-of-variance-n    18.4888    L    1
wavefront-n    between-part-of-variance-n    17.0656    A    3
audience-b    between-part-of-variance-n    17.6346    B    3
game-n    between-part-of-variance-n    14.9652    C    3
whereabouts-n    become-rediscovery-n    11.3556    L    1
whereabouts-n    get-tee-n    10.9091    L    1

Here "below-steel-n" actually occurs 4 times, yet with only 3 different items in Column 4; thus its result in Column 5 is 3.

Last edited by owwow14; 10-31-2013 at 02:08 PM.. Reason: FIXED ERROR
# 4  
Old 10-31-2013
Code:
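awk '
# key on the (column 2, column 4) pair
{ idx = ($2 SUBSEP $4) }
# first pass: count each pair only once per column-2 value
FNR==NR { if (!(idx in c)) { c[idx]++; cf2[$2]++ }; next }
# second pass: append the number of distinct column-4 values
{ print $0, cf2[$2] }' inFile inFile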

I'm not sure your sample output is correct based on your description: the 'between-part-of-variance-n' fields are not counted correctly.
Please validate/clarify.
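
If the input cannot be read twice (for example, when it arrives on a pipe), the same distinct-pair count could probably be done in one pass by buffering the lines. This is only a sketch, assuming the data fits in memory and is whitespace-separated as in the samples; the function name add_distinct_col4_count is hypothetical and not from the thread.

```shell
#!/bin/sh
# Hypothetical single-pass variant; reads stdin and buffers every line in memory.
add_distinct_col4_count() {
    awk '
        {
            line[NR] = $0                 # remember the line for the END block
            key[NR]  = $2                 # and its column-2 value
            if (!(($2, $4) in pair)) {    # first time this (col2, col4) pair is seen
                pair[$2, $4] = 1
                distinct[$2]++            # one more distinct column-4 value for col2
            }
        }
        END {
            # replay the buffered lines in input order with the count appended
            for (i = 1; i <= NR; i++)
                print line[i], distinct[key[i]]
        }
    '
}

# Usage: the four below-steel-n rows from the second sample.
printf '%s %s %s %s\n' \
    waterline-n below-steel-n 11.0330 A \
    picnic-n    below-steel-n 12.2277 C \
    game-n      below-steel-n 12.2277 D \
    dock-n      below-steel-n 12.2277 D |
    add_distinct_col4_count
```

Each of the four rows gets 3 appended, since below-steel-n occurs with A, C and D. The trade-off against the two-pass version is memory: every input line is held until the end.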
# 5  
Old 10-31-2013
You are correct.
I revised the answer to reflect the change.
Human error!