Sponsored Content
Top Forums Shell Programming and Scripting sum of a column and selecting lines with value above threshold Post 302577952 by f_o_555 on Wednesday 30th of November 2011 10:33:38 AM
Old 11-30-2011
Quote:
Originally Posted by ahamed101
sort the output if that helps i.e. awk '{...}' input_file | sort
or
Try this...
Code:
awk '{sum+=$3;a[++j]=$0}END{for(i=1;i<=j;i++){split(a[i],arr);if(arr[3]/sum > val){print a[i]}}}' val="0.016" input_file

--ahamed
Hi this solution works better than the previous, which didn't work all the time.
e.g.
file1
Code:
eu154     1.51000000        1.70513841e+10
eu154     1.52200000        2.07052522e+09
eu154     1.52200000        2.07052522e+09
eu154     1.53140000        2.07052522e+10
eu154     1.53780000        1.70513841e+11
eu154     1.55400000        4.87182404e+09
eu154     1.59300000        3.53207243e+12
eu154     1.59650000        6.27264499e+12


Code:
awk '{sum+=$3;a[$0]=$3}END{for(i in a){if(a[i]/sum > val){print i}}}' val=0.0 file3

was giving out

Code:
eu154     1.53140000        2.07052522e+10
eu154     1.59300000        3.53207243e+12
eu154     1.52200000        2.07052522e+09
eu154     1.59650000        6.27264499e+12
eu154     1.55400000        4.87182404e+09
eu154     1.51000000        1.70513841e+10
eu154     1.53780000        1.70513841e+11

So it was missing one of the 2 identical lines

Still I have to sort the file with sort afterwards, but the command

Code:
awk '{sum+=$3;a[++j]=$0}END{for(i=1;i<=j;i++){split(a[i],arr);if(arr[3]/sum > val){print a[i]}}}' val="0.0" file3

gives

Code:
eu154     1.51000000        1.70513841e+10
eu154     1.52200000        2.07052522e+09
eu154     1.52200000        2.07052522e+09
eu154     1.53140000        2.07052522e+10
eu154     1.53780000        1.70513841e+11
eu154     1.55400000        4.87182404e+09
eu154     1.59300000        3.53207243e+12
eu154     1.59650000        6.27264499e+12

as expected
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

selecting column in perl

Dear all, I have a rather large file of numbers which i would like to read into a script and then do some maths on a specific column( e.g column). so far i have been using the following awk command awk '{print $4}' infile.txt > out.tmp to strip out the desired column within the in perl... (3 Replies)
Discussion started by: Mish_99
3 Replies

2. Shell Programming and Scripting

Select lines where at least x columns above threshold value

I have a file with 20 columns. I'd like to retain only the lines for which the values in at least x columns, looking only at columns 6-20, are above a threshold. For example, I'd like to retain only the lines in the file below that have at least 8 columns (again, looking only at columns 6-20)... (3 Replies)
Discussion started by: pathunkathunk
3 Replies

3. UNIX for Dummies Questions & Answers

awk to sum column field from duplicate row/lines

Hello, I am new to Linux environment , I working on Linux script which should send auto email based on the specific condition from log file. Below is the sample log file Name m/c usage abc xxx 10 abc xxx 20 abc xxx 5 xyz ... (6 Replies)
Discussion started by: asjaiswal
6 Replies

4. Shell Programming and Scripting

awk to sum a column based on duplicate strings in another column and show split totals

Hi, I have a similar input format- A_1 2 B_0 4 A_1 1 B_2 5 A_4 1 and looking to print in this output format with headers. can you suggest in awk?awk because i am doing some pattern matching from parent file to print column 1 of my input using awk already.Thanks! letter number_of_letters... (5 Replies)
Discussion started by: prashob123
5 Replies

5. Shell Programming and Scripting

Selecting a value of column through sqlplus

Hi All, The value of a column in my DB table is: LOAD DATA APPEND INTO TABLE MK9210.EG FIELDS TERMINATED BY '|' TRAILING NULLCOLS ( STUDY CHAR ,PATIENT CHAR ,CPEVENT CHAR ,NAMEG CHAR ,REFIDEG CHAR ,POSEG CHAR ,METHEG CHAR ,EGDAT CHAR ,EGTIM CHAR ,COMEG CHAR ,COM1EG CHAR (5 Replies)
Discussion started by: Pratiksha Mehra
5 Replies

6. Shell Programming and Scripting

Sum column values based in common identifier in 1st column.

Hi, I have a table to be imported for R as matrix or data.frame but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum the each column (2nd -nth) of each identifier (1st column) The input is for example, after sorted: K00001 1 1 4 3... (8 Replies)
Discussion started by: sargotrons
8 Replies

7. UNIX for Dummies Questions & Answers

Match sum of values in each column with the corresponding column value present in trailer record

Hi All, I have a requirement where I need to find sum of values from column D through O present in a CSV file and check whether the sum of each Individual column matches with the value present for that corresponding column present in the trailer record. For example, let's assume for column D... (9 Replies)
Discussion started by: tpk
9 Replies

8. UNIX for Beginners Questions & Answers

Selecting lines based on the value in the 3rd column.

Hello, I have a sample data like this: A1 B1 100.00 B1 A1 100.00 A2 B2 90.80 B2 A2 90.80 A3 B3 99.07 B3 A3 99.07 A4 B4 99.00 B4 A4 99.00 A5 B5 97.13 B5 A5 99.53 . . Ax By i By Ax j each two lines are same comparison with opposite order. What I expected is... (3 Replies)
Discussion started by: nengcheng
3 Replies

9. Shell Programming and Scripting

awk to Sum columns when other column has duplicates and append one column value to another with Care

Hi Experts, Please bear with me, i need help I am learning AWk and stuck up in one issue. First point : I want to sum up column value for column 7, 9, 11,13 and column15 if rows in column 5 are duplicates.No action to be taken for rows where value in column 5 is unique. Second point : For... (1 Reply)
Discussion started by: as7951
1 Replies

10. Shell Programming and Scripting

Sum of a column as new column based on header in a script

Hello, I am trying to store sum of a column as a new column inside a file but have to find the column names dynamically I/p c1,c2,c3,c4,c5 10,20,30,40,50 20,30,40,50,60 If i want to find sum only column c1, c3 and output it as c6,c7 O/p c1,c2,c3,c4,c5,c6,c7 10,20,30,40,50,30,70... (6 Replies)
Discussion started by: mkathi
6 Replies
All times are GMT -4. The time now is 05:11 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy