Sum values of specific column in multiple files, considering ranges defined in another file


Login or Register for Dates, Times and to Reply

 
Thread Tools Search this Thread
# 1  
Sum values of specific column in multiple files, considering ranges defined in another file

I have a file (let say file B) like this:

Code:
 File B:  
A1  3  5  
A1  7  9  
A2  2  5  
A3  1  3

The first column defines a filename and the other two define a range in that specific file. In the same directory, I have also three more files (File A1, A2 and A3). Here is 10 sample lines from each file:
Code:
File A1:           File A2:     File A3:   
    1   0.6           1  0.2       1  0.1       
    2   0.04          2  0.1       2  0.2 
    3   0.4           3  0.2       3  0.5 
    4   0.5           4  0.4       4  0.3 
    5   0.009         5  0.2       5  0.7
    6   0.2           6  0.3       6  0.3
    7   0.3           7  0.8       7  0.3   
    8   0.2           8  0.1       8  0.2 
    9   0.15          9  0.9       9  0.8 
    10  0.1           10 0.4       10 0.1

I need to add a new column to file B, which in each line gives the sum of values of column two in the defined range and file. For example, file B row 1 means that calculate the sum of values of line 3 to 5 in column two of file A1. The desired output is something like this:

Code:
File B:
    A1  3  5  0.909    
    A1  7  9  0.65  
    A2  2  5  0.9  
    A3  1  3  0.8

All files are in tabular text format. How can I perform this task? I have access to bash (ubuntu 14.04) and R but not an expert bash or R programmer. Any help would be greatly appreciated. Thanks in advance
# 2  
For above sample, this should do
Code:
awk     '       {printf "%s ", $0
                 FN=$1; ST=$2; EN=$3;
                 while ($1+0 < EN)      {getline < FN
                                         if ($1 >= ST) sum+=$2}
                                         printf "%s\n", sum
                                         sum=0}
        ' file
A1  3  5 0.909
A1  7  9 0.65
A2  2  5 0.9
A3  1  3 0.8

It is not aimed at nor tested for severely different input data.

Last edited by RudiC; 12-08-2014 at 05:37 PM..
This User Gave Thanks to RudiC For This Post:
# 3  
Code:
while read f1 f2 f3
do
 awk 'NR==start,NR==end {sum+=$2} END {print FILENAME,start,end,sum}' start="$f2" end="$f3" "$f1"
done < B > B.new

If B.new looks okay, you can rename it to B
Code:
mv B.new B

This User Gave Thanks to MadeInGermany For This Post:
# 4  
Thank you for your prompt reply RudiC and MadeInGermany. I will try your code today and post the results.

---------- Post updated at 03:10 PM ---------- Previous update was at 10:30 AM ----------

It solved the problem. Thank you for your clear and effective code and helping me.

---------- Post updated at 09:06 PM ---------- Previous update was at 03:10 PM ----------

sorry MadeInGermany, but I got an error running the code on real data. The code performs well in sample data (f1, f2, f3). but when I run it on multiple files I get this error:
Code:
read: f100: bad variable name

I renamed the file to '100' with no success. I also renamed to 'hundred' but the error was repeated for the next file (f101). Unfortunately googling wasn't helpful.
Thank you for your time and consideration.
Login or Register for Dates, Times and to Reply

Previous Thread | Next Thread
Thread Tools Search this Thread
Search this Thread:
Advanced Search

Test Your Knowledge in Computers #338
Difficulty: Easy
telnet is considered safer than ssh for most remote login applications.
True or False?

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

How to sum value of a column by range defined in another file awk?

I have two files, file1.table is the count table, and the other is the range condition file2.range. file1.table chr start end count N1 0 48 1 N1 48 181 2 N1 181 193 0 N1 193 326 2 N1 326 457 0 N1 457 471 1 N1 471 590 2 N1 590 604 1 N1 604 752 1 N1 752 875 1 file2.range... (12 Replies)
Discussion started by: yifangt
12 Replies

2. UNIX for Beginners Questions & Answers

Copy columns from one file into another and get sum of column values and row count

I have a file abc.csv, from which I need column 24(PurchaseOrder_TotalCost) to get the sum_of_amounts with date and row count into another file say output.csv abc.csv- UTF-8,,,,,,,,,,,,,,,,,,,,,,,,, ... (6 Replies)
Discussion started by: Tahir_M
6 Replies

3. Shell Programming and Scripting

Do replace operation and awk to sum multiple columns if another column has duplicate values

Hi Experts, Please bear with me, i need help I am learning AWk and stuck up in one issue. First point : I want to sum up column value for column 7, 9, 11,13 and column15 if rows in column 5 are duplicates.No action to be taken for rows where value in column 5 is unique. Second point : For... (12 Replies)
Discussion started by: as7951
12 Replies

4. UNIX for Dummies Questions & Answers

Match sum of values in each column with the corresponding column value present in trailer record

Hi All, I have a requirement where I need to find sum of values from column D through O present in a CSV file and check whether the sum of each Individual column matches with the value present for that corresponding column present in the trailer record. For example, let's assume for column D... (9 Replies)
Discussion started by: tpk
9 Replies

5. Shell Programming and Scripting

Need to sum up a column value from multiple files into Credit and Debit categories using awk command

i have multiple files with Batch Header, Record detail & Batch trailer data in the files like : BH 20150225950050N8262 RD 20140918000000 99999999 unk Deferred Settlement -13950 BT01 -13950 *Above sample data donot have the spaces coorectly defined. I do have multiple batch trailer... (1 Reply)
Discussion started by: kcdg859
1 Replies

6. Shell Programming and Scripting

Sum column values based in common identifier in 1st column.

Hi, I have a table to be imported for R as matrix or data.frame but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum the each column (2nd -nth) of each identifier (1st column) The input is for example, after sorted: K00001 1 1 4 3... (8 Replies)
Discussion started by: sargotrons
8 Replies

7. Shell Programming and Scripting

Sum of a column in multiple files

I am performing the following operation on a file that looks like this 1000 0 10 479.0 1115478.07497 0.0 0.0 0.0872665 1000 10 20 1500.0 3470012.29304 0.0 0.0 0.261799 1000 20 30 2442.0 5676346.87758 0.0 0.0 0.436332 1000 30 40 3378.0 7737905.30957 0.0 0.0 0.610865 1000 40 50 4131.0... (2 Replies)
Discussion started by: kayak
2 Replies

8. Shell Programming and Scripting

Divide data with specific column values into separate files

hello! i need a little help from you :) ... i need to split a file into separate files depending on two conditions using scripting. The file has no delimiters. The conditions are col 17 = "P" and col 81 = "*", this will go to one output file; col 17 = "R" and col 81 = " ". Here is an example. ... (3 Replies)
Discussion started by: chanclitas
3 Replies

9. Shell Programming and Scripting

extracting columns falling within specific ranges for multiple files

Hi, I need to create weekly files from daily records stored in individual monthly filenames from 1999-2010. my sample file structure is like the ones below: daily record stored per month: 199901.xyz, 199902.xyz, 199903.xyz, 199904.xyz ...199912.xyz records inside 199901.xyz (original data... (4 Replies)
Discussion started by: ida1215
4 Replies

10. Shell Programming and Scripting

print unique values of a column and sum up the corresponding values in next column

Hi All, I have a file which is having 3 columns as (string string integer) a b 1 x y 2 p k 5 y y 4 ..... ..... Question: I want get the unique value of column 2 in a sorted way(on column 2) and the sum of the 3rd column of the corresponding rows. e.g the above file should return the... (6 Replies)
Discussion started by: amigarus
6 Replies

Featured Tech Videos