Sum elements of 2 arrays excluding labels


 
# 1  
Old 03-24-2018

I'm looking for an efficient way to sum the corresponding elements of 2 arrays using AWK while preserving the header and the sample names in the output array. I'm on Ubuntu 16.04 LTS. For example:

ARRAY 1:

Code:
SAMPLE    DERIVED    ANCESTRAL
Sample1    14352    0
Sample2    14352    0
Sample3    14352    0
Sample4    9880    4472
Sample5    9786    4566
Sample6    9846    4506
Sample7    9787    4565
Sample8    9800    4552
Sample9    9764    4588
Sample10    9760    4592
Sample11    9691    4661
Sample12    9798    4554
Sample13    9740    4612


ARRAY 2:

Code:
SAMPLE    DERIVED    ANCESTRAL
Sample1    14352    0
Sample2    14352    0
Sample3    14352    0
Sample4    13674    678
Sample5    13749    603
Sample6    13701    651
Sample7    13682    670
Sample8    13677    675
Sample9    13684    668
Sample10    13674    678
Sample11    13642    710
Sample12    13679    673
Sample13    13713    639


DESIRED OUTPUT ARRAY:
Code:
SAMPLE    TOTAL DERIVED    TOTAL ANCESTRAL
Sample1    28704    0
Sample2    28704    0
Sample3    28704    0
Sample4    23554    5150
Sample5    23535    5169
Sample6    23547    5157
Sample7    23469    5235
Sample8    23477    5227
Sample9    23448    5256
Sample10    23434    5270
Sample11    23333    5371
Sample12    23477    5227
Sample13    23453    5251

# 2  
Old 03-25-2018
Hello Geneanalyst,

Could you please try the following and let me know if it helps?

Code:
 awk 'BEGIN{print "SAMPLE    TOTAL DERIVED    TOTAL ANCESTRAL"} FNR==NR{a[$1]=$2;b[$1]=$3;next} FNR>1{$2=$1 in a?$2+a[$1]:$2;$3=$1 in b?$3+b[$1]:$3;print}' array1  array2 | column -t

Here I am assuming that array1 and array2 are the names of the input files you mentioned.
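For readers who find the one-liner dense, here is a minimal sketch of the same two-file idea spread over several lines. It is not the exact code above: the `sum_two` function name is made up for illustration, the output is tab-separated (my assumption, so that the two-word header labels are not split on whitespace), and a sample missing from the first file is simply added to 0.

```shell
# Hypothetical helper (not from the thread): sum_two FILE1 FILE2
sum_two() {
  awk '
    # Print the new header once, before any input is read.
    BEGIN { OFS = "\t"; print "SAMPLE", "TOTAL DERIVED", "TOTAL ANCESTRAL" }

    # First file only (FNR equals NR there): remember both value
    # columns, keyed by the sample name in column 1.
    FNR == NR { d[$1] = $2; a[$1] = $3; next }

    # Second file, data lines only: add the remembered values.
    # A sample that was not in the first file contributes 0.
    FNR > 1 { print $1, $2 + d[$1], $3 + a[$1] }
  ' "$1" "$2"
}
```

Something like `sum_two array1 array2` should then print the header followed by one summed line per sample of the second file.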

Thanks,
R. Singh
# 3  
Old 03-25-2018
Thanks Ravinder, that works well!

What if, instead of 2 arrays, one needs to sum 3 arrays: array1, array2 and array3?
# 4  
Old 03-25-2018
Hi, try:
Code:
awk '{split($0,F)} (getline<f)>0 && NR>1{$2+=F[2]; $3+=F[3]}1' f=array1 array2

Or you could try an approach like this which should work with 2 or more arrays:
Code:
awk '
  FNR==1 {
    if(NR==1)
      print
    next
  }
  {
    S[FNR]=$1
    D[$1]+=$2
    A[$1]+=$3
  }
  END {
    for(i=2; i<=FNR; i++)
      print S[i], D[S[i]], A[S[i]]
  }
' array1 array2 ... arrayn
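Note that the multi-file script above takes the sample names from whichever file is read last (S[FNR] is overwritten for every file), so it relies on all files listing the same samples in the same order. If that cannot be guaranteed, a variant along these lines might help. It is only a sketch: `sum_many`, the `order` array and the tab-separated output with the TOTAL header are my own choices, not part of the post above.

```shell
# Hypothetical helper (not from the thread): sum_many FILE...
sum_many() {
  awk '
    BEGIN { OFS = "\t" }

    # Header line of every file: print a new header once, then skip it.
    FNR == 1 {
      if (NR == 1) print "SAMPLE", "TOTAL DERIVED", "TOTAL ANCESTRAL"
      next
    }

    # Ignore blank or incomplete lines.
    NF < 3 { next }

    # Remember each sample name the first time it appears, in any file.
    !($1 in D) { order[++n] = $1 }

    # Accumulate both value columns per sample name.
    { D[$1] += $2; A[$1] += $3 }

    # Print the totals in first-seen order.
    END {
      for (i = 1; i <= n; i++)
        print order[i], D[order[i]], A[order[i]]
    }
  ' "$@"
}
```

Called as `sum_many array1 array2 array3`, this should sum any number of files, and a sample that is missing from some of them still gets a line with whatever values were found.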

# 5  
Old 03-25-2018
Try also
Code:
awk '
BEGIN   {FMT = "%-13s%-17s%-17s\n"
         printf FMT, "SAMPLE", "TOTAL DERIVED", "TOTAL ANCESTRAL"
        }
FNR > 1 {SEQ[FNR]  = $1
         SUMD[$1] += $2
         SUMA[$1] += $3
        }
END     {for (i=2; i<=FNR; i++) printf FMT, SEQ[i], SUMD[SEQ[i]], SUMA[SEQ[i]]
        }
'  array[12]
SAMPLE       TOTAL DERIVED    TOTAL ANCESTRAL  
Sample1      28704            0                
Sample2      28704            0                
Sample3      28704            0                
Sample4      23554            5150             
Sample5      23535            5169             
Sample6      23547            5157             
Sample7      23469            5235             
Sample8      23477            5227             
Sample9      23448            5256             
Sample10     23434            5270             
Sample11     23333            5371             
Sample12     23477            5227             
Sample13     23453            5251

# 6  
Old 03-27-2018
Hey RudiC,

Could you explain the various steps of your code whenever you have time?
# 7  
Old 03-28-2018
Here is a copy of RudiC's code with comments added describing what each section of code is doing:
Code:
awk '	# Invoke awk and start script to be run by awk.
BEGIN	{# Before any lines are read from any input file, set the format string
	 # to be used by all printf statements in this script and print the output
	 # header line.
	 FMT = "%-13s%-17s%-17s\n"
	 printf FMT, "SAMPLE", "TOTAL DERIVED", "TOTAL ANCESTRAL"
	}
FNR > 1	{# For all lines in each input file except the header line, save the
	 # sample name associated with that input line and accumulate the derived
	 # and ancestral values associated with that sample name.
	 SEQ[FNR]  = $1
	 SUMD[$1] += $2
	 SUMA[$1] += $3
	}
END	{# After all input files have been processed, for each line found in the
	 # last input file, print the sample name and the accumulated derived
	 # and ancestral totals.
	 for (i=2; i<=FNR; i++) printf FMT, SEQ[i], SUMD[SEQ[i]], SUMA[SEQ[i]]
	}
' array[12]	# End the awk script and list of files to be processed.
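A side note on the last line: `array[12]` is not awk syntax but a shell pattern (glob) that the shell expands to the existing file names array1 and array2 before awk starts. Assuming the files follow that naming, three files could be passed as `array[123]` or `array[1-3]`. A small sketch in a scratch directory (the directory and the empty files are only for the demo):

```shell
# Sketch: show which file names a bracket glob expands to.
# The scratch directory and the empty files exist only for this demo.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch array1 array2 array3

# The shell replaces each pattern with the matching file names, sorted.
matched_two=$(echo array[12])
matched_three=$(echo array[1-3])

echo "$matched_two"     # array1 array2
echo "$matched_three"   # array1 array2 array3
```

So the same awk script can be reused unchanged for three files, only the file pattern at the end changes.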
