Hi.
I like awk solutions. However, I also like packaged solutions. In this case GNU datamash can do the grouping and summing with:
which will sum fields 2 and 3 for items in groups of field 1.
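As a minimal sketch of that kind of call (the sample data is hypothetical, and the exact command line is an assumption based on the description above), the group-and-sum step might look like this:

```shell
# Hypothetical sample data: TAB-separated, group name in field 1,
# numbers in fields 2 and 3, lines for each group already adjacent.
sample=$(printf 'g1\t1\t10\ng1\t2\t20\ng2\t3\t30\n')

# -g 1 groups on field 1; "sum 2 sum 3" totals fields 2 and 3 per group.
# Guarded so the sketch does nothing where datamash is not installed.
if command -v datamash >/dev/null 2>&1; then
    printf '%s\n' "$sample" | datamash -g 1 sum 2 sum 3
fi
```

Note that -g expects the lines of each group to be adjacent, which is why the sorting step matters.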
Simple as this appears, there are additional complexities. First, datamash, like many standard utilities, expects TAB-delimited files by default. Although datamash can ignore headers, a single sed operation can both replace runs of spaces with a TAB and delete the header line. We can then append all the modified input files to a single input file, which is also what datamash likes.
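A sed call along these lines could do the cleanup. This is only a sketch: the input lines are made up, and it assumes the header is exactly the first line of the file.

```shell
# Literal TAB kept in a variable, because "\t" inside a sed replacement
# is a GNU extension and not portable.
tab=$(printf '\t')

# Hypothetical input: one header line, then space-separated columns.
raw=$(printf 'name  val1 val2\ng1    1    10\ng2 3 30\n')

# 1d drops the header; the s command turns each run of spaces into one TAB.
clean=$(printf '%s\n' "$raw" | sed -e '1d' -e "s/  */$tab/g")
printf '%s\n' "$clean"
```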
As you can imagine, it is best and easiest when the lines for each group are collected together. There is a datamash option for such sorting, but your group names mix alphabetic and numeric characters -- perhaps called a hybrid string. A program that can handle that is msort.
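I am not showing msort options here. As a stand-in, and this is a swapped-in technique rather than the msort approach, the version ordering in GNU sort illustrates the idea on some made-up group names:

```shell
tab=$(printf '\t')

# Hypothetical group names that mix letters and digits.
unsorted=$(printf 'g10\t1\t1\ng2\t2\t2\ng1\t3\t3\n')

# -t sets TAB as the field separator; -k1,1V sorts on field 1 only,
# using version ordering so that g2 comes before g10.
sorted=$(printf '%s\n' "$unsorted" | sort -t "$tab" -k1,1V)
printf '%s\n' "$sorted"
```

A plain lexical sort would place g10 ahead of g2, which is the problem with hybrid strings.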
This data preparation can be combined into a loop that can handle a number of data files. Here we have added 3 additional data files as an illustration. The script uses as input all file names that begin with the string data -- data1, data2, etc.
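A loop of roughly this shape would do the preparation. The file contents and the name combined.tsv are hypothetical, chosen only for the illustration:

```shell
tab=$(printf '\t')
workdir=$(mktemp -d)            # scratch directory for this sketch
cd "$workdir" || exit 1

# Hypothetical inputs: each file has a header line and space-separated columns.
printf 'name val1 val2\ng1 1 10\ng2 3 30\n' > data1
printf 'name val1 val2\ng1 2 20\n'          > data2

: > combined.tsv                # start with an empty output file
for f in data*; do
    # Drop the header, convert runs of spaces to one TAB, append to one file.
    sed -e '1d' -e "s/  */$tab/g" "$f" >> combined.tsv
done
cat combined.tsv
```

The output name deliberately does not begin with data, so the glob never picks it up as an input.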
Then we can run the command as noted above.
If we want to make the output pretty, we can add a header, and use a simple perl script called align, which aligns fields automatically, but can also be directed to align left, center, right, etc.
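The align script itself is not reproduced here. As a rough substitute, fixed-width printf formatting in awk gives a similar effect. The column titles and widths below are assumptions, not taken from the actual output:

```shell
tab=$(printf '\t')

# Hypothetical summed output: group name, then two totals.
sums=$(printf 'g1\t3\t30\ng10\t7\t5\n')

# Prepend a made-up header line, then left-align the group name
# and right-align the two numeric columns.
pretty=$( { printf 'group\tsum2\tsum3\n'; printf '%s\n' "$sums"; } |
    awk -F "$tab" '{ printf "%-8s %6s %6s\n", $1, $2, $3 }' )
printf '%s\n' "$pretty"
```

Unlike align, this needs the widths chosen by hand, so it is only reasonable when the field sizes are known in advance.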
With all that in mind, here is a script that shows these operations and the results:
producing:
Here are some details about the utilities used:
Best wishes ... cheers, drl
Awesome, drl! Great work, I appreciate all the effort.
I'm new to awk, and I've been searching the forum for how to sum a column, but all the scripts sum a column over an entire file.
I've a file like this:
cat file.txt
1234 5678
5678 1234
I want to use awk to do sum of each column per line not entire file, compare the two then write the... (1 Reply)
So I need to Write an array processing program using a Linux shell programming language to perform the following.
Load array X of 20 numbers from an input file X.
Load array Y of 20 numbers from an input file Y.
Compute array Z by multiply Xi * Yi then compute the square-root of this... (2 Replies)
Hi,
Sure it's an easy one, but it drives me insane.
input ("|" separated):
1|A,B,C,A
2|A,D,D
3|A,B,B
I would like to count the occurrences of each capital letter in $2 across the entire file, knowing that duplicates in each record count as 1.
I am trying to get this output... (5 Replies)
I need your help to discover missing elements for each box.
In theory each box should have 4 items: ITEM01, ITEM02, ITEM08, and ITEM10.
Some boxes either have a missing item (BOX02 ITEM08) or might have a duplicate item (BOX03 ITEM02) and be missing another one (BOX03 ITEM01).
file01.txt
... (2 Replies)
Hi All,
need help with reading the array and sum of the array elements.
given an array of integers of size N . You need to print the sum of the elements in the array, keeping in mind that some of those integers may be quite large.
Input Format
The first line of the input consists of an... (1 Reply)
Hi,
Is there a concept of labels in the vi editor? In the mainframe ISPF editor there is a concept of labels, where one can label a line, say ".a", and after that, wherever you are in the file, if one wants to go back to the line where the label was set, one can do so with "l .a". Is there... (1 Reply)
Hi
I have redc containing the values 3, 6, 2, 8, and 1.
I have work containing the values 8, 2, 11, 7, and 9.
Is there a way to find the sum of redc and work?
I need to compare the sum of those two arrays to something else, so is it okay to put that into my END?
TY! (4 Replies)
If I declare both but don't assign any values, what values will the int array and file pointer array have by default? And if I want to reset any elements of both arrays to the default, should I just set them to 0, or NULL, or what? (1 Reply)
Hi all,
I wanted to access two arrays (of same size) using one for loop.
Ex:
#!/bin/bash
declare -a num
declare -a words
num=(1 2 3 4 5 6 7)
words=(one two three four five six seven)
for num in ${num}
do
echo ":$num: :${words}:"
done
Required Output:
:1: :one: (11 Replies)
PHP question...I posted this on the Web Development forum, but maybe this is a better place!
I have an SQL query that's pulled back user IDs as a set of columns. Rather than IDs, I want to use their names.
So I have an array of columns $col with values 1,7,3,12 etc and I've got an array $person... (3 Replies)