Summing column value - using PERL


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Summing column value - using PERL
# 1  
Old 02-24-2010
Java Summing column value - using PERL

I'm new to perl programming. I've a csv file as below.

Code:
20100221, abc_1, 200
20100221, abc_4, 350
20100221, opq_3, 200
20100221, abc_5, 220
20100221, xyz_1, 500
20100221, abc_2, 500
20100221, abc_3, 100
20100221, xyz_2, 700
20100221, opq_2, 350
20100221, xyz_3, 100
20100221, opq_1, 230

I'm trying to get output as below, by adding matching values:

Code:
Output:

20100221, abc, 1370
20100221, xyz, 1300
20100221, opq, 780

I've started coding as below, but not able to continue due to lack of experience.

Code:
use strict;
use warnings;

my $inputfilename  = $ARGV[0]; ### CSV file name. ###

open( my $in_fh, "<", $inputfilename ) or die "Can't open $inputfilename : $!";

while (my $line = <$in_fh>) {
  chomp($line);
  my @columns = split(',', $line, 2);


Could any of you help me to accomplish my requirement?

Thanks and Regards / Lokesha
# 2  
Old 02-24-2010
Try this
Code:
use strict;
use warnings;

my $inputfilename  = "file1.txt" ;

open( my $in_fh, "<", $inputfilename ) or die "Can't open $inputfilename : $!";
my @columns  ;
my %store ;
while (my $line = <$in_fh>) {
  chomp($line);
#@columns = split(',', $line, 2);
        $line =~ /^(.*?)\s*,(.*?)\s*,(.*?)\s*$/ ;
        my $no = $1 ;
        my $cut = $2 ;
        my $cnt = $3 ;
        $cut =~ s/^(.*?)_[0-9]*$/$1/;
        $store{$no}{$cut}+=$cnt ;
  }

foreach my $no ( keys%store)
{

foreach my $cut ( keys%{$store{$no}})
{
    print "$no $cut $store{$no}{$cut} \n";
}

}

# 3  
Old 02-24-2010
One using awk:

Code:
awk -F, '{ A[$1 substr($2,1,4)]+=$3; }END{ for(i in A) print i,A[i]; }' file

One in perl,

Code:
while (<DATA>)
{
@A=split ",|_",$_;
$h{"$A[0],$A[1]"} += $A[3];
}
while (($k,$v)= each(%h)) { print join ",",( $k,$v,"\n"); }

__DATA__
20100221, abc_1, 200
20100221, abc_4, 350
20100221, opq_3, 200
20100221, abc_5, 220
20100221, xyz_1, 500
20100221, abc_2, 500
20100221, abc_3, 100
20100221, xyz_2, 700
20100221, opq_2, 350
20100221, xyz_3, 100
20100221, opq_1, 230


Last edited by dennis.jacob; 02-24-2010 at 05:26 AM..
# 4  
Old 02-24-2010
Thanks for the reply to all, and for the provided different solutions.

I tried abubacker's solution and got the below result with warning messages and one unwanted additional line, as below:

Code:
Use of uninitialized value in substitution (s///) at GEN_Applications_Duration.pl line 17, <$in_fh> line 12.
Use of uninitialized value in hash element at GEN_Applications_Duration.pl line 18, <$in_fh> line 12.
Use of uninitialized value in hash element at GEN_Applications_Duration.pl line 18, <$in_fh> line 12.
Use of uninitialized value in addition (+) at GEN_Applications_Duration.pl line 18, <$in_fh> line 12.
 opq  0  ## Extra outpu line, which is unwanted ##
20100221  xyz 1300
20100221  opq 780
20100221  abc 1370

How to eliminate warninngs and extra output line. Also, output should be comma seperated as below:

Code:
20100221,  xyz, 1300

Could you please helm me again?
# 5  
Old 02-24-2010
I guess that you're file would have some empty lines ,

Code:
 
use strict;
use warnings;

my $inputfilename  = "file1.txt" ;

open( my $in_fh, "<", $inputfilename ) or die "Can't open $inputfilename : $!";
my @columns  ;
my %store ;
while (my $line = <$in_fh>) {
  chomp($line);
#@columns = split(',', $line, 2);
  if ( $line !~ /^$/ )   # checks for the non empty line 
  {
        $line =~ /^(.*?)\s*,(.*?)\s*,(.*?)\s*$/ ;
        my $no = $1 ;
        my $cut = $2 ;
        my $cnt = $3 ;
        $cut =~ s/^(.*?)_[0-9]*$/$1/;
        $store{$no}{$cut}+=$cnt ;
  }
  }

foreach my $no ( keys%store)
{

foreach my $cut ( keys%{$store{$no}})
{
    print "$no $cut $store{$no}{$cut} \n";
}

}

So please try this ,
If you're very sure that your file does not have empty line , then make
sure it always has the proper format ex :20100221, abc_3, 100
# 6  
Old 02-24-2010
MySQL

Wareh wahh!!! I'm thrilled Smilie
Tons of thanks to abubacker, it works exactly as expected. Smilie

Added the feature to this code, is it possible to handle multiple input files with different dates as below?

Code:
input file name: input_20100221.csv
20100221, abc_1, 200
20100221, abc_4, 300
20100221, opq_3, 200
20100221, abc_5, 200
20100221, xyz_1, 500
20100221, abc_2, 500
20100221, abc_3, 100
20100221, xyz_2, 700
20100221, opq_2, 300
20100221, xyz_3, 100
20100221, opq_1, 200

input file name: input_20100222.csv
20100222, abc_1, 100
20100222, abc_4, 200
20100222, opq_3, 200
20100222, abc_5, 200
20100222, xyz_1, 100
20100222, abc_2, 200
20100222, abc_3, 100
20100222, xyz_2, 200
20100222, opq_2, 800
20100222, xyz_3, 600
20100222, opq_1, 700

input file name: input_20100224.csv
20100224, abc_1, 600
20100224, abc_4, 400
20100224, opq_3, 200
20100224, abc_5, 300
20100224, xyz_1, 300
20100224, abc_2, 200
20100224, abc_3, 700
20100224, xyz_2, 200
20100224, opq_2, 200
20100224, xyz_3, 900
20100224, opq_1, 800

Output should be exactly as below:

Code:
Rundate,  abc,   opq, xyz  ### header, only one time
20100221, 1300,  700, 1300
20100222,  800, 1700,  900
20100224, 2200, 1200, 1400

Cheers ~~
# 7  
Old 02-24-2010
What if you read from stdin instead a named file? I mean

Code:
cat file1 file2 | perl -e '.......'

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Mismatch in summing a column in UNIX

Hello, I am facing issue in summing up a column in unix.I am displaying a column sum up to 4 decimal places and below is the code snippet sed '1d' abc.csv | cut -d',' -f7 | awk '{s+=$1}END{ printf("%.4f\n",s)}' -170552450514.8603 example of data values in the column(not... (3 Replies)
Discussion started by: karthik adiga
3 Replies

2. Shell Programming and Scripting

awk split columns after matching on rows and summing the last column

input: chr1 1 2 3 chr1 1 2 4 chr1 2 4 5 chr2 3 6 9 chr2 3 6 10 Code: awk '{a+=$4}END{for (i in a) print i,a}' input Output: chr112 7 chr236 19 chr124 5 Desired output: chr1 1 2 7 chr2 3 6 19 chr1 2 4 5 (1 Reply)
Discussion started by: jacobs.smith
1 Replies

3. Shell Programming and Scripting

Summing a number column

hi All, i have a file in which only one column is there., test.txt ====== -900.01 -900.02 -900.03 -900.04 -900.05 -900.06 -900.07 -900.08 -900.09 900.01 900.02 900.03 900.04 900.05 (4 Replies)
Discussion started by: mechvijays
4 Replies

4. Shell Programming and Scripting

Please Help!!!! Awk for summing columns based on selected column value

a,b,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,aa,bb,cc,dd,ee,ff,gg,hh,ii a thru ii are digits and strings.... The awk needed....if coloumn 9 == i (coloumn 9 is string ), output the sum of x's(coloumn 22 ) in all records and sum of y's (coloumn 23 ) in all records in a file (records.txt).... (6 Replies)
Discussion started by: BrownBob
6 Replies

5. Shell Programming and Scripting

Summing up rows data regarding 1st column

Dear all, I have one file like LABEL A B C D E F G H I J K L M N G02100 64651.3 25630.7 8225.21 51238 267324 268005 234001 52410.9 18598.2 10611 10754.7 122535 267170 36631.4 G02100 12030.3 8260.15 8569.91 ... (4 Replies)
Discussion started by: AAWT
4 Replies

6. Shell Programming and Scripting

Compare Two Files(Column By Column) In Perl or shell

Hi, I am writing a comparator script, which comapre two txt files(column by column) below are the precondition of this comparator 1)columns of file are not seperated Ex. file1.txt 8888812341181892 1243548895685687 8945896789897789 1111111111111111 file2.txt 9578956789567897... (2 Replies)
Discussion started by: kumar96877
2 Replies

7. Shell Programming and Scripting

summing values of a column

I have a file which contains data as below: ----------------------------------------------------------------------------------------------- GSPWeb Statistics for the period of last 20 days... (3 Replies)
Discussion started by: mohsin.quazi
3 Replies

8. Shell Programming and Scripting

selecting column in perl

Dear all, I have a rather large file of numbers which i would like to read into a script and then do some maths on a specific column( e.g column). so far i have been using the following awk command awk '{print $4}' infile.txt > out.tmp to strip out the desired column within the in perl... (3 Replies)
Discussion started by: Mish_99
3 Replies

9. UNIX for Dummies Questions & Answers

summing according to the column

I have a text file with two columns the first column is an integer and the second column is date how do i sum up the first column according to the date example 123 jan1 232 jan1 473 jan2 467 jan2 356 jan3 376 jan3 my result should be 355 jan1 940 jan2 732 jan3 how do i... (2 Replies)
Discussion started by: ramky79
2 Replies

10. Shell Programming and Scripting

Summing on column

Hi Friends How to do sum on a column? I have a file like: FRED 500.01 TX SMITH 50.10 NY HARRY 5.00 CA 555.11 Sum on second column. I am trying using nawk like nawk 'BEGIN {FS="|"}; {printf $1"+"}' Thanks a lot for your help S :) (2 Replies)
Discussion started by: sbasetty
2 Replies
Login or Register to Ask a Question