UNIX for Dummies Questions & Answers, Post 302942107 by tpk on Friday 24th of April 2015 09:47:41 AM
Match sum of values in each column with the corresponding column value present in trailer record

Hi All,

I have a requirement to sum the values in each of columns D through O of a CSV file and check that each column's sum matches the value for that column in the trailer record.

For example, for column D, sum all the data records (excluding the header and trailer) and check whether that sum equals the column D value in the trailer record. The same check needs to be done for every column from D through O.
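A minimal sketch of the check described above, done in a single awk pass without temp files. It assumes the first line is the header, the last line is the trailer, and no field contains a quoted comma. The function name `check_sums` and the 0.005 tolerance are illustrative choices, not part of the original script.

```shell
#!/bin/sh
# check_sums FILE (hypothetical helper)
# Sums columns 4..15 (D..O) over the data rows and compares each sum,
# as a number, with the same column of the trailer (last) line.
# Exit status: 0 when every column matches, 1 otherwise.
check_sums() {
    awk -F, '
        NR == 1 { next }                      # skip the header
        NR > 2 {                              # prev is a data row, not the last line
            split(prev, f, ",")
            for (i = 4; i <= 15; i++) sum[i] += f[i]
        }
        { prev = $0 }                         # remember the line just read
        END {
            split(prev, t, ",")               # prev now holds the trailer
            bad = 0
            for (i = 4; i <= 15; i++) {
                d = sum[i] - t[i]             # numeric difference, so 100 equals 100.00
                if (d < 0) d = -d
                if (d > 0.005) {
                    printf "column %d: sum=%.2f trailer=%.2f\n", i, sum[i], t[i]
                    bad = 1
                }
            }
            exit bad
        }
    ' "$1"
}
```

It could be called as `check_sums pf_20150127.csv || exit 16`, which keeps the exit code the script below uses for a mismatch.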

For this I have written a shell script (I know you experts could do it in a better way, without creating so many temp files, but as I am very new to shell scripting I have just done it my own way).

The script behaves differently from file to file. For pf_20150127.csv it works perfectly, because the temp files I am comparing hold the same values; please find attached a snapshot of the matching values in the temp files (Sum_Match.jpg).

If I run the same script on pf_20150325.csv, the values do not match. The trailer record values in the original file are shown with 2 decimal places, while my sum output has no decimals. I don't understand whether this is a problem with the file or whether Unix has some internal mechanism that reads files and displays values differently. Please find attached the temp file outputs for this file (Sum_mismatch.jpg).
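One possible explanation, offered as an assumption because the attached files are not reproduced here: awk's `print` writes a whole-number result as a plain integer and applies `OFMT` only to results with a fractional part. A sum of 150 would therefore print as `150` even when the trailer holds the text `150.00`, and a text comparison of the two lines would fail. The sketch below shows that behaviour with the same `OFMT` value the script uses; `show_ofmt` is a hypothetical name.

```shell
#!/bin/sh
# show_ofmt (hypothetical helper): print one whole-number result and one
# fractional result under the OFMT setting used in the script.
show_ofmt() {
    awk -v OFMT="%.2E" 'BEGIN {
        print 100 + 50     # whole number: printed as an integer, OFMT is not applied
        print 2.5 + 0      # fractional: printed through OFMT, in exponent form
    }'
}
show_ofmt
```

The first line comes out as `150`; the second is rendered in exponent notation, which would not match a trailer written as `2.50` either.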

I believe it is not a file problem, so where is the problem in my script? How can I compare each sum with the value in the trailer record, regardless of whether the original trailer record has decimals or whole numbers?
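One way to make a line-by-line text comparison independent of how the numbers were written is to pass both lines through the same fixed format first. The sketch below is a suggestion under that assumption, not the thread's accepted answer; `normalize_row` is a hypothetical name and two decimal places is an assumed precision.

```shell
#!/bin/sh
# normalize_row (hypothetical helper): read comma-separated rows on stdin and
# rewrite every field with exactly two decimals, so 10, 10.0 and 10.00 all
# become 10.00 before any text comparison.
normalize_row() {
    awk -F, -v OFS=, '{
        for (i = 1; i <= NF; i++) $i = sprintf("%.2f", $i)
        print
    }'
}
```

For instance, `tail -1 "$filename" | cut -d, -f4- | normalize_row` would give the trailer values in that format, and the line of sums could be piped through the same function (with `OFMT` left at its default) before the two temp files are compared.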

I have attached the actual test CSV files mentioned above and the temp file outputs for both. Please help me out, as I really need help and cannot think of any other way of doing this. If I have to change my design entirely to meet the requirement, please say so and suggest a solution.

Thanks in advance!

Code:
#!/usr/bin/sh
#
# Stop if the landing directory cannot be entered
cd /var/datastage/FRPDEVL/work/source/landing/dspf || exit 1
for fname in pf_*.csv; do
    # Check that the file exists in the landing directory, then validate it
    if [ -f "$fname" ]
    then
        echo "Expected file(s) found, Performing Validations for file: $fname"
        filename=`basename "$fname"`
        fdate=`echo "$filename" | tr -dc '[:digit:]'`
        echo "$filename,$fdate"

        # Validation 1: the sum of each column from D to O (numeric data type) should equal
        # the value present in the trailer row for the respective column.
        if [ "$filename" = "pf_$fdate.csv" ]
        then
            echo "------------------------------------------------------------------------------------"
            echo "Checking Specific Validations 2 for File: $filename"
            echo "------------------------------------------------------------------------------------"
            # Drop the header and trailer records and write the data rows to temp1_$fdate.tmp
            sed '1d;$d' "$filename" > "temp1_$fdate.tmp"

            # Take columns D onward from the trailer record of the original file and write them to
            # temp_original_$fdate.tmp, which is compared with the computed sums below
            tail -1 "$filename" | cut -d "," -f 4- > "temp_original_$fdate.tmp"

            # Sum columns D to O of temp1_$fdate.tmp and write the result to temp_sum_$fdate.tmp
            awk -F, -v OFS="," -v OFMT="%.2E" '{s1+=$4;s2+=$5;s3+=$6;s4+=$7;s5+=$8;s6+=$9;s7+=$10;s8+=$11;s9+=$12;s10+=$13;s11+=$14;s12+=$15}END{print s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12}' "temp1_$fdate.tmp" > "temp_sum_$fdate.tmp"
            #awk -F, -v OFS="," '{s1+=$4;s2+=$5;s3+=$6;s4+=$7;s5+=$8;s6+=$9;s7+=$10;s8+=$11;s9+=$12;s10+=$13;s11+=$14;s12+=$15}END{print s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12}' temp1_$fdate.tmp>temp_sum_$fdate.tmp

            # Compare the trailer values from the original file with the sums of columns D to O.
            # If the two lines are identical as text, the matching line is printed and the count
            # is 1. If they differ, the count is 0.
            val=`awk 'NR==FNR{a[$0];next}$0 in a{print $0}' "temp_original_$fdate.tmp" "temp_sum_$fdate.tmp" | wc -l`

            # If $val is 0, the sums do not match the trailer record, so stop the job
            if [ "$val" -eq 0 ]
            then
                echo "The sum of one or more columns does not match the trailer value of the corresponding column. Hence exiting the Job"

                # If the validation fails, remove all the temporary files before exiting
                #rm -f temp1_$fdate.tmp
                #rm -f temp_original_$fdate.tmp
                #rm -f temp_sum_$fdate.tmp

                # Exit with code 16 if the sums do not match
                exit 16
            else
                echo "Sums are matching"
            fi
            echo "------------------------------------------------------------------------------------"
            echo "Specific Validations check for File: $filename completed"
            echo "------------------------------------------------------------------------------------"
        fi

        # Remove all temp files if all the validations pass
        #rm -f temp_$fdate.tmp
        #rm -f temp1_$fdate.tmp
        #rm -f temp_original_$fdate.tmp
        #rm -f temp_sum_$fdate.tmp

    # If no files are present in the landing directory, skip validation and exit with normal status
    else
        echo "Expected files not found, Hence not performing any validations"
        exit 0
    fi

# End of main for loop
done

With Regards,
TPK
 
