HELP with Unix scripts in summing columns in a file

 
Thread Tools Search this Thread
Homework and Emergencies Homework & Coursework Questions HELP with Unix scripts in summing columns in a file
# 1  
Old 08-21-2012
HELP with Unix scripts in summing columns in a file

1. The problem statement, all variables and given/known data:
Hi guys, i'm a new guy here, and it's my first time creating a unix script.
can you guys help me out here? i'd really appreciate it. Smilie

Here's my problem:

This is the file i'm using, it has 6 columns, the first three columns are like the key columns, and i put one header row. I need to sum the values of 4th, 5th and 6th row, by grouping each row by date, prod id and store id.

1st file:
Code:
date		prod id	store id	amt 1	      amt 2	    amt 3
20111017	45671	11801	-12	      -12348	    1080
20111017	45671	11801	12	      12348	    1030
20111018	45672	11802	7	      7868	    990
20111018	45672	11802	4	      4496	    950
20111019	45673	11803	-12	      -12348	    910
20111019	45674	11804	12	      12348	    866

then i have this second file:
this file contains a list of "prod id's" that should be ignored in the first file.

2nd file:
Code:
prod id
45671

The third file consists of a list of "store id's" that should be ignored from the 1st file.

3rd file:
Code:
store id
11803

So basically, i need to come up with a script to sum the 4th, 5th and 6th columns of the first file, while taking consideration, those that are present in the 2nd and 3rd file, so that those that should be ignored should not be included when columns are summed.

I would really appreciate a lot of love and help and mercy from you guys.

Thanks a lot! Smilie



2. Relevant commands, code, scripts, algorithms:
UNIX scripts, awk, printf, $ and all those stuff


3. The attempts at a solution (include all code and scripts):
i tried tweaking some scripts i found across the net, but im basically lost.


4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
i'm a computer engineering student from the phil., a noob one at that matter, my professor's engr mendoza.

Note: Without school/professor/course information, you will be banned if you post here! You must complete the entire template (not just parts of it).

Last edited by Corona688; 08-21-2012 at 12:56 PM..
This User Gave Thanks to ramneim For This Post:
# 2  
Old 08-21-2012
We can't just hand you an answer. We can help you build something but can't do the whole thing.

You can sum in awk like this:
Code:
awk '{
        for(N=4; N<=NF; N++) A[$1","$2","$3","N]+=$N
        B[$1","$2","$3]++
 }
END {
        for(X in B)
        for(N=4; N<=8; N++)
                print X, A[X","N]
}' datafile

If you make a BEGIN section, you can load lines from your other files with getline <"filename" statements to load the variables you want to skip later.
This User Gave Thanks to Corona688 For This Post:
# 3  
Old 08-21-2012
i have this script that basically sums two columns in a file, but does not group data.. and does not consider records to be ignored, so i wanted to add that function to improve this script..

it goes like this:

Code:
FILENAME=$1
DELIMITER=$2
FIRST_COL=$3
SECOND_COL=$4

for FILE in ${FILENAME}
do
  echo ${FILE}

  #########################################
  #GET SUMS OF THE 2 COLUMNS
  #########################################
  awk -v col=$FIRST_COL -v col2=$SECOND_COL -F "$DELIMITER" '
  {
    if(NR!=1) #comment this line out if you want the first row included
    {
        col1Sum += $col
        col2Sum += $col2
    }
  }
  END{
    printf("\t%s\t%s\t%s\n","", "Col1", "Col2")
    printf("\t%s\t%d\t%d\n","Sums:", col1Sum, col2Sum)
    printf("\t%s\t%s\t%s\n","", "", "")
  }' ${FILE}

  #########################################
  #GET COLUMNS
  #########################################
  awk -F  "${DELIMITER}" '{print "\tNo of Columns: " NF}' ${FILE} | sort | uniq
  
  #########################################
  #GET ROWS
  #########################################
  let ROWS=`cat ${FILE} | wc -l`-1
  echo "\tNo of Rows: ${ROWS}"

done


Last edited by joeyg; 08-21-2012 at 01:06 PM.. Reason: Please wrap data and sripts with CodeTags
# 4  
Old 08-21-2012
You can use arrays to group the data. Instead of X += Y, you can do X[index] += Y, where 'index' can be one column or several columns, like X[$1","$2","$3] += $4

Then you can loop over the array like for(Z in X) to get every array index in X, and access the contents like X[Z]
This User Gave Thanks to Corona688 For This Post:
# 5  
Old 08-21-2012
You can also make a BEGIN { } section that works sort of like the END { } section but opposite, and use it to load data from other files before the rest of the program starts processing lines.

You can load lines from those files likewhile(getline<"filename" >0) { ... } and lines will appear in the $1 $2 ... variables like you'd expect.
# 6  
Old 08-21-2012
hmm okay, how about the scenario where i want some rows to be ignored when they are summed?

how can i add to the script, where in it also references to the 2nd and 3rd file, to look for some values and compare it with 1st file, and if there's a match, those matches will be excluded from summation.. .?
# 7  
Old 08-21-2012
Load data from the files in the BEGIN section and put them in an array.

When reading lines of data, check then if the array for that entry is blank like if(SKIP[$col]) { next } where 'next' will cause that line to be skipped.
This User Gave Thanks to Corona688 For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Mismatch in summing a column in UNIX

Hello, I am facing issue in summing up a column in unix.I am displaying a column sum up to 4 decimal places and below is the code snippet sed '1d' abc.csv | cut -d',' -f7 | awk '{s+=$1}END{ printf("%.4f\n",s)}' -170552450514.8603 example of data values in the column(not... (3 Replies)
Discussion started by: karthik adiga
3 Replies

2. Shell Programming and Scripting

awk split columns after matching on rows and summing the last column

input: chr1 1 2 3 chr1 1 2 4 chr1 2 4 5 chr2 3 6 9 chr2 3 6 10 Code: awk '{a+=$4}END{for (i in a) print i,a}' input Output: chr112 7 chr236 19 chr124 5 Desired output: chr1 1 2 7 chr2 3 6 19 chr1 2 4 5 (1 Reply)
Discussion started by: jacobs.smith
1 Replies

3. Shell Programming and Scripting

Summing columns over group of lines

I have an input file that looks like: ID1 V1 ID2 V2 P1 P2 P3 P4 ..... n no. of columns 1 1 1 1 1.0000 1.0000 1.0000 1.0000 1 1 1 2 0.9999 0.8888 0.7777 0.6666 1 2 1 1 0.8888 0.7777 0.6666 0.5555 1 2 1 2 0.7777 0.6666 0.5555 0.4444 2 1 1 1 0.6666 0.5555 0.4444 0.3333 2 1 1 2 0.5555 0.4444... (4 Replies)
Discussion started by: sdp
4 Replies

4. Shell Programming and Scripting

Summing columns in line

I have a file with the following format AAAAA 1.34B 0.76B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.90B 0.00B 0.00B 0.46B 0.00B 0.03B 0.00B ... (4 Replies)
Discussion started by: ncwxpanther
4 Replies

5. UNIX for Dummies Questions & Answers

Summing lines in a file

Can anyone tell me how sum values in each record of a file and append that value to the end? For instance a typical record will be: FY12,Budget,771100,,,,,,,,,250,-250 I'd like the record to become FY12,Budget,771100,,,,,,,,,250,-250,0 which can be put into another file. Thank you. (6 Replies)
Discussion started by: LearningLinux2
6 Replies

6. Shell Programming and Scripting

Please Help!!!! Awk for summing columns based on selected column value

a,b,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,aa,bb,cc,dd,ee,ff,gg,hh,ii a thru ii are digits and strings.... The awk needed....if coloumn 9 == i (coloumn 9 is string ), output the sum of x's(coloumn 22 ) in all records and sum of y's (coloumn 23 ) in all records in a file (records.txt).... (6 Replies)
Discussion started by: BrownBob
6 Replies

7. Homework & Coursework Questions

HELP with Unix scripts in summing columns in a file.

Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted! 1. The problem statement, all variables and given/known data: Hi guys, i'm a new guy here, and it's my first time creating a unix script. can you guys help me out here? i'd... (3 Replies)
Discussion started by: ramneim
3 Replies

8. Shell Programming and Scripting

Summing values in columns

Basically I have to process a text file which has been sorted this way: John 12 John 13 John 10 John 900 Peter 20 Peter 30 Peter 32 The first column is a name, and the second an arbitrary value, both delimited by a space. How can I sum them up such that it would become: John 935... (2 Replies)
Discussion started by: Dwee
2 Replies

9. Shell Programming and Scripting

Summing the columns of a file

Hi All, I have a file like - num.txt 12, 34, 65, line1 34, 65, 89, line2 43, 65, 77, line3 I want to do two things - 1. Add first three columns of each line and print the line with largest value. i.e. (12+34+65) for 1st line and so on. 2. Add middle column of each line i.e.... (3 Replies)
Discussion started by: asahlot
3 Replies

10. Shell Programming and Scripting

Grouping and summing data through unix

Hi everyone, I need a help on Unix scripting. I have a file is like this Date Amt 20071205 10 20071204 10 20071203 200 20071204 300 20071203 400 20071205 140 20071203 100 20071205 100... (1 Reply)
Discussion started by: pcharanraj
1 Replies
Login or Register to Ask a Question