HELP with Unix scripts in summing columns in a file
1. The problem statement, all variables and given/known data:
Hi guys, i'm a new guy here, and it's my first time creating a unix script.
can you guys help me out here? i'd really appreciate it.
Here's my problem:
This is the file i'm using, it has 6 columns, the first three columns are like the key columns, and i put one header row. I need to sum the values of 4th, 5th and 6th row, by grouping each row by date, prod id and store id.
1st file:
then i have this second file:
this file contains a list of "prod id's" that should be ignored in the first file.
2nd file:
The third file consists of a list of "store id's" that should be ignored from the 1st file.
3rd file:
So basically, i need to come up with a script to sum the 4th, 5th and 6th columns of the first file, while taking consideration, those that are present in the 2nd and 3rd file, so that those that should be ignored should not be included when columns are summed.
I would really appreciate a lot of love and help and mercy from you guys.
Thanks a lot!
2. Relevant commands, code, scripts, algorithms:
UNIX scripts, awk, printf, $ and all those stuff
3. The attempts at a solution (include all code and scripts):
i tried tweaking some scripts i found across the net, but im basically lost.
4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
i'm a computer engineering student from the phil., a noob one at that matter, my professor's engr mendoza.
Note: Without school/professor/course information, you will be banned if you post here! You must complete the entire template (not just parts of it).
Last edited by Corona688; 08-21-2012 at 12:56 PM..
We can't just hand you an answer. We can help you build something but can't do the whole thing.
You can sum in awk like this:
If you make a BEGIN section, you can load lines from your other files with getline <"filename" statements to load the variables you want to skip later.
i have this script that basically sums two columns in a file, but does not group data.. and does not consider records to be ignored, so i wanted to add that function to improve this script..
it goes like this:
Last edited by joeyg; 08-21-2012 at 01:06 PM..
Reason: Please wrap data and sripts with CodeTags
You can use arrays to group the data. Instead of X += Y, you can do X[index] += Y, where 'index' can be one column or several columns, like X[$1","$2","$3] += $4
Then you can loop over the array like for(Z in X) to get every array index in X, and access the contents like X[Z]
You can also make a BEGIN { } section that works sort of like the END { } section but opposite, and use it to load data from other files before the rest of the program starts processing lines.
You can load lines from those files likewhile(getline<"filename" >0) { ... } and lines will appear in the $1 $2 ... variables like you'd expect.
hmm okay, how about the scenario where i want some rows to be ignored when they are summed?
how can i add to the script, where in it also references to the 2nd and 3rd file, to look for some values and compare it with 1st file, and if there's a match, those matches will be excluded from summation.. .?
Load data from the files in the BEGIN section and put them in an array.
When reading lines of data, check then if the array for that entry is blank like if(SKIP[$col]) { next } where 'next' will cause that line to be skipped.
Hello,
I am facing issue in summing up a column in unix.I am displaying a column sum up to 4 decimal places and below is the code snippet
sed '1d' abc.csv | cut -d',' -f7 | awk '{s+=$1}END{ printf("%.4f\n",s)}'
-170552450514.8603
example of data values in the column(not... (3 Replies)
I have a file with the following format
AAAAA 1.34B 0.76B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.00B 0.90B 0.00B 0.00B 0.46B 0.00B 0.03B 0.00B ... (4 Replies)
Can anyone tell me how sum values in each record of a file and append that value to the end? For instance a typical record will be:
FY12,Budget,771100,,,,,,,,,250,-250
I'd like the record to become
FY12,Budget,771100,,,,,,,,,250,-250,0
which can be put into another file.
Thank you. (6 Replies)
a,b,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,aa,bb,cc,dd,ee,ff,gg,hh,ii
a thru ii are digits and strings....
The awk needed....if coloumn 9 == i (coloumn 9 is string ), output the sum of x's(coloumn 22 ) in all records and sum of y's (coloumn 23 ) in all records in a file (records.txt).... (6 Replies)
Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted!
1. The problem statement, all variables and given/known data:
Hi guys, i'm a new guy here, and it's my first time creating a unix script.
can you guys help me out here? i'd... (3 Replies)
Basically I have to process a text file which has been sorted this way:
John 12
John 13
John 10
John 900
Peter 20
Peter 30
Peter 32
The first column is a name, and the second an arbitrary value, both delimited by a space. How can I sum them up such that it would become:
John 935... (2 Replies)
Hi All,
I have a file like -
num.txt
12, 34, 65, line1
34, 65, 89, line2
43, 65, 77, line3
I want to do two things -
1. Add first three columns of each line and print the line with largest value.
i.e. (12+34+65) for 1st line and so on.
2. Add middle column of each line i.e.... (3 Replies)
Hi everyone,
I need a help on Unix scripting.
I have a file is like this
Date Amt
20071205 10
20071204 10
20071203 200
20071204 300
20071203 400
20071205 140
20071203 100
20071205 100... (1 Reply)