HELP with Unix scripts in summing columns in a file

 
# 8  
Old 08-21-2012
Okay, I'm taking all your suggestions into consideration, because I basically don't know what the best approach is here.

In that BEGIN section, is it okay if the file that I load there comes from another directory?

Also, there's a possibility that there's no 2nd or 3rd file, so that BEGIN section should be in an if statement, right?
# 9  
Old 08-21-2012
You can't put a BEGIN block inside an if statement. Like END, it has to stand outside by itself. You can put an if statement inside a BEGIN block, though.

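Yes, you can specify paths to getline: while ((getline line < "/path/to/filename") > 0) { }. The extra parentheses make sure the > 0 test is applied to getline's return value and not to the filename.

If the files don't exist, the loops will just be skipped, because getline returns -1 for a file it cannot open and the > 0 test fails.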
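To make the getline loop concrete, here is a minimal sketch. The helper name `load_keys`, the `SKIP` array and the file names are made up for illustration, and it assumes the paths contain no spaces. It loads the first field of every line from each listed file inside BEGIN and prints how many lines it loaded; a file that does not exist contributes nothing.

```shell
# Hypothetical helper: load keys from a space-separated list of paths
# inside BEGIN and print the number of lines that were loaded.
load_keys() {
    awk -v files="$1" 'BEGIN {
        n = split(files, path, " ")
        for (i = 1; i <= n; i++) {
            # getline returns 1 per line read, 0 at end of file and
            # -1 if the file cannot be opened, so a missing file is skipped.
            while ((getline line < path[i]) > 0) {
                split(line, f)      # split on whitespace, key is f[1]
                SKIP[f[1]] = 1
                count++
            }
            close(path[i])
        }
        print count + 0
    }'
}

# Demo: one file in another directory, one file that does not exist.
dir=$(mktemp -d)
printf '12345\n67890\n' > "$dir/second.txt"
load_keys "$dir/second.txt $dir/does-not-exist.txt"   # prints 2
rm -r "$dir"
```

Because the missing file is handled by the loop condition itself, no separate if statement is needed around it.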
# 10  
Old 08-21-2012
Oh yeah, got it, I forgot about that, so I'll just put an if statement in the BEGIN section to check if there's a 2nd or 3rd file.

I'm actually getting the hang of it now, that getline suggestion is really cool.

But let's say there's a 2nd and 3rd file, and their data gets loaded before the processing/summation starts. How do I exclude/skip those lines so that they won't be included in the summation?

Like, what should I put in the code to exclude those from being summed?

PS: I'm really grateful for your replies, I feel like I'm getting the hang of how to proceed. Thanks a lot!
# 11  
Old 08-21-2012
Quote:
Originally Posted by ramneim
oh yeah, got it, forgot about that, so i'll just put an if statement in the begin section, to check if there's a 2nd or 3rd file.
The getline loop won't run if the file doesn't exist. An if might not be necessary.
Quote:
but let's say, there's a 2nd and 3rd file, and then their data got loaded before doing the processing/summation, so how do i exclude/skip those lines so that they won't be included in summation?
Put them in the array, too. If you somehow arrange for B["12345"]=1 to be set before the real data gets checked, you can check if B["12345"] is true and skip it if it is.
Quote:
like what should i put in the codes to exclude those from being summed?
Quote:
Originally Posted by Corona688
When reading lines of data, check whether the array entry for that line's key is set, like if(SKIP[$col]) { next }, where 'next' will cause that line to be skipped.
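Put together, the load-then-skip idea could look like the sketch below. The helper name `sum_without`, the choice of column 1 as the key and column 2 as the value, and the sample numbers are assumptions for illustration, since the real file layout is not shown in this part of the thread.

```shell
# Hypothetical helper: sum column 2 of the data file ($1), ignoring every
# line whose first field is listed (one key per line) in the file $2.
sum_without() {
    awk -v skipfile="$2" '
        BEGIN {
            # Load the keys to exclude before any data line is read.
            if (skipfile != "")
                while ((getline key < skipfile) > 0)
                    SKIP[key] = 1
        }
        {
            if (SKIP[$1]) { next }   # excluded key: skip this line
            total += $2
        }
        END { print total + 0 }
    ' "$1"
}

# Demo with made-up data: key 12345 is excluded from the sum.
dir=$(mktemp -d)
printf '12345 10\n22222 5\n12345 7\n33333 1\n' > "$dir/data.txt"
printf '12345\n' > "$dir/skip.txt"
sum_without "$dir/data.txt" "$dir/skip.txt"   # prints 6
rm -r "$dir"
```

If the exclusion file is missing, the loop never runs, SKIP stays empty and every line is summed.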
# 12  
Old 08-21-2012
Hmm okay, thanks a lot for your help! It's getting pretty late here already, it's 1 in the morning and I'm already sleepy. I'll add these to my script and maybe show it to you tomorrow if I bump into problems again :)

But thanks a lot! You are a great help to everyone in this forum! :)
# 13  
Old 08-21-2012
I think you have already got the best advice possible in terms of implementation, so I will concentrate on the intrinsics of the problem:

Basically you have to parse a file, do a group change and reduce your input set by a set of key values. All these procedures/algorithms are basic tasks which you will have to handle on a daily basis in one or the other form in your future job. Probably this is why your professor designed the task this way.

Here is what a group change is and how it is carried out:

You have a file/table/data structure with basically two columns: a "key" column and a "value" column. You sort that file by the key, then read it line by line, and every time the key value changes you have to carry out some action.

For instance: Every customer of a business, instead of paying, will have written a record in a book (=file): his name and the value of the good he takes. At the end it looks like:

Code:
customerA  50,-
customerA  25,-
customerB  70,-
customerA  15,-
customerC  60,-
customerB  25,-
...

The task is to sum up the total for every customer and print a list with the totals. This is how it is done:

First sort the file for the key value, which is the first field in this case (the customer name):

Code:
customerA  50,-
customerA  25,-
customerA  15,-
customerB  70,-
customerB  25,-
customerC  60,-
...

Now read that file line by line. Store the last value of your key in a buffer. Whenever the key changes, the comparison of the new key with the key in this buffer will fail. Then you have to do your group action (in this case print the total and reset it) and start over with the new key stored in the buffer. Optionally you might have a default action for every line, which is carried out after this check so that a line's value is counted for its own group: in this case, add the value to the total. (Don't forget that at the file's end you have to end-process your last key value.)

Here it is in pseudo-code:

Code:
LAST=""
while (lines to process)
     KEY := KEY value of the line
     VAL := VALUE value of the line
     if ( LAST is not empty and KEY differs from LAST )
          group_end_process( LAST [, ..... ] )
     endif
     LAST := KEY
     line_process( KEY, VAL )
     next line
end while
end_process( LAST [, ..... ] )
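As a rough illustration, the single group change can be written in awk like this. The helper name `group_totals` is made up, and the sketch assumes the two-column "name amount" layout of the customer example, with the ",-" suffix stripped before adding. The total is printed for the previous key before the new line is added, and the END block handles the last group.

```shell
# Hypothetical helper: print one total per customer from "name amount" lines.
group_totals() {
    sort -k1,1 "$1" | awk '
        # Key changed: finish the previous group before touching this line.
        NR > 1 && $1 != last { print last, total; total = 0 }
        {
            amount = $2
            sub(/,-$/, "", amount)   # turn "50,-" into "50"
            total += amount
            last = $1
        }
        # End of input: the last group still has to be reported.
        END { if (NR > 0) print last, total }
    '
}

# Demo with the customer records from the example above.
dir=$(mktemp -d)
printf '%s\n' 'customerA  50,-' 'customerA  25,-' 'customerB  70,-' \
              'customerA  15,-' 'customerC  60,-' 'customerB  25,-' > "$dir/book.txt"
group_totals "$dir/book.txt"
# prints:
# customerA 90
# customerB 95
# customerC 60
rm -r "$dir"
```

The pattern lines correspond to the group-end check, the per-line action and the end processing of the pseudo-code, in that order.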

This is called a "single group change". There are also multiple group changes: suppose the customers want a detailed statement, in which the goods are sorted into groups, with a subtotal for every group of goods and a grand total for the customer (dual group change). It needs a second key field, and I am sure you could already write the pseudo-code for this one yourself.
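For comparison once you have tried it yourself, here is one possible sketch of the dual group change in awk. Everything specific in it is an assumption for illustration: the helper name `statement`, a three-column "customer group amount" layout with plain numbers, and the output format. A change of either key closes the current group of goods; a change of the customer key additionally closes the customer.

```shell
# Hypothetical helper: subtotal per (customer, group) plus a grand total
# per customer, from "customer group amount" lines.
statement() {
    sort -k1,1 -k2,2 "$1" | awk '
        function end_group()    { print lastcust, lastgrp, sub_total; sub_total = 0 }
        function end_customer() { print lastcust, "TOTAL", grand; grand = 0 }

        # Either key changed: close the inner group, and the outer one
        # as well if the customer changed.
        NR > 1 && ($1 != lastcust || $2 != lastgrp) {
            end_group()
            if ($1 != lastcust) end_customer()
        }
        { sub_total += $3; grand += $3; lastcust = $1; lastgrp = $2 }
        END { if (NR > 0) { end_group(); end_customer() } }
    '
}

# Demo with made-up records.
dir=$(mktemp -d)
printf '%s\n' 'customerA food 50' 'customerB tools 70' 'customerA tools 15' \
              'customerA food 25' 'customerB food 25' > "$dir/book.txt"
statement "$dir/book.txt"
# prints:
# customerA food 75
# customerA tools 15
# customerA TOTAL 90
# customerB food 25
# customerB tools 70
# customerB TOTAL 95
rm -r "$dir"
```

The inner group is always closed before the outer one, so the last subtotal of a customer appears above that customer's grand total.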

I hope this helps in understanding.

(corollary: read carefully the man page of "awk", especially the part about how awk works and how "awk" scripts are formed. Does that ring a bell?)

bakunin

Last edited by bakunin; 08-21-2012 at 05:02 PM..
