awk arrays can do this better - but how?


 
# 1  
Old 11-24-2008

Hi,

I have spent the afternoon trawling Google, Unix.com and Unix in a Nutshell for information on how awk arrays work, and I'm not really getting too far.

I have a batch of code that I am pretty sure can be better managed using awk, but I'm not sure how to use awk arrays to do what I'm trying to do efficiently.

I have used 'find' to produce a list of last access times (in epoch seconds) and file sizes (in bytes):
Code:
find . -path './.snapshot' -prune -o -type f -exec stat -c "%X %s" {} \;

1227265466 5108
1224230970 1685
1225974287 6502
1224237225 105532
1224239208 4125
1225974287 6552
1224240021 1066
1225974287 8399

I now want to sum the value in bytes for different date ranges:
Code:
now=$(date +%s)

# read the file by redirection rather than "cat file | while",
# so the totals are still set after the loop ends
while read asecs bytes
do
	# convert seconds to months (2592000 seconds = 30 days)
	monthsago=$(( (now - asecs) / 2592000 ))

	if [ $monthsago -gt 36 ] ; then

		col_4_bytes=$(( col_4_bytes + bytes ))
		col_4_count=$(( col_4_count + 1 ))

	elif [ $monthsago -gt 12 ] && [ $monthsago -le 36 ] ; then

		col_3_bytes=$(( col_3_bytes + bytes ))
		col_3_count=$(( col_3_count + 1 ))

	elif [ $monthsago -gt 1 ] && [ $monthsago -le 12 ] ; then

		col_2_bytes=$(( col_2_bytes + bytes ))
		col_2_count=$(( col_2_count + 1 ))

	elif [ $monthsago -eq 0 ] || [ $monthsago -eq 1 ] ; then

		col_1_bytes=$(( col_1_bytes + bytes ))
		col_1_count=$(( col_1_count + 1 ))
	else
		# should not be possible
		col_x_bytes=$(( col_x_bytes + bytes ))
		col_x_count=$(( col_x_count + 1 ))
	fi

done < asecs_bytes.txt

I'm pretty sure I could use awk to read my file and output the information in a much better manner. Any suggestions?

I have thought about reading the file in, then evaluating each asecs value to determine whether it is in a certain range - but I think awk arrays can be much smarter than that.

I'll also need to convert the bytes into a more logical value (GB/MB/KB) at some point, but can do this in a separate step if necessary.

For completeness, my output is going to be emailed to users of a file system to inform them of how much data they have on different disk areas in different date ranges, like this:
HTML Code:
<div class="tabletopBLUE" style="width:100%;">/full/toplevel/file/path/here/ &nbsp;&nbsp;&nbsp;[username]</div> 
<div style="background-color:White; color:White; line-height:4px; width:100%;">_</div>
<div class="outline" >
<table class="cboxTXT1" width="100%" border="0" align="center" cellpadding="0px" cellspacing="0px">
 <tr style="font-weight: bold;">
  <td width="200" id="Rowheader" colspan="1" align=left>sub-directory</td>
  <td width="150" id="Rowheader" colspan="2" align=center>0 - 1 month</td>
  <td width="150" id="Rowheader" colspan="2" align=center>1 - 12 months</td>
  <td width="150" id="Rowheader" colspan="2" align=center>1 - 3 years</td>
  <td width="150" id="Rowheader" colspan="2" align=center>3 years +</td>
 </tr>
 <tr> 
 <td align=right>/</td>
  <td align=right></td><td align=left></td> 
  <td align=right></td><td align=left></td> 
  <td align=right></td><td align=left></td> 
  <td align=right></td><td align=left></td> 
 </tr>
 <tr> 
 <td align=right>/example_one &nbsp;&nbsp;&nbsp; [username]</td>
  <td align=right>1602760</td><td align=left>1</td> 
  <td align=right></td><td align=left></td> 
  <td align=right>19141123</td><td align=left>72</td> 
  <td align=right></td><td align=left></td> 
 </tr>
 <tr> 
 <td align=right>/example_two &nbsp;&nbsp;&nbsp; [username]</td>
  <td align=right></td><td align=left></td> 
  <td align=right>666854</td><td align=left>3</td> 
  <td align=right>27799028</td><td align=left>67</td> 
  <td align=right></td><td align=left></td> 
 </tr>
 <tr> 
 <td align=right>/example_three &nbsp;&nbsp;&nbsp; [username]</td>
  <td align=right></td><td align=left></td> 
  <td align=right>485</td><td align=left>1</td> 
  <td align=right>249226085</td><td align=left>438</td> 
  <td align=right></td><td align=left></td> 
 </tr>
 <tr> 
 <td align=right>/example_four &nbsp;&nbsp;&nbsp; [username]</td>
  <td align=right></td><td align=left></td> 
  <td align=right>130095309</td><td align=left>1</td> 
  <td align=right>74821761</td><td align=left>18</td> 
  <td align=right></td><td align=left></td> 
 </tr>
 <tr> 
 <td align=right>/example_five &nbsp;&nbsp;&nbsp; [username]</td>
  <td align=right></td><td align=left></td> 
  <td align=right></td><td align=left></td> 
  <td align=right>2572753103</td><td align=left>73</td> 
  <td align=right></td><td align=left></td> 
 </tr>
   </table>
  </div>
 <div>
Cheers,
littleIdiot
# 2  
Old 02-03-2009
Sorry for the late reply, but you should understand that Thanksgiving time (end of November) is a good time for lots of us to not get on the computer....
Quote:
I'm pretty sure I could use awk to read my file and output the information in a much better manner. Any suggestions?
Yeah, though only slightly better:
Code:
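#!/usr/bin/awk -f
# Input lines: <atime in epoch seconds> <size in bytes>
# Run as: awk -v now="$(date +%s)" -f thisfile asecs_bytes.txt
{
   # months since last access (2592000 seconds = 30 days)
   monthsago = int((now - $1) / 2592000);

   # label each bucket by its upper bound in months
   if (monthsago <= 0)
     bucket=0;
   else if (monthsago <= 1)
     bucket=1;
   else if (monthsago <= 2)
     bucket=2;
   else if (monthsago <= 12)
     bucket=12;
   else if (monthsago <= 36)
     bucket=36;
   else
     bucket="inf";

   sum[bucket]+=$2;
   count[bucket]++;
}
END {
   # "for (x in array)" visits the keys in no particular order
   for (bucket in sum) {
      # bucket, total bytes, file count, average bytes per file
      print bucket,sum[bucket],count[bucket],sum[bucket]/count[bucket];
   }
}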
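
For the four ranges in the original question, a minimal sketch might look like the following. It assumes the input is the two-column access-time/size listing from the find command, that "now" is passed in as epoch seconds, and that a month is 30 days (2592000 seconds) as in the original loop. The function name bucket_report and the array names bytes and count are made up for illustration.

```shell
# Hypothetical helper: sum bytes and count files per age range.
# $1 = "now" in epoch seconds, $2 = file of "<atime-seconds> <bytes>" lines.
bucket_report() {
    awk -v now="$1" '
    {
        # 2592000 seconds = 30 days, the same "month" as in the question
        monthsago = int((now - $1) / 2592000)

        if      (monthsago <= 1)  col = 1   # 0 - 1 month
        else if (monthsago <= 12) col = 2   # 1 - 12 months
        else if (monthsago <= 36) col = 3   # 1 - 3 years
        else                      col = 4   # 3 years +

        bytes[col] += $2
        count[col]++
    }
    END {
        # Walk the columns in a fixed order, since "for (col in bytes)"
        # makes no promise about ordering. Empty columns print as 0.
        for (col = 1; col <= 4; col++)
            printf "%d %.0f %d\n", col, bytes[col], count[col]
    }' "$2"
}
```

Calling it as bucket_report "$(date +%s)" asecs_bytes.txt prints one line per column: column number, total bytes, file count. The point of the array is that the column number is just a key, so the four pairs of col_N_bytes/col_N_count variables collapse into two arrays.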

Doing kb,mb,gb, etc is similar:
Code:
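BEGIN {
  # units[1] is empty (plain bytes), units[2] is "kb", and so on
  split(",kb,mb,gb,tb,pb,xb",units,",");
}

END {
  # val holds the byte total to convert, e.g. sum[bucket]
  magnitude=1;
  # stop at the last unit so the index never runs off the array
  while (val >= 1024 && magnitude < 7) {
       val/=1024;
       magnitude++;
   }
   print val,units[magnitude];
}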
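
As a self-contained illustration of that loop, here is a rough sketch wrapped in a shell function. The name human_size and the one-decimal output format are my own choices, not anything from the posts above, and it assumes the argument is a plain non-negative byte count.

```shell
# Hypothetical helper: print a byte count ($1) with a unit suffix.
human_size() {
    awk -v val="$1" '
    BEGIN {
        # units[1] is empty (plain bytes), units[2] is "kb", and so on
        n = split(",kb,mb,gb,tb,pb,eb", units, ",")
        magnitude = 1
        # divide by 1024 until the value fits, but never run past the last unit
        while (val >= 1024 && magnitude < n) {
            val /= 1024
            magnitude++
        }
        printf "%.1f%s\n", val, units[magnitude]
    }'
}
```

For example, human_size 2572753103 prints 2.4gb, and human_size 485 prints 485.0 with no suffix because units[1] is the empty string.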

Hope that helps.

Last edited by otheus; 02-03-2009 at 05:21 AM..