Detailed disk usage versus age summary


 
# 1  
Old 10-15-2008

Hi,

I'm posting my question here as I feel that what I am about to try to do must have been done already, and I don't want to re-invent the wheel.

I have recently become responsible for monitoring disk space usage for a large file system.

I would like to generate reports that summarise the amount of disk space used by directories at a certain level, grouped into date ranges.

e.g. results

Code:
Last modified : file path          : total
 0 - 1 months : /foo/foo_01/bar_01 : 101 GB
                /foo/foo_01/bar_02 :  98 GB
                /foo/foo_02/bar_03 : 202 GB
                /bar/bar_01/etc    : 203 GB
 1 - 6 months : /foo/foo_01/bar_04 : 405 GB
                /bar/bar_02/etc    : 203 GB
                /bar/bar_03/etc    : 203 GB
6 - 12 months : /bar/bar_03/tmp    :  20 GB
                /bar/bar_01/tmp    :  22 GB
12 months +   : /bar/bar_02/tmp    : 203 GB

I hope that gives some idea of what I am trying to achieve. Basically, I want to highlight large areas of the filesystem that can be archived off because they have not been accessed for some time.

If anyone can point me towards any scripts already written that would do this or something I can modify to do it I would appreciate it.

At the moment I am looking at starting from scratch, which I'd enjoy, but it will take some time.

I can not install any software - it must be script based.

Thanks for any tips/advice!
# 2  
Old 10-17-2008
So normally you can just use "du -sh $DIR" to get the summary information. The trick is figuring out which ones you want to sum. What does "Last modified" really mean? Does it refer to the directory itself (which means any change to any filename)? Or does it mean a file in that directory? If so, does it mean the oldest modified or newest modified?
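
As a rough sketch of the du approach, the function below lists each immediate subdirectory of a top-level directory with its size, largest first. The name largest_dirs is made up for illustration, and -k (kilobytes) is used instead of -h so the output sorts numerically.

```shell
#!/bin/sh
# largest_dirs TOP : size in KB of each immediate subdirectory of TOP, largest first.
# (Hypothetical helper name; du -s and -k are standard options.)
largest_dirs() {
    du -sk "$1"/*/ 2>/dev/null | sort -rn
}

# Example: largest_dirs /foo
```

This only answers "how big is it", not "how old is it", which is why the age has to come from somewhere else.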
# 3  
Old 10-17-2008
Thanks for your reply. I've worked on this a fair bit yesterday and got a lot further than I thought I would.

I am using the find command to look through all files and directories, look at the modified time (-mtime) and report back all files that are modified between set time frames ... so last week, 1 to 4 weeks ago, 1 to 6 months ago etc ... all the way up to over a year ago.

Code:
find . -path './.snapshot' -prune -o  -type f -mtime -8 -ls

I have then piped the output to awk to sum the number of bytes and the number of files:
Code:
 | awk '{bytes += $7; count++} END {print bytes, count}'

I am running this as 2 loops - so that I get all the subdirectories in the top level as supplied at the command line.
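
A minimal sketch of how the two loops might fit together, assuming the find and awk pieces described above. The function names, the bucket labels and the exact day boundaries are illustrative assumptions rather than the script actually in use. The sum is printed with printf "%.0f" because some awk implementations switch to exponent notation for very large totals.

```shell
#!/bin/sh
# Sketch: outer loop over subdirectories, inner loop over age buckets.
# Bucket boundaries (in days) are assumptions chosen so that no file falls in a gap.

# sum_bucket DIR FIND_TESTS... : print "bytes count" for regular files under DIR
sum_bucket() {
    dir=$1
    shift
    find "$dir" -path '*/.snapshot' -prune -o -type f "$@" -ls 2>/dev/null |
        awk '{bytes += $7; count++} END {printf "%.0f %d\n", bytes, count}'
}

# age_report TOP : bucket totals for each immediate subdirectory of TOP
age_report() {
    for sub in "$1"/*/; do
        [ -d "$sub" ] || continue
        printf '%s\n' "${sub%/}"
        while read -r label tests; do
            # $tests is left unquoted on purpose so it splits into find arguments
            printf '  %-12s %s\n' "$label" "$(sum_bucket "$sub" $tests)"
        done <<'EOF'
0-7d      -mtime -8
8-30d     -mtime +7 -mtime -31
1-6m      -mtime +30 -mtime -181
6-12m     -mtime +180 -mtime -366
1y+       -mtime +365
EOF
    done
}

# Example: age_report /foo
```

Each subdirectory is walked once per bucket here, which is simple but slow on a large file system; a single pass that prints each file's age and size would be the obvious refinement.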

I am passing all output through several echo commands to output in html format so I can put the output in a table.

Sample output so far:

HTML Code:
<table width="960" border="1" bordercolor="gray" align="center" cellpadding="0" cellspacing="0">
<tr align="center"><td align="left" width="160">foobar</td>
<td colspan="2" width="160">0 - 7 days</td>
<td colspan="2" width="160">1 - 4 weeks</td>
<td colspan="2" width="160">1 - 6 months</td>
<td colspan="2" width="160">7 - 12 months</td>
<td colspan="2" width="135">1 year +</td></tr>
<tr align="right"><td align="right">/basecase</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>187307</td><td>2</td> <td>&nbsp;</td><td>&nbsp;</td> <td>160477762</td><td>132</td> </tr>
<tr align="right"><td align="right">/cbabble</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>120476297</td><td>82</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> </tr>
<tr align="right"><td align="right">/hi_STOIIP</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>5561429</td><td>15</td> </tr>
<tr align="right"><td align="right">/libra</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>30312</td><td>18</td> </tr>
<tr align="right"><td align="right">/lowestcase</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>17828811</td><td>26</td> </tr>
<tr align="right"><td align="right">/region</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>108363878</td><td>105</td> </tr>
<tr align="right"><td align="right">/with_XYZ</td>
<td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>&nbsp;</td><td>&nbsp;</td> <td>35384975</td><td>43</td> </tr>
</table>
I will have a table as above for each directory in the file path supplied at the command line - one table after another.
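
For illustration, one way the echo-style HTML output could be wrapped up is a small function that turns a directory name and a list of "bytes count" pairs into one table row. The function name html_row is hypothetical, and directory names are not HTML-escaped in this sketch.

```shell
#!/bin/sh
# html_row NAME "BYTES COUNT"... : emit one table row in the layout shown above.
# Buckets with a zero file count become empty (&nbsp;) cells.
html_row() {
    printf '<tr align="right"><td align="right">%s</td>\n' "$1"
    shift
    for pair in "$@"; do
        bytes=${pair% *}    # text before the space
        count=${pair#* }    # text after the space
        if [ "$count" -eq 0 ]; then
            printf '<td>&nbsp;</td><td>&nbsp;</td> '
        else
            printf '<td>%s</td><td>%s</td> ' "$bytes" "$count"
        fi
    done
    printf '</tr>\n'
}

# Example: html_row /basecase "0 0" "0 0" "187307 2" "0 0" "160477762 132"
```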

It allows me to see which directories have not been modified for a long time - so in the above example I could possibly archive off the last 5 directories as they have not been modified in the past year.

Does that make sense?

I wonder if my find command is good enough?
Is -mtime reliable?
Should I use -atime?

(Sorry for the wide page!)
# 4  
Old 10-17-2008
Uh, that's what I was going to do, except I would use -printf "%s %p\n" (GNU find) instead of -ls. Use mtime. atime is the access time, which you don't really want, do you? Maybe you do... maybe you want to know when the file was last used, not just modified.
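
A sketch of the -printf variant, assuming GNU find (-printf is not available in every find). Each line starts with the size, so awk sums field 1 and does not depend on the column layout of -ls. The function name sum_printf is made up, and file names containing newlines would throw the count off.

```shell
#!/bin/sh
# sum_printf DIR AGE : print "bytes count" for regular files under DIR whose
# mtime matches AGE (for example -8 or +365). Requires GNU find for -printf.
sum_printf() {
    find "$1" -path '*/.snapshot' -prune -o -type f -mtime "$2" -printf '%s %p\n' |
        awk '{bytes += $1; count++} END {printf "%.0f %d\n", bytes, count}'
}

# Example: sum_printf /foo/foo_01 +365
```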
# 5  
Old 10-17-2008
Hmm - I'm not sure. I'm looking at a large file system with many users. They use the system for generating files, but some users also just access files that are used by different software packages to run large processing jobs.

So I imagine that there will be some files that are accessed but not modified - i.e. read only. I was worried about using the -atime as I think find itself changes the -atime by looking at it - doesn't it? I thought I saw that on a man page, but can't see it just now.
# 6  
Old 10-17-2008
No, find with -atime doesn't change a file's access time, because all it does is a "stat" of the file. (But find will change the atime of any directory it reads.)
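
A quick way to check this, sketched with a throwaway file: give the file an old access time, run the same find twice, and see that it still matches the second time. If the first find had updated the atime, the second run would come back empty. The file name is made up.

```shell
#!/bin/sh
# Check that find -atime does not itself update a file's access time.
demo=$(mktemp -d)
printf 'data\n' > "$demo/report.dat"
touch -a -t 200001010000 "$demo/report.dat"   # make the access time old, leave mtime alone

first=$(find "$demo" -type f -atime +365)
second=$(find "$demo" -type f -atime +365)    # still matches if atime was left untouched
printf 'first run:  %s\nsecond run: %s\n' "$first" "$second"
rm -rf "$demo"
```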
# 7  
Old 10-17-2008
Thanks for your help on this - much appreciated! Maybe I'll post my final script here (is that the done thing/possible?)

So as long as I restrict my find command to files only, maybe I should be using -atime to pick out when files were last accessed rather than last modified.

I assume files will never have a 'younger' modified time than accessed time?
Logic would tell me no, but my logic and unix logic are not always compatible ;o)
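
On that last question, a modified time can in fact be newer than the access time, because writing to a file updates mtime without touching atime. A small sketch with a made-up log file that is appended to but never read:

```shell
#!/bin/sh
# Appending updates mtime but leaves atime alone, so mtime can be newer than atime.
demo=$(mktemp -d)
printf 'first line\n' > "$demo/app.log"
touch -t 200001010000 "$demo/app.log"        # set both atime and mtime to 1 Jan 2000
printf 'second line\n' >> "$demo/app.log"    # write only: mtime becomes "now"

# Old by access time, yet modified within the last day:
stale=$(find "$demo" -type f -atime +365 -mtime -1)
printf 'old atime, new mtime: %s\n' "$stale"
rm -rf "$demo"
```

So a report based only on -atime could flag a file as stale even though it is still being written to; checking both times is the safer option.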

Also, I'm running this as root but still getting "Permission denied" errors. I've had this before - something to do with my root access only being a semi-root access, via LRAM. I think it is down to the group permissions of my user.

I'll need to catch these errors somehow.
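
One simple way to catch them, as a sketch: send find's error output to a log file instead of the terminal, so the totals still come out and the unreadable paths can be reviewed afterwards. The function name scan_logged and the log file path are made up.

```shell
#!/bin/sh
# scan_logged DIR ERRLOG : print "bytes count" for regular files under DIR.
# Anything find writes to stderr (e.g. "Permission denied") is appended to ERRLOG.
scan_logged() {
    find "$1" -type f -ls 2>>"$2" |
        awk '{bytes += $7; count++} END {printf "%.0f %d\n", bytes, count}'
}

# Example:
#   scan_logged /foo /tmp/find_errors.log
#   grep -c 'Permission denied' /tmp/find_errors.log   # how many paths were skipped
```

The totals for a directory are then an undercount by whatever sits under the skipped paths, so the log is worth reporting alongside the table.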