script to check large directory--help


 
# 1  
Old 04-14-2011
script to check large directory--help

All,
I have a script which gives me the output of "percentage of filesystem utilization". We have four filesystems that I want to check, and I want to get a mail when utilization is more than 40%. Below are the filesystems.
Code:
/AB/Filesy1 
/AB/Filesy2
/AB/Filesy3
/AB/Filesy4

The script below is working fine and I am getting mail as expected.

Code:
#!/bin/bash
# Note: this relies on df's 4th field being the Use% value and the 5th
# being the mount point, which is how df prints on this system.
df -k /AB/* | grep '%' | awk '{print $4,$5}' | sed 's/%//g' | while read OP
do
    echo $OP
    uses=$(echo $OP | awk '{print $1}')         # utilization percentage
    utilization=$(echo $OP | awk '{print $2}')  # mount point
    if [ $uses -ge 40 ]; then
        echo "Running out of space \"$utilization ($uses%)\" on $(hostname) as on $(date)" |
            mail -s "Alert: Almost out of disk space $uses%" abc@xyz.com
    fi
done
Now, using the same script, I want information about the large directories on any filesystem whose utilization is more than 40%. That is, suppose /AB/Filesy1 is more than 40% utilized: in the same mail I also want to see the list of directories occupying the most space under /AB/Filesy1, e.g. by using du -ks /AB/Filesy1/* | sort -n.
Could anyone please help me edit the script so that the same mail carries both the filesystem utilization details and the list of big directories?

Last edited by Yogesh Sawant; 04-17-2011 at 06:43 AM.. Reason: added code tags
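
For reference, here is one way to combine the two requests while staying close to the script above (a minimal sketch: it assumes the same df field layout that script relies on, i.e. field 4 = Use% and field 5 = mount point, and it lists the ten largest entries first rather than sorting ascending):
Code:
#!/bin/bash
# Sketch only: the field positions ($4 = Use%, $5 = mount point) are an
# assumption carried over from the original script.
df -k /AB/* | grep '[0-9]%' | awk '{print $4,$5}' | sed 's/%//g' | while read uses fs
do
  if [ $uses -ge 40 ]; then
    {
      echo "Running out of space \"$fs ($uses%)\" on $(hostname) as on $(date)"
      echo
      echo "Largest directories under $fs:"
      du -ks $fs/* 2>/dev/null | sort -nr | head -10
    } | mail -s "Alert: Almost out of disk space $uses%" abc@xyz.com
  fi
done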
# 2  
Old 04-14-2011
Your processing might be simpler if you interrogate each file system one at a time: "for fs in /AB/* . . . .", as then you can grep -q for the % values you dislike. For finding biggies, I like a mixed report of big dir/ entries and big files. You can get du to do most of it, and in ksh on /dev/fd/# systems, or in bash, this runs in pipeline parallel, with an approximately 100 KB cutoff implemented both in the sed pattern and in the find size option. You need a few more options if you want to keep find and du on one filesystem, but you can read the man pages, too:
Code:
#!/usr/bin/bash

# Merge (sort -m) two already-sorted streams, largest first:
# directory totals from du, and individual files over 100 KB from find.
sort -nrm <(
  # sed drops directories whose du -k size has only 1-2 digits
  # (i.e. under ~100 KB) and appends / so directories stand out
  du -k $1 | sed '
    /^[0-9]\{1,2\}[^0-9]/d
    s/$/\//
   ' | sort -nr
 ) <(
  find $1 -type f -size +102400c | xargs -n999 du -k | sort -nr
 )
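
The per-filesystem driver loop suggested above might look like this (a sketch: it assumes the report script is saved as bigreport.sh, a name made up here, and that the grep pattern flags anything from 40% to 100% used):
Code:
for fs in /AB/*
do
  # -q: we only care whether an objectionable Use% value appears
  if df -k $fs | grep -Eq '(100|[4-9][0-9])%' ; then
    bigreport.sh $fs | mail -s "Alert: $fs filling up" abc@xyz.com
  fi
done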

# 3  
Old 04-14-2011
Code:
#!/bin/bash
big=100M ; mount=/AB/
# Here fields 4 and 5 are assumed to be Available and Use% (GNU df layout),
# so $uses ends up holding the utilization percentage.
df -k $mount | grep '[0-9]%' | awk '{print $4,$5}' | sed 's/%//g' | while read utilization uses
 do
  if [ $uses -ge 30 ]; then
   echo -e "Big Files More Than $big\n" >results
   find $mount -size +$big -exec ls -lh {} \; >>results
   mail -s "Alert: Almost out of disk space $uses%" abc@xyz.com <results
  fi
 done

# 4  
Old 04-14-2011
"find ... -exec xxx" is far less scalable than "find ...|xargs -n999 xxx"

From a functional point of view, you might send the summary every time, but then break out the biggies. You might even send a separate email for each file system that is over.
# 5  
Old 04-14-2011
Why -n999 anyway? Doesn't xargs know the maximum argument size for the system?
# 6  
Old 04-14-2011
xargs has many advantages, but it is not always the best solution; it also has some functional limits.

For example, xargs may have problems with files that contain embedded spaces. To handle those you must add this to your script:
Code:
-print0 | xargs -0

However, xargs does not support the -0 option on Solaris (GNU find and xargs do).
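
With GNU find and xargs, the null-delimited version of the earlier pipeline would be something like (a sketch; the path and size cutoff are just examples):
Code:
find /AB/Filesy1 -type f -size +102400c -print0 | xargs -0 du -k | sort -nr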

If you are thinking of the argument-list limit with exec: as far as I know, Linus worked on this issue (especially in exec.c, mm.h, and other related files) and removed the fixed ARG_MAX as of 2.6.23 (and 2.6.23-rc1),
so the total size of the argument list is now limited to 1/4 of the allowed stack size.
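
On such a kernel you can check this with getconf; the figures below are illustrative for an 8 MB stack limit (8192 KB * 1024 / 4 = 2097152 bytes):
Code:
$ ulimit -s
8192
$ getconf ARG_MAX
2097152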

However, in your script I would not use -n999, as Corona688 said. Maybe we should use fewer than 999 on some systems that have argv + envp limits.

But this functional point is debatable across the Unix and Linux variants (and on architectures with no MMU).

I might prefer shell internals like
Code:
for f in `find ..` ; do .. ; done

Although xargs is always much faster than exec :)

regards
ygemici
# 7  
Old 04-14-2011
Quote:
Originally Posted by ygemici
xargs has many advantages, but it is not always the best solution; it also has some functional limits.

For example, xargs may have problems with files that contain embedded spaces.
Easily solved with -d '\n' for the most part.
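
That is, with GNU xargs, something like (path and size cutoff illustrative):
Code:
find /AB/Filesy1 -type f -size +102400c | xargs -d '\n' du -k | sort -nr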
Quote:
However, xargs does not support the -0 option on Solaris (GNU find and xargs do).
It does have -d though.
Quote:
If you are thinking of the argument-list limit with exec: as far as I know, Linus worked on this issue (especially in exec.c, mm.h, and other related files) and removed the fixed ARG_MAX as of 2.6.23 (and 2.6.23-rc1),
so the total size of the argument list is now limited to 1/4 of the allowed stack size.
Oh, goodie. 300 miles more rope to hang ourselves with. :)
Quote:
However, in your script I would not use -n999, as Corona688 said.
Maybe we should use fewer than 999 on some systems that have argv + envp limits.
I think you missed my point -- xargs would know the maximum size of args for the system already and split accordingly.

Code:
$ cat >argc.c <<EOF
> #include <stdio.h>
> int main(int argc, char *argv[])
> { printf("argc=%d\n", argc); return(0); }
> EOF
$ gcc argc.c
$ while true ; do echo -e "a\na\na\na\na\na\na\na" ; done | xargs ./a.out
argc=65533
argc=65533
argc=65533
argc=65533
argc=65533
^C
$

...so the -n999 is redundant.

Quote:
I might prefer shell internals like
Code:
for f in `find ..` ; do .. ; done

Shoving too many args into backticks and a for loop doesn't make them stop being too many args. You have to do while read FILENAME ; do stuff ; done instead.
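
For example, a sketch along the lines of the earlier pipelines, which never builds one giant argument list:
Code:
find /AB/Filesy1 -type f -size +102400c | while read FILENAME
do
  # one du per file -- slower than xargs, but immune to argument limits
  du -k "$FILENAME"
done | sort -nr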

Last edited by Corona688; 04-14-2011 at 05:20 PM..