Two questions on find with rm command


 
# 1  
Old 06-26-2015

Version: Oracle Linux 6.4

In the directory below, we had 1.6 million audit log files with the extension .aud that are older than 20 days.

I ran find with rm as shown below, but I had to cancel it after 4 hours because I don't want a long-running command executing during peak hours.

find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +20 -exec rm -f {} \;

Question 1. Is there a way I could tweak this command to run faster?

This server has no shortage of CPUs. It has 40 CPU cores (80 logical CPUs with Hyper-Threading) and 256G of RAM.

Later I realized that this directory has files that are older than 500 days. So I should have done this in chunks, 100 days at a time, as shown below.

Code:
find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +500 -exec rm -f {} \;
find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +400 -exec rm -f {} \;
find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +300 -exec rm -f {} \;
find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +200 -exec rm -f {} \;
find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +100 -exec rm -f {} \;
find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +20 -exec rm -f {} \;

So, I ran the following command to delete files older than 500 days. But it still took 32 minutes to delete 76,000 files !!

Code:
$ find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +500 | wc -l
76099
$ find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +500 -exec rm -f {} \;

So, I would like to know if there is a way I could tweak this command to run faster.

Question 2.

find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +20 -exec rm -f {} \;

While the above find and rm command was running, I tried to run an ls command on this directory from another session to see how many files remained. But I got the error below.

Code:
$ pwd
/u01/product/11.2.0/rdbms/audit
$ ls -alrdt *.aud | wc -l
-bash: /bin/ls: Argument list too long
0

Is there a way I could see the progress of this command with something like a progress bar?

Last edited by John K; 06-26-2015 at 11:54 AM..
# 2  
Old 06-26-2015
Hello John,

There are 2 points here.
1st: You can use the following command, which will be faster than your current one.
Code:
 find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +20 -exec rm -rf {} +

2nd: If you want to see the progress of the command, that is, which files have been deleted, you can use the following command. It will be slower than the one above, because the while loop runs rm once per file.
Code:
 find /u01/product/11.2.0/rdbms/audit -name '*.aud' -mtime +20 | while IFS= read -r file; do if rm -rf -- "$file"; then echo "$file has been deleted"; else echo "$file has NOT been deleted"; fi; done
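
If a rough idea of progress is enough, counting what is left from a second session may do. The following is only a sketch: it assumes a find that supports -maxdepth (GNU find does), `count_matching` is a hypothetical helper name, and the demo uses a scratch directory rather than the real audit path. Because the pattern is quoted, the shell never expands it, so no long argument list is built and the "Argument list too long" error does not arise.

```shell
# Hypothetical helper: count regular files in directory $1 matching pattern $2.
# find prints one name per line; no file name is ever placed on a command line.
count_matching() {
    find "$1" -maxdepth 1 -type f -name "$2" | wc -l
}

# Demo in a scratch directory (not the real audit directory).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.aud" "$demo_dir/b.aud" "$demo_dir/c.aud" "$demo_dir/keep.log"

remaining=$(count_matching "$demo_dir" '*.aud')
echo "remaining .aud files: $remaining"

rm -rf "$demo_dir"
```

Run against the real directory every few minutes, the number should keep dropping while the delete is in progress. A file name containing a newline would be counted more than once, which is unlikely for audit files.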

Thanks,
R. Singh
# 3  
Old 06-26-2015
You will benefit most from the + form, since rm is then executed once per group of files that find collects, as opposed to \; which runs it once per file.

Linux find also has a -delete action.
You might check the performance of that. It should be faster, since no external program (rm, in your case) has to be executed.
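
A minimal sketch of the -delete variant, assuming GNU find and GNU touch (for `-d`); the scratch directory and file names are made up for the demo. Because -print comes after -delete, a name is printed only once its file has actually been removed, and awk condenses that stream into occasional progress lines.

```shell
# Scratch directory with 25 "old" and 5 "new" .aud files (GNU touch -d assumed).
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 25 ]; do
    touch -d '30 days ago' "$demo_dir/old_$i.aud"
    i=$((i + 1))
done
i=1
while [ "$i" -le 5 ]; do
    touch "$demo_dir/new_$i.aud"
    i=$((i + 1))
done

# -delete removes each match inside find itself; no rm process is started.
# -print runs only if -delete succeeded; awk reports every 10th deletion.
find "$demo_dir" -name '*.aud' -mtime +20 -delete -print |
    awk 'NR % 10 == 0 { print NR " files deleted so far" }
         END          { print NR " files deleted in total" }'

left=$(find "$demo_dir" -name '*.aud' | wc -l)
echo "left: $left"
rm -rf "$demo_dir"
```

Running the same find with -print alone (no -delete) first is a cheap dry run that shows what would be removed.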
# 4  
Old 06-26-2015
Thank You Ravinder, Peasant

Following is from the man page of find in Oracle Linux 6.4

Code:
 -exec command {} +
              This  variant  of the -exec action runs the specified command on
              the selected files, but the command line is built  by  appending
              each  selected file name at the end; the total number of invoca-
              tions of the command will  be  much  less  than  the  number  of
              matched  files.   The command line is built in much the same way
              that xargs builds its command lines.  Only one instance of  '{}'
              is  allowed  within the command.  The command is executed in the
              starting directory.

This is what I understand from the above paragraph of find's man page.

When you use -exec rm -rf {} \; , the rm command is executed once for each file (making it slower),
and when you use -exec rm -rf {} + , the rm command is executed only once per batch of files (although the man page does not say how large a batch is).
Is my assumption right?
# 5  
Old 06-26-2015
Hello John,

-exec ... \; runs the command once per item, so with three files the command runs three times. -exec ... {} + is for commands that can take more than one file at a time (e.g. cat, stat, ls). The files found by find are chained together as with an xargs command. This means fewer forks, and for small operations it can mean a substantial speedup.
Code:
$ mkdir testdir
$ touch testdir/{0000..9999}
$ time find testdir/ -type f -exec cat {} \;
real    0m8.622s
user    0m0.452s
sys     0m8.288s
$ time find testdir/ -type f -exec cat {} +
real    0m0.052s
user    0m0.015s
sys     0m0.037s
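
For comparison, the xargs form that the + variant resembles can be sketched as follows. It assumes GNU find and xargs for -print0 and -0, which keep file names containing spaces or newlines intact; the scratch directory and file names are only for illustration.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/one.aud" "$demo_dir/two words.aud" "$demo_dir/three.aud"

# find emits NUL-terminated names; xargs packs as many as fit onto each
# rm command line, so rm is started once per batch rather than once per file.
find "$demo_dir" -type f -name '*.aud' -print0 | xargs -0 rm -f --

after=$(find "$demo_dir" -type f | wc -l)
echo "files left: $after"
rmdir "$demo_dir"
```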

Thanks,
R. Singh
# 6  
Old 06-26-2015
Quote:
Originally Posted by John K
When you use -exec rm -rf {} \; , the rm command is executed once for each file (making it slower),
and when you use -exec rm -rf {} + , the rm command is executed only once per batch of files (although the man page does not say how large a batch is).

This is basically correct. Notice that you can delete several files at once because rm takes not a single file name but a list of files as its arguments. Suppose you have 4 files, "a", "b", "c" and "d"; you could use:

Code:
rm -f a b c d

and have them deleted in one call of rm. This is why a call like
Code:
rm -f *

works: the shell expands "*" to such a list of files before rm is even called, and rm happily takes it.

On the other hand, command lines have a limited length, and the aforementioned "*" might make the command fail once it expands to too many file names. The limit applies to any external command, because the kernel caps the total size of the arguments (plus environment) passed to a new program. You may want to try this (in a non-destructive way): execute

Code:
ls *

in the directory with your 1.6 million files and you will most likely see an "Argument list too long" error. The same would happen with rm for the same reason.

So this is why building such a list with find and then feeding it to a program in one go (regardless of whether that program is rm or something else) is a bad idea. This is why the xargs command was developed, and for the same reason there is the "+" device in find. Both are designed to cut a big, unmanageable list into smaller pieces and feed these pieces to a program, one at a time.
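
How many pieces the list is cut into can be observed directly. In this sketch, which assumes a POSIX sh and find, a small inline script stands in for rm and prints how many file names each invocation received; the scratch directory and the count of 500 files are arbitrary choices for the demo.

```shell
demo_dir=$(mktemp -d)
i=0
while [ "$i" -lt 500 ]; do
    : > "$demo_dir/file_$i"
    i=$((i + 1))
done

# Upper bound, in bytes, on arguments plus environment for one exec call.
echo "ARG_MAX: $(getconf ARG_MAX)"

# The inline script prints "$#" (the number of names it was given),
# so wc -l counts one output line per invocation.
calls_plus=$(find "$demo_dir" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l)
calls_semi=$(find "$demo_dir" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l)

echo "invocations with the plus form:      $calls_plus"
echo "invocations with the semicolon form: $calls_semi"
rm -rf "$demo_dir"
```

Dropping the `| wc -l` from the plus form shows the size of each batch instead. How many names fit into one batch depends on the lengths of the names and on the limits of the system and the find implementation, so the numbers will differ between machines.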

I hope this helps.

bakunin
# 7  
Old 06-26-2015
Thank You very much Bakunin, Ravinder

Question 3.

I ran an ls command from within find using the \; and + variants as shown below.
Both seem to return the same results. If the \; variant is slow, then why do people even use it?
Code:
[root@emeatst179 test3]# ls -alrdt
drwxr-xr-x. 2 root root 20480 Jun 26 13:04 .
[root@emeatst179 test3]#
[root@emeatst179 test3]# touch {0..9}
[root@emeatst179 test3]#
[root@emeatst179 test3]# find /tmp/test3 -exec ls -alrdt {} \; | wc -l
11
[root@emeatst179 test3]# find /tmp/test3 -exec ls -alrdt {} + | wc -l
11
[root@emeatst179 test3]# find /tmp/test3 -exec ls -alrdt {} +
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/9
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/8
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/7
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/6
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/5
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/4
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/3
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/2
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/1
-rw-r--r--. 1 root root     0 Jun 26 13:04 /tmp/test3/0
drwxr-xr-x. 2 root root 20480 Jun 26 13:04 /tmp/test3
[root@emeatst179 test3]#
[root@emeatst179 test3]#
[root@emeatst179 test3]# find /tmp/test3 -exec ls -alrdt {} \;
drwxr-xr-x. 2 root root 20480 Jun 26 13:04 /tmp/test3
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/3
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/5
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/8
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/2
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/1
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/0
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/9
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/4
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/7
-rw-r--r--. 1 root root 0 Jun 26 13:04 /tmp/test3/6
[root@emeatst179 test3]#

Question 4.
Ravinder's quick test shows that the \; variant takes 8 seconds and the + variant takes less than a second. A significant improvement.

So, let me rephrase my interpretation of the excerpt from find's man page which I pasted above.

When we use -exec rm -rf {} \; , the rm command is executed once for each file.
When we use -exec rm -rf {} + , the rm command processes several files in each execution.

For example: suppose a directory contains 5 files named a, b, c, d and e.

The -exec rm -rf {} \; variant will execute rm 5 times.
So, internally it will be executing something like below
Code:
-exec rm -rf {a} \;
-exec rm -rf {b} \;
-exec rm -rf {c} \;
-exec rm -rf {d} \;
-exec rm -rf {e} \;

The -exec rm -rf {} + variant will execute rm maybe once for every 5 files. So, internally it will be executing something like below

-exec rm -rf {a,b,c,d,e} +

Regarding how often the rm command is executed with the -exec rm -rf {} + variant, the documentation is not very clear. It just says "the total number of invocations of the command will be much less than the number of matched files", as shown above.

Are my above assumptions correct?

Last edited by John K; 06-26-2015 at 12:20 PM..
 