Conditional delete


 
# 1  
Old 08-28-2018

Hi Friends,

I have some files like
Code:
20180720_1812.tar.gz
20180720_1912.tar.gz 
20180720_2012.tar.gz 
20180720_2112.tar.gz 
20180721_0012.tar.gz 
20180721_0112.tar.gz 
20180721_0212.tar.gz  
20180721_0312.tar.gz

in a directory, and so on. These files get created every 3 hours; the first part of the file name is the date and the second part is the time.

However, as they occupy more and more disk space, old files have to be deleted, which we currently do manually, so that only the last 30 days of backup files remain. Any file older than that should be deleted.

However, even after clearing files older than 30 days, a lot of the disk is still filled. The idea now is to keep only one file for any given day, the one with the latest timestamp on that day, and delete the rest of that day's files. That way I retain one file per day and free up some space.

For example, for 2018-07-20 I can retain 20180720_2112.tar.gz (assuming this is the last backup for that day) and delete the other 3 files. That way I will have at least one backup file per day and free up some space by deleting the other copies for that day.

Code:
20180720_1812.tar.gz
20180720_1912.tar.gz 
20180720_2012.tar.gz 
20180720_2112.tar.gz

Any idea how I can do this conditionally? I'd appreciate any help.
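For the 30-day retention described above, one possible sketch compares the date part of each file name against a cutoff date. The helper name list_older_than is hypothetical (it is not from this thread), and the usage lines assume GNU date for the -d '30 days ago' syntax. It only echoes the rm commands, so nothing is deleted until the echo is removed.

```shell
#!/bin/sh
# list_older_than CUTOFF
# Read file names on stdin (one per line) and print those whose date part
# (the YYYYMMDD before the "_") is earlier than CUTOFF.
# Hypothetical helper name; CUTOFF is expected in YYYYMMDD form.
list_older_than() {
    awk -F_ -v cutoff="$1" \
        'length($1) == 8 && $1 ~ /^[0-9]+$/ && $1 < cutoff'
}

# Usage (assumes GNU date):
#   cutoff=$(date -d '30 days ago' +%Y%m%d)
#   ls | list_older_than "$cutoff" | while read -r f; do echo rm "$f"; done
```

Because the date is taken from the file name rather than the modification time, the result does not change if a file is copied or touched later.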




Moderator's Comments:
Please use CODE tags as required by forum rules!

Last edited by RudiC; 08-28-2018 at 10:10 AM.. Reason: Added CODE tags.
# 2  
Old 08-28-2018
How about


Code:
ls 2018* | sort -ur -k1,1 -t_ | cut -d_ -f1 | while read TS; do echo rm $(ls -r $TS* | tail -n +2); done
rm 20180721_0212.tar.gz 20180721_0112.tar.gz 20180721_0012.tar.gz
rm 20180720_2012.tar.gz 20180720_1912.tar.gz 20180720_1812.tar.gz


Remove the echo when happy with the proposed result.
This User Gave Thanks to RudiC For This Post:
# 3  
Old 08-29-2018
Thank you RudiC, let me try. My apologies for not including CODE tags.






Hi RudiC,

It's working fine, thank you. Now I have a slightly more complicated requirement.

We have a mount point called /backup (300 GB) under which these files are placed every 3 hours.

If disk usage on /backup crosses 60% (for example, if 90% of that 300 GB is filled), our logic should delete files, starting from the oldest date, and stop at the point where usage has been brought back down to 60%.

For example, if the /backup mount reaches 90%, it has to be brought down to 60%, which is the threshold. Now let us assume the files range from
Code:
20180701_0112.tar.gz...to 20180831_2112.tar.gz..

The solution should start from 2018-07-01 and delete that day's files except the one with the last timestamp (your proposed solution already does this). It should then continue day by day only as far as needed, for example up to 2018-07-03; if that range is enough to bring disk usage down to 60%, our logic should exit. The next time disk usage goes beyond 60% and we run the command, it should start from
Code:
2018-07-04

as

Code:
2018-07-01
2018-07-02
2018-07-03

already have only one file per day.

Sorry if my explanation is unclear or the requirement is complicated. Please help if possible. The idea is not to delete files for all dates, but to limit deletion to the point where disk usage comes down to 60%.

Last edited by onenessboy; 08-29-2018 at 02:15 AM..
# 4  
Old 08-29-2018
I would be tempted to try a slightly simpler pipeline for what RudiC suggested:
Code:
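ls -1 2[0-9][0-9][0-9][01][0-9][0-3][0-9]_[0-2][0-9][0-5][0-9].tar.gz |
awk -F_ '
# Same date as the previous name: the previous file is not the last
# one for that day, so print a command that removes it.
$1 == last {
	print "echo rm " file
	file = $0
	next
}
# First name seen for a new date: remember the date and the name.
{	last = $1
	file = $0
}' | sh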

If the above prints a list of the rm commands you want to run, remove the echo from the script and run it again.

Note that you should also tell us what operating system and shell you're using when you start a thread in the Shell Programming and Scripting forum so we don't suggest things that can't work in your environment. If you are using a Solaris/SunOS system, change awk in the above script to /usr/xpg4/bin/awk or nawk.

If I create the files you named in post #1 in a directory and run the above script in that same directory, the output produced is:
Code:
rm 20180720_1812.tar.gz
rm 20180720_1912.tar.gz
rm 20180720_2012.tar.gz
rm 20180721_0012.tar.gz
rm 20180721_0112.tar.gz
rm 20180721_0212.tar.gz

On most systems you can omit the -1 option (that is the digit one; not the letter ell), but on some old systems the ls utility doesn't produce one name per line of output when output is directed to a pipe (as required by the standards).

Obviously, you can add a df on your source filesystem and check for the desired level of free space before or after each file is deleted or each time the date changes. Since your requirements aren't clear as to when this testing should be performed, I'll leave that as an exercise for the reader. (The output format produced by df also varies somewhat depending on what options you use and what operating system you're using. And, I'm not going to try to guess what OS you're using.)
This User Gave Thanks to Don Cragun For This Post:
# 5  
Old 08-29-2018
Hi Don Cragun,

Thanks for your reply. Apologies, I shall try to describe the requirement better. We are using RHEL 7.4 as the OS.

As for "Since your requirements aren't clear as to when this testing should be performed, I'll leave that as an exercise for the reader":

We get an alert from the network team that a particular node has high disk utilisation; at that point we manually log in to that box and perform this housekeeping (deleting files to free up space) to bring usage down to 60%. There is no need for the program to run automatically when disk usage is high. When we get the alert and execute the solution, it should delete files in ascending order of the date in the file name. For example, if we have files from 20180701 to 20180831, it should start with 20180701 (keep only the last copy for that day, delete the rest), then check whether disk usage has come down to 60%. If not, it should continue with the next date, 20180702 (keep only the last copy, delete the rest), and check df again. If usage is still above 60%, it takes the next date, 20180703, and so on, exiting as soon as the df check shows 60%. Some days later, if we are notified about disk space again and run the program, it should start from 20180704 and keep deleting until usage is back to 60%.

Apologies if it's still not clear.

Last edited by onenessboy; 08-29-2018 at 03:19 AM..
# 6  
Old 08-29-2018
Talking about simplifying, why not


Code:
ls -r1 2[0-9][0-9][0-9][01][0-9][0-3][0-9]_[0-2][0-9][0-5][0-9].tar.gz | awk -F_ 'T[$1]++ {print "echo rm " $0}'  | sh

This User Gave Thanks to RudiC For This Post:
# 7  
Old 08-29-2018
Hi onenessboy,
You can start by showing us the complete, exact output produced by the command:
Code:
df -P /backup

If the output from the above command doesn't complain about an unknown -P option, the percentage of space used on the filesystem containing /backup should be in field #5 on line #2 of the output from the above command.

If it does complain about an unknown -P option, show us the complete, exact output from the command:
Code:
df /backup

so we can figure out which field and line we need to examine to determine if you have reached your goal.
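Assuming df -P is accepted and prints the usual POSIX layout, the field described above (field #5 on line #2) could be read with a small awk filter. This is only a sketch: the function name pct_used_from_df is hypothetical, and it assumes the filesystem name contains no spaces, which would shift the field positions.

```shell
#!/bin/sh
# pct_used_from_df
# Read `df -P` output on stdin and print the capacity column
# (field 5 of line 2) with the trailing "%" removed.
# Hypothetical helper name; only `df -P` itself comes from the thread.
pct_used_from_df() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Usage:
#   used=$(df -P /backup | pct_used_from_df)
#   if [ "$used" -gt 60 ]; then echo "/backup is above 60%"; fi
```

Stripping the "%" leaves a plain integer, so the result can be compared with the shell's -gt and -le tests.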

Hi RudiC,
Why not:
Code:
ls -r1 2[0-9][0-9][0-9][01][0-9][0-3][0-9]_[0-2][0-9][0-5][0-9].tar.gz | awk -F_ 'T[$1]++ {print "echo rm " $0}'  | sh

Because that will remove old files from the most recent date first. And onenessboy wants to remove old backup files from the oldest date first. And he wants to add code to exit the script when the capacity on that filesystem drops below 60% after each date change when one or more backup files were removed for any given date.
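Putting those requirements together, a rough sketch of the oldest-first loop might look like the following. It is not a tested production script: the function name prune_oldest_first and the RM variable are hypothetical, the files are assumed to sit directly in the given directory with the name pattern used in this thread, and df -P is assumed to report the capacity in field 5 of line 2. By default it only echoes the rm commands.

```shell
#!/bin/sh
# prune_oldest_first DIR MOUNT LIMIT
# Walk the backups in DIR from the oldest date forward. For each date keep
# only the newest file. Each time the date changes after files were removed,
# re-check the usage of MOUNT and stop once it is at or below LIMIT percent.
# Hypothetical helper; the body runs in a subshell so the caller's working
# directory is untouched. RM defaults to "echo rm" (dry run).
prune_oldest_first() (
    dir=$1 mount=$2 limit=$3
    cd "$dir" || exit 1
    prev='' prev_day='' removed=0
    # The glob expands in sorted order, which is oldest first for these names.
    for f in 2[0-9][0-9][0-9][01][0-9][0-3][0-9]_[0-2][0-9][0-5][0-9].tar.gz; do
        [ -e "$f" ] || continue            # the glob matched nothing
        day=${f%%_*}
        if [ "$day" = "$prev_day" ]; then
            ${RM:-echo rm} "$prev"         # an earlier file from the same day
            removed=1
        else
            if [ "$removed" -eq 1 ]; then
                used=$(df -P "$mount" |
                       awk 'NR == 2 { sub(/%$/, "", $5); print $5 }')
                if [ "$used" -le "$limit" ]; then exit 0; fi
                removed=0
            fi
            prev_day=$day
        fi
        prev=$f
    done
)

# Usage (dry run, then for real):
#   prune_oldest_first /backup /backup 60
#   RM=rm prune_oldest_first /backup /backup 60
```

In a dry run nothing is deleted, so the usage reported by df never drops and the listing covers every date; the early exit only takes effect once RM=rm is set. Dates that already have a single file produce no rm commands, so a later run effectively resumes from the first date that still has several files.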
These 2 Users Gave Thanks to Don Cragun For This Post: