How to find all files other than last two dates per month and year?


 
# 8  
Old 04-29-2014
Quote:
Originally Posted by Makarand Dodmis
till now i only get expected result by below script
Code:
nawk '$6!=m{m=$6; c=0} {if($7!=d){if(c++>n)print b p; b=x} else b=b p ORS} {p=$0; d=$7}' n=2

This was created by Scrutinizer, but frankly speaking I am not able to reuse it for this thread.

I also currently have a solution for this thread, but it takes 40-45 minutes and it includes for loops.

So if I get a better solution, that would be good.

So please show us your working solution.
# 9  
Old 04-29-2014
Code:
function criteria_purge
{
    tradecheck=`pwd`
    echo "Files eligible for purging...."
    $1| while read dm_file
    do
 
  day=`perl -MPOSIX -le 'print strftime "%d", localtime((lstat)[9]) for @ARGV' "$dm_file"`
        year=`perl -MPOSIX -le 'print strftime "%Y ", localtime((lstat)[9]) for @ARGV' "$dm_file"`
  mon=`perl -MPOSIX -le 'print strftime "%b", localtime((lstat)[9]) for @ARGV' "$dm_file"`        
        dday=`expr $day + 0`
        cntr=0
                                cntr2=0
                                flag1=0
                                flag2=0
 
         ls -ltr * |nawk -v mon=$mon -v year=$year '{if ($8 == year && $6 == mon) {print $9}}'| while read inner_file
         do
            imfile=`find $tradecheck -name $inner_file`
            i_day=`perl -MPOSIX -le 'print strftime "%d", localtime((lstat)[9]) for @ARGV' "$imfile"`
            i_dday=`expr $i_day + 0`
 
            purge_last "$dday" "$i_dday"
         done
         if [ $flag1 -eq 0 ] && [ $flag2 -eq 0 ]; then
         echo $dm_file $day $mon $year
         fi
    done
}
function purge_last
{
 if [ $1 -ge $2 ]; then
  flag1=1
 else
  if [ $cntr2 -eq 0 ]; then
   check_day=$2
   cntr2=`expr $cntr2 + 1`
  fi
 
     if [ ! $check_day -eq $2 ]; then
   cntr=`expr $cntr + 1`
  fi
 fi
 if [ $cntr -eq 0 ]; then
  flag2=1
 else
     flag1=0
     flag2=0
 fi
}
# purge the files which are 90 days older
cd /purge_dir
var3="find . ! -name -prune -type f -mtime +90"
criteria_purge "$var3"

Also, I am new to Unix, so I need some time and guidance to learn awk and perl.

Last edited by Makarand Dodmis; 04-29-2014 at 01:26 PM..
# 10  
Old 04-30-2014
Your script is slow because it is invoking several utilities (perl (multiple times), awk, and find) for each file it is processing.
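As a rough illustration of cutting down the per-file process count, the three perl calls that fetch the day, month, and year could be folded into one. This is only a sketch: file_dmy is a hypothetical helper name that does not appear in the original script, and it assumes perl with the POSIX module is available, as it is in the script above.

```shell
# Hypothetical helper (not in the original script): print "day month year"
# for each file operand using a single perl process.
file_dmy() {
    LC_ALL=C perl -MPOSIX -le '
        print strftime("%d %b %Y", localtime((lstat)[9])) for @ARGV' "$@"
}

# Possible use inside a loop: split the three fields with one "read".
# file_dmy "$dm_file" | while read day mon year; do ...; done
```

Passing many file names in one call would reduce the process count further, since perl then starts once for the whole batch.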

Although it invokes perl three times to get the month, day, and year for each file (and again for each file it is compared against), the awk statement that looks for a match on the month and year still compares field 8 of the ls -l output, which holds either a year or an hh:mm timestamp, against the year of the current file. Therefore, it misses purge candidates in months that contain days from 90 to 180 days ago, because ls prints a time of day rather than a year for those files. For example, in a directory that contains the files:
Code:
-rw-r--r--  2 dwc  staff     0 Oct 31 12:00 z.txt
-rw-r--r--  2 dwc  staff     0 Oct 30 13:00 z10.2.txt
-rw-r--r--  2 dwc  staff     0 Oct 30 12:00 z10.txt
-rw-r--r--  2 dwc  staff     0 Oct  1  2013 b.txt

your script will not list b.txt as a purge candidate.
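One way to see the year-or-timestamp problem in isolation is to normalize field 8 before comparing anything. The sketch below is an illustration only: ls_with_year is a hypothetical name, the two optional arguments (current year and current month number) exist just so the logic can be tried on fixed data, and it assumes no file has a modification time in the future. Like the awk code in this thread, it prints only field 9, so names containing blanks would be cut short. On Solaris, nawk or /usr/xpg4/bin/awk would stand in for awk.

```shell
# Hypothetical helper: read "ls -l" lines on stdin and print "<year> <name>".
# $1 and $2 (current year and month number) default to today.
ls_with_year() {
    awk -v cy="${1:-$(date +%Y)}" -v cm="${2:-$(date +%m)}" '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) mnum[names[i]] = i
    }
    NF > 8 {
        if ($8 ~ /:/)       # hh:mm means "within roughly the last six months"
            yr = (mnum[$6] > cm + 0) ? cy - 1 : cy
        else
            yr = $8         # older files already carry a year
        print yr, $9
    }'
}
```

With the listing above and a run date in April 2014, z.txt (Oct 31 12:00) and b.txt (Oct 1 2013) would both come out as 2013, so they would land in the same month and year group.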

---------

One of your find statements:
Code:
find . ! -name -prune -type f -mtime +90

is weird. Are you really trying to exclude a file named -prune? Were you, perhaps, trying to exclude files in subdirectories instead? That would be:
Code:
find . ! -name . -prune -type f -mtime +90

but it still won't work because you have another find statement nested inside the loop that doesn't ignore subdirectories. So, assuming that /purge_dir doesn't contain any subdirectories, you just need:
Code:
find . -type f -mtime +90

(Note that you can process directories with subdirectories as long as there aren't any files in the subdirectories with the same names as files in /purge_dir if you make the change suggested above.)
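The difference between the two find forms can be checked safely in a scratch directory. This is a minimal sketch: demo_find_prune and the file names in it are made up for the demonstration, it assumes mktemp is available, and it never touches /purge_dir.

```shell
# Hypothetical demo in a throw-away directory.
demo_find_prune() {
    d=$(mktemp -d) || return 1
    mkdir "$d/sub"
    : > "$d/top.txt"
    : > "$d/sub/nested.txt"
    (
        cd "$d" || exit 1
        echo "current directory only:"
        # "! -name . -prune" stops find from descending below "."
        find . ! -name . -prune -type f | sort
        echo "whole hierarchy:"
        find . -type f | sort
    )
    rm -rf "$d"
}
```

Calling demo_find_prune should list only ./top.txt under the first heading and both ./sub/nested.txt and ./top.txt under the second.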

Assuming that you are using a Solaris system (since your script contains nawk instead of awk) and that you're using an old Bourne shell (rather than ksh or bash, since you're using the `command` form of command substitution rather than $(command)), the following should work for you. In a test on a small directory with one subdirectory containing the files:
Code:
ls -lR
total 24
-rwxr-xr-x   1 dwc  staff  1512 Apr 29 16:07 Makarand.sh
-rw-r--r--   2 dwc  staff     0 Feb 21  2012 a.txt
-rw-r--r--   3 dwc  staff     0 Oct  1  2013 b.txt
-rw-r--r--   3 dwc  staff     0 Mar 19  2012 c.txt
-rw-r--r--   3 dwc  staff     0 Mar 21  2012 d.txt
-rw-r--r--   3 dwc  staff     0 Apr 12 01:02 e.txt
-rw-r--r--   3 dwc  staff     0 Mar 22  2012 f.txt
-rw-r--r--   3 dwc  staff     0 Apr 21 03:04 g.txt
-rw-r--r--   3 dwc  staff     0 Mar 24  2012 h.txt
-rw-r--r--   3 dwc  staff     0 Apr 22 05:06 i.txt
-rw-r--r--   2 dwc  staff     0 Feb 27  2012 j.txt
-rw-r--r--   2 dwc  staff     0 Feb 23  2012 k.txt
-rw-r--r--   3 dwc  staff     0 Apr 23 07:08 m.txt
-rw-r--r--   3 dwc  staff     0 Apr 27 09:10 n.txt
-rw-r--r--   1 dwc  staff  2636 Apr 29 10:01 problem
-rw-r--r--   2 dwc  staff     0 Feb 12  2012 q.txt
-rw-r--r--   2 dwc  staff     0 Feb 22  2012 s.txt
drwxr-xr-x  16 dwc  staff   544 Apr 29 13:32 sub
-rwxr-xr-x   1 dwc  staff   832 Apr 29 16:43 tester
-rw-r--r--   3 dwc  staff     0 Mar  1  2013 y.txt
-rw-r--r--   3 dwc  staff     0 Oct 31 12:00 z.txt
-rw-r--r--   3 dwc  staff     0 Oct 30 13:00 z10.2.txt
-rw-r--r--   3 dwc  staff     0 Oct 30 12:00 z10.txt

./sub:
total 0
-rw-r--r--  3 dwc  staff  0 Oct  1  2013 b.txt
-rw-r--r--  3 dwc  staff  0 Mar 19  2012 c.txt
-rw-r--r--  3 dwc  staff  0 Mar 21  2012 d.txt
-rw-r--r--  3 dwc  staff  0 Apr 12 01:02 e.txt
-rw-r--r--  3 dwc  staff  0 Mar 22  2012 f.txt
-rw-r--r--  3 dwc  staff  0 Apr 21 03:04 g.txt
-rw-r--r--  3 dwc  staff  0 Mar 24  2012 h.txt
-rw-r--r--  3 dwc  staff  0 Apr 22 05:06 i.txt
-rw-r--r--  3 dwc  staff  0 Apr 23 07:08 m.txt
-rw-r--r--  3 dwc  staff  0 Apr 27 09:10 n.txt
-rw-r--r--  3 dwc  staff  0 Mar  1  2013 y.txt
-rw-r--r--  3 dwc  staff  0 Oct 31 12:00 z.txt
-rw-r--r--  3 dwc  staff  0 Oct 30 13:00 z10.2.txt
-rw-r--r--  3 dwc  staff  0 Oct 30 12:00 z10.txt

the script:
Code:
#!/bin/sh
criteria_purge() {
	echo "Files eligible for purging...."
	ls -lt `$1` | /usr/xpg4/bin/awk -v cy=`date +%Y` '
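	# ls prints hh:mm instead of a year for files newer than about six months.
	# This table maps those months to a year and assumes a run date in January through June.
	BEGIN {	y["Jan"] = y["Feb"] = y["Mar"] = cy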
		y["Apr"] = y["May"] = y["Jun"] = cy
		y["Jul"] = y["Aug"] = y["Sep"] = cy - 1
		y["Oct"] = y["Nov"] = y["Dec"] = cy - 1
	}
	NF > 8 {if(length($8) == 4)	# Do we have a year or a timestamp?
			yr = $8		#   year
		else	yr = y[$6]	#   timestamp
	}
	lmo != $6 || lyr != yr {
		dim = ld = 0
		lmo = $6
		lyr = yr
	}
	ld != $7 {
		dim++
		ld = $7
	}
	dim > 2 {
		printf("%s %s %s %s\n", $9, $7, $6, yr)
	}'
}

cd /purge_dir
# Uncomment one, and only one, of the following definitions for var3.
# Use following line to process files in current directory and subdirectories.
# var3="find . -type f -mtime +90"
# Use to process files in current directory only.
var3="find . ! -name . -prune -type f -mtime +90"
criteria_purge "$var3"

produces the output:
Code:
./b.txt 1 Oct 2013
./d.txt 21 Mar 2012
./c.txt 19 Mar 2012
./s.txt 22 Feb 2012
./a.txt 21 Feb 2012
./q.txt 12 Feb 2012

in about 0.02 seconds on an old MacBook Pro laptop, while your script (modified to use the same setting for var3) produces the output:
Code:
./a.txt 21 Feb 2012
./q.txt 12 Feb 2012
./s.txt 22 Feb 2012

in about 3.51 seconds.

If I switch the setting of var3 from:
Code:
var3="find . ! -name . -prune -type f -mtime +90"

to:
Code:
var3="find . -type f -mtime +90"

in both scripts, your script produces the output:
Code:
Files eligible for purging....
./a.txt 21 Feb 2012
./q.txt 12 Feb 2012
./s.txt 22 Feb 2012

in about 5.84 seconds, while the script above produces the output:
Code:
./b.txt 1 Oct 2013
./sub/b.txt 1 Oct 2013
./d.txt 21 Mar 2012
./sub/d.txt 21 Mar 2012
./c.txt 19 Mar 2012
./sub/c.txt 19 Mar 2012
./s.txt 22 Feb 2012
./a.txt 21 Feb 2012
./q.txt 12 Feb 2012

still in about 0.02 seconds. I believe the above script is producing the desired output.

However, the order of the output from the above script is sorted in decreasing date order instead of being sorted in increasing alphanumeric filename order. If you want the script above to print the results in alphanumeric order, change the line:
Code:
	}'

at the end of the awk script to:
Code:
	}' | sort

Doing that will add about another 0.01 seconds running time for the sample data shown.

If the argument list given to ls is too long, we can work on an alternative, but it won't be quite as fast.
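If the argument list does become too long for ls, one possible alternative is to send the find output through a pipe instead of through command substitution. The following is only a sketch under assumptions: purge_candidates is a hypothetical name, it uses a single perl process (perl is already used earlier in this thread) to stamp every path with its modification date, and it assumes that file names contain no newline characters. On Solaris, nawk or /usr/xpg4/bin/awk would stand in for awk.

```shell
# Hypothetical helper: list files under "$1" older than "$2" days, skipping
# the two most recent distinct dates in each year and month.
# Output lines have the form "YYYY MM DD path".
purge_candidates() {
    find "$1" -type f -mtime +"$2" |
    perl -MPOSIX -lne '
        # One perl process stamps every path read from stdin.
        print strftime("%Y %m %d", localtime((lstat)[9])), " $_"' |
    sort -r |
    awk '
        { ym = $1 " " $2 }
        ym != lym { days = 0; ld = ""; lym = ym }   # new year/month group
        $3 != ld  { days++; ld = $3 }               # new distinct date
        days > 2  { print }'
}
```

For example, purge_candidates /purge_dir 180 would print the candidates newest first. Because the dates come from lstat rather than from ls output, the year-or-timestamp ambiguity does not arise, and no argument list limit applies to the pipe.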
# 11  
Old 04-30-2014
Thanks, Don, for your comments.

1) I want to consider subdirectories.
2) I want to delete all files older than 180 days, so it is fine that b.txt is not in the list. The find command had 90 because testing is still going on and I forgot to change it; it should be 180.
3) Yes, the subdirectories contain around 2000 files, which is why it is taking 40-45 minutes.
4) I am using Solaris + ksh.
# 12  
Old 04-30-2014
Quote:
Originally Posted by Makarand Dodmis
Thanks, Don, for your comments.

1) I want to consider subdirectories.
2) I want to delete all files older than 180 days, so it is fine that b.txt is not in the list. The find command had 90 because testing is still going on and I forgot to change it; it should be 180.
3) Yes, the subdirectories contain around 2000 files, which is why it is taking 40-45 minutes.
4) I am using Solaris + ksh.

So did you try my suggestion with:
Code:
var3="find . -type f -mtime +180"

How long did it take? Or, did you hit an arg max limit on the ls -lt?

What were you trying to do in:
Code:
find . ! -name -prune -type f -mtime +90

with the operands shown in red?

In the future when you present problems like this, mention that you're working on a file hierarchy (rather than just files in a single directory). Knowing what we're trying to do makes life easier for all of us and will get you suggestions that apply to your situation MUCH faster.
# 13  
Old 05-01-2014
I have changed it to
Code:
find . -type f -mtime +180

but it is not saving much, as it is called only once in the script.

I am not hitting the arg max limit on the ls -lt.
# 14  
Old 05-01-2014
The question was how long does this script take:
Code:
#!/bin/ksh
function criteria_purge {
	print "Files eligible for purging...."
	ls -lt $($1) | /usr/xpg4/bin/awk -v cy=`date +%Y` '
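	# ls prints hh:mm instead of a year for files newer than about six months.
	# This table maps those months to a year and assumes a run date in January through June.
	BEGIN {	y["Jan"] = y["Feb"] = y["Mar"] = cy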
		y["Apr"] = y["May"] = y["Jun"] = cy
		y["Jul"] = y["Aug"] = y["Sep"] = cy - 1
		y["Oct"] = y["Nov"] = y["Dec"] = cy - 1
	}
	NF > 8 {if(length($8) == 4)	# Do we have a year or a timestamp?
			yr = $8		#   year
		else	yr = y[$6]	#   timestamp
	}
	lmo != $6 || lyr != yr {
		dim = ld = 0
		lmo = $6
		lyr = yr
	}
	ld != $7 {
		dim++
		ld = $7
	}
	dim > 2 {
		printf("%s %s %s %s\n", $9, $7, $6, yr)
	}'
}

cd /purge_dir
criteria_purge "find . -type f -mtime +180"
