How to find list of missing files based on the file format?


 
# 1  
Old 05-17-2017

Hi All,

Our file names contain dates.
Based on the file format given by the user,
if no file exists for a particular date within the given interval, that file should be considered missing.

I have the files below in the directory /bin/daily/voda_files.

Code:
asr_spir_2017-05-10-150325_2017-05-10-112227_2017-05-13-112227.txt
adb_voda_2017-05-11-150325_2017-05-10-112227_2017-05-13-112227.txt
adb_voda_2017-05-14-150325_2017-05-11-112227_2017-05-10-112227.txt
adb_voda_2017-05-12-150325_2017-05-12-112227_2017-05-11-112227
adb_voda_2017-05-16_2017-04-30_2017-05-01.txt
adb_voda_20170510.txt
adb_voda_2017-05-10.txt
2017-05-11
2017-05-10.txt
2017-05-12

If the user enters

Code:
file_format=xxx_xxxx_YYYY-MM-DD-HHIISS_?????????????????_?????????????????.txt
prog_name="adb_voda_"
interval=10 (from current date -1 to 10 days back; it should be from 2017-05-08 to 2017-05-17)

In this case it should use the first date in the file name.
The missing files output should be
Code:
adb_voda_2017-05-08
adb_voda_2017-05-09
adb_voda_2017-05-10
adb_voda_2017-05-12
adb_voda_2017-05-13
adb_voda_2017-05-15
adb_voda_2017-05-16
adb_voda_2017-05-17

If the user enters

Code:
file_format=xxx_xxxx_?????????????????_?????????????????_YYYY-MM-DD-HHIISS.txt
prog_name="adb_voda_"
interval=10 (from current date -1 to 10 days back; it should be from 2017-05-08 to 2017-05-17)

In this case it should use the last date in the file name.
The missing files output should be
Code:
adb_voda_2017-05-08
adb_voda_2017-05-09
adb_voda_2017-05-11
adb_voda_2017-05-12
adb_voda_2017-05-14
adb_voda_2017-05-15
adb_voda_2017-05-16
adb_voda_2017-05-17

If the user enters

Code:
file_format=YYYY-MM-DD
prog_name=""
interval=10 (from current date -1 to 10 days back; it should be from 2017-05-08 to 2017-05-17)

It should consider only the files whose names consist of just YYYY-MM-DD.
The missing files output should be
Code:
2017-05-08
2017-05-09
2017-05-10
2017-05-13
2017-05-14
2017-05-15
2017-05-16
2017-05-17

Please help me with the script.

Thanks in advance.
# 2  
Old 05-18-2017
Given your detailed description, what have you tried and where exactly are you stuck?
# 3  
Old 05-18-2017
Hi,

I have tried this, but it's not working as expected.

Code:
file_format=xxx_xxxx_?????????????????_?????????????????_YYYY-MM-DD-HHIISS.txt
prog_name="adb_voda_"
interval=10

 #checking before . extension
YearFormat=$(echo $file_format | sed 's/X//g;s/x//g;s/_*$//g;s/^_*//g' | awk 'BEGIN{FS=".";}{for (i = 1; i <= NF; i++){if ( $i ~ /YY/ ){print $i;}}}' | sed 's/X//g;s/x//g;s/_*$//g;s/^_*//g' | head -1)
YearFormat_count=$(echo $YearFormat | wc -l)
echo "YearFormat-$YearFormat"
if [[ $YearFormat_count -lt 1 ]]; then
echo "enter proper year format"
exit 0
fi
 #exit is missing here
month_count=0;y_count=0;d_count=0;hour_count=0;min_count=0;sec_count=0
for (( i=0; i<${#YearFormat}; i++ )); do
year=$(echo "${YearFormat:$i:1}")
if [[ $year == 'Y' ]];then
y_count=`expr $y_count + 1`
fi
if [[ $year == 'M' ]];then
month_count=`expr $month_count + 1`
fi
if [[ $year == 'D' ]];then
d_count=`expr $d_count + 1`
fi
if [[ $year == 'H' ]];then
hour_count=`expr $hour_count + 1`
fi
if [[ $year == 'I' ]];then
min_count=`expr $min_count + 1`
fi
if [[ $year == 'S' ]];then
sec_count=`expr $sec_count + 1`
fi
done

 #Remove the duplicates
cleandateFormat=$(echo $YearFormat | tr -s 'A-Z') #YmD_HMS
echo $cleandateFormat
if [[ $y_count -eq 2 ]]; then
cleandateFormat=$(echo $cleandateFormat | sed 's/Y/y/g;s/M/m/g;s/D/d/g;s/I/M/g')
elif [[ $y_count -eq 4 ]]; then
cleandateFormat=$(echo $cleandateFormat | sed 's/M/m/g;s/D/d/g;s/I/M/g')
else
echo "enter correct year format"
exit 0
fi
 #maintain format including special character as separator
finalFormat=""
for (( i=0; i<${#cleandateFormat}; i++ )); do
f_year=$(echo "${cleandateFormat:$i:1}")
echo "char : $f_year"
if [[ $f_year == [a-zA-Z] ]];then
temp=$(echo $f_year | sed 's/^/%/g')
finalFormat="$finalFormat$temp"
echo "IF : $finalFormat"
else
finalFormat="$finalFormat$f_year"
fi
done

if [[ $check_mode -eq 1 ]]
then
missing_count=0
day_count=0
while [[ $interval -ne 0 ]]; do
finalFormat=$(echo "$finalFormat" | sed -r 's/[HMS]+//g;s/%*$//g;s/-*$//g;s/_*$//g')
start=`date +${finalFormat} -d "$interval day ago"`
IFS=$','
for path in ${file_path}; do
count=$(ls -l /bin/daily/voda_files/${prog_name}*${start}* 2>/dev/null | wc -l)
if [[ $count -gt 0 ]]
then
break
fi
done
unset IFS
if [[ $count -eq 0 ]]
then
missing_count=`expr $missing_count + 1`
file_name="${start}"
printf "$file_name\n"
fi
interval=`expr $interval - 1`
done >missingfiles.txt
fi

Please help me.
Thanks in advance.
# 4  
Old 05-18-2017
"its not working as expected" doesn't really answer vgersh99's Q: "where exactly are you stuck".
And in recent threads you received hints and examples of data and date manipulation that I don't see reflected in the code above. Were all those posts in vain?
# 5  
Old 05-18-2017
Your requirements here (as in your previous three threads on this topic) are confusing and incomplete.

I have no idea how you expect to match a filename to the second. The format strings you are using specify not only year, month, and day but also hour, minute, and second, and you then compare them to the year, month, and day (for the previous 10 days) and the hour, minute, and second at which you run your script. How will you guarantee that you are running your script at exactly 15:03:25 when you are looking for matches for the first dates in your filenames, and at exactly 11:22:27 when you are looking for matches for the last dates in your filenames?
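
To make the problem concrete, here is a small sketch, assuming bash and GNU date (the -d option already used in the script above). The run time of 09:30:00 is an assumption picked purely for illustration; any time other than 15:03:25 behaves the same way.

```shell
#!/bin/bash
# Assumed for illustration: the moment date resolves to for the day being
# checked, i.e. 2017-05-11 with the script started at 09:30:00.
run_time='2017-05-11 09:30:00'
stamp_in_name='2017-05-11-150325'   # first timestamp in one sample filename

# A format containing %H%M%S carries the run time into the result.
with_time=$(date -d "$run_time" +%Y-%m-%d-%H%M%S)
# Without the time fields only the day is compared.
day_only=$(date -d "$run_time" +%Y-%m-%d)

if [[ $stamp_in_name == "$with_time" ]]; then
    echo "full timestamp matches"
else
    echo "full timestamp differs: $with_time vs $stamp_in_name"
fi
if [[ $stamp_in_name == "$day_only"* ]]; then
    echo "day matches: $day_only"
fi
```

Only the day comparison succeeds, which is why the time fields need to be wildcards rather than date conversions.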

If your input filename samples:
Code:
asr_spir_2017-05-10-150325_2017-05-10-112227_2017-05-13-112227.txt
adb_voda_2017-05-11-150325_2017-05-10-112227_2017-05-13-112227.txt
adb_voda_2017-05-14-150325_2017-05-11-112227_2017-05-10-112227.txt
adb_voda_2017-05-12-150325_2017-05-12-112227_2017-05-11-112227
adb_voda_2017-05-16_2017-04-30_2017-05-01.txt
adb_voda_20170510.txt
adb_voda_2017-05-10.txt
2017-05-11
2017-05-10.txt
2017-05-12

are correct, and you want to match filenames starting with adb_voda_, containing the first date in each name (marked in red), and ending with .txt, it would seem that the format you feed into your script should be:
Code:
adb_voda_YYYY-MM-DD-??????_????-??-??-??????_????-??-??-??????.txt

which your code would then convert to the date format string:
Code:
adb_voda_%Y-%m-%d-??????_????-??-??-??????_????-??-??-??????.txt

and date would then create a pathname matching pattern from that, one that would match the file(s) you want to select for a given date without a prefix pattern and without the asterisks that have caused you problems in all of your previous threads (as well as in this thread).
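
A minimal sketch of that conversion, assuming bash and GNU date. The function name to_date_format is hypothetical, and the prefix is taken from the file listing above. It replaces only the YYYY, MM and DD tokens and leaves every ? and _ alone. Note the limitation: a literal prefix that itself contains YYYY, MM or DD would be rewritten too.

```shell
#!/bin/bash
# to_date_format FORMAT  (hypothetical helper, not part of any tool)
# Replace the date tokens in a user-supplied format with date(1)
# conversion specifications; everything else is passed through unchanged.
to_date_format() {
    local fmt=$1
    fmt=${fmt//YYYY/%Y}   # four-digit year
    fmt=${fmt//MM/%m}     # two-digit month
    fmt=${fmt//DD/%d}     # two-digit day of month
    printf '%s\n' "$fmt"
}

fmt=$(to_date_format 'adb_voda_YYYY-MM-DD-??????_????-??-??-??????_????-??-??-??????.txt')
printf '%s\n' "$fmt"

# date copies the ? and _ characters verbatim, so its output is a
# pathname pattern for the given day.
date -d '2017-05-11' "+$fmt"
```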

All of the code you have that strips off underscores, Xs, and question marks seems to be fighting against matching only the filenames you want to match.

Similarly, if you wanted to match the last date in those files (marked in blue), it would seem that you want the input format string to be:
Code:
adb_voda_????-??-??-??????_????-??-??-??????_YYYY-MM-DD-??????.txt

which your code would then convert to the date format string:
Code:
adb_voda_????-??-??-??????_????-??-??-??????_%Y-%m-%d-??????.txt
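
With a date format string like either of those, the search for missing days could be sketched as follows. This is only an illustration under stated assumptions: bash, GNU date, and a hypothetical function name list_missing with an optional end-date argument (added so the result is reproducible). It is not a drop-in replacement for your script.

```shell
#!/bin/bash
# list_missing DIR DATE_FMT DAYS [END_DATE]   (hypothetical helper)
# Print every day in the DAYS-day window ending at END_DATE (default:
# today) for which no file in DIR matches the pattern built from DATE_FMT.
list_missing() {
    local dir=$1 fmt=$2 days=$3 end=${4:-$(date +%F)}
    local i day pattern
    for (( i = days - 1; i >= 0; i-- )); do
        # GNU date relative syntax, same family as "N day ago" in the thread
        day=$(date -d "$end $i days ago" +%F) || return 1
        # The format string turns the day into a pathname pattern.
        pattern=$(date -d "$day" "+$fmt")
        # compgen -G returns success if at least one pathname matches.
        if ! compgen -G "$dir/$pattern" > /dev/null; then
            printf '%s\n' "$day"
        fi
    done
}

# Example call with the first-date format and the directory from post #1:
list_missing /bin/daily/voda_files \
    'adb_voda_%Y-%m-%d-??????_????-??-??-??????_????-??-??-??????.txt' \
    10 2017-05-17 > missingfiles.txt
```

Because the pattern is complete, no prog_name prefix and no asterisks are needed. To get the prefixed lines shown in the first post, print "adb_voda_$day" instead of "$day".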

# 6  
Old 05-18-2017
Hi don,

Thanks a lot for your response.


Could you please help me convert this
Code:
adb_voda_YYYY-MM-DD-??????_????-??-??-??????_????-??-??-??????.txt

to the date format string, as you said:

Code:
adb_voda_%Y-%m-%d-??????_????-??-??-??????_????-??-??-??????.txt

Thanks in advance.

---------- Post updated at 05:55 AM ---------- Previous update was at 04:30 AM ----------

Hi don,

I have written the code to convert it to the date format string, as below.

Code:
adb_voda_%Y-%m-%d-??????_????-??-??-??????_????-??-??-??????.txt

Now how can I search for the missing dates?
Could you please help me?

What changes do I have to make in the code below?

Code:
missing_count=0
day_count=0
while [[ $interval -ne 0 ]]; do
finalFormat=$(echo "$finalFormat" | sed -r 's/[HMS]+//g;s/%*$//g;s/-*$//g;s/_*$//g')
start=`date +${finalFormat} -d "$interval day ago"`
IFS=$','
for path in ${file_path}; do
count=$(ls -l /bin/daily/voda_files/${prog_name}*${start}* 2>/dev/null | wc -l)
if [[ $count -gt 0 ]]
then
break
fi
done
unset IFS
if [[ $count -eq 0 ]]
then
missing_count=`expr $missing_count + 1`
file_name="${start}"
printf "$file_name\n"
fi
interval=`expr $interval - 1`
done >missingfiles.txt

Please help me.
Thanks in advance.

Last edited by nalu; 05-18-2017 at 08:08 AM..
# 7  
Old 05-18-2017
What operating system (including release number) are you using?

What shell (including version number) are you using?

Do you have a ksh (version 93u+ or later) that I can use instead of whatever shell you're using to provide an example?