Quote: Originally Posted by Perlbaby
Thanks RudiC and Don for reply .
Just trying to make it more precise for better understanding ( Dropping the insert idea into DB2 tables )
1.you want the number of lines in each selected file - 1,
I would like to take just the count of latest file based on timestamp and previous timestamp file count .
a) Compare both counts ( if latest file >= old file ) --> just assign FLAG=SUCCESS
b) Compare both counts ( if latest file < 10% of old file ) --> just assign FLAG=SUCCESS
c) Compare both counts ( if latest file < 0 ) --> just assign FLAG=FAIL
2.all of the files you want to process can be selected by the filename matching pattern /var/tmp/*/*.csv,
yes , file pattern will be filename_timestamp ( exam ARCH_20171202.csv)
Lets use only one path /var/tmp/
3.all filenames end with an eight digit date string in the format YYYYMMDD followed by the string .csv, and
yes
4.within each directory under /var/tmp, the string before the date string is a constant?
yes
5.Do you want the line count from each selected file in a directory or do you want the sum of the line counts from the selected files in all of the directories containing *.csv files?
Lets have only file counts . use only one path /var/tmp/
I would like to use python instead of unix/perl . Using windows 7.
let me know for any questions
You have now completely changed your original requirements and introduced some new questions.
Before, you wanted a single file from each subdirectory under /var/tmp; now we have to collect data from two files in each of those subdirectories. We now also have to compare results from those two files and create a flag. Furthermore, the flag is set to SUCCESS if the newest file has at least as many lines as the previous file or if the newest file contains less than 10% of the number of lines in the previous file (not counting the header line in either file), and the flag is set to FAIL if the newest file is an empty file. There is no indication of how the flag should be set if the number of lines in the newest file is greater than or equal to 10% of the number of lines in the previous file but less than the number of lines in the previous file. And there is no indication of whether there is another special case to be handled if the previous file is empty.
And, there is no indication of what is supposed to be done with that flag once it has been set. You need to show us the output you hope to produce with your new specifications (in CODE tags).
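To make those open questions concrete, here is a rough python sketch of the rules exactly as stated so far. It is only an illustration, not a solution: the function name set_flag and the "UNSPECIFIED" result are my own placeholders, and checking for an empty newest file before the other rules is an assumption, since your list does not say which rule wins when more than one applies.

```python
def set_flag(latest_lines, previous_lines):
    """Return the flag for two raw line counts (header line included).

    Hypothetical helper: the name, the argument order and the
    "UNSPECIFIED" value are placeholders, not part of the requirements.
    """
    # Counts "- 1": drop the header line, so an empty file gives -1.
    latest = latest_lines - 1
    previous = previous_lines - 1

    # Rule c) newest file is empty. ASSUMPTION: this rule is checked first.
    if latest < 0:
        return "FAIL"

    # Previous file is empty: no rule has been given for this case.
    if previous < 0:
        return "UNSPECIFIED"

    # Rule a) newest file has at least as many data lines as the previous one.
    if latest >= previous:
        return "SUCCESS"

    # Rule b) newest file has less than 10% of the previous file's data lines.
    # Integer arithmetic avoids floating point rounding.
    if latest * 10 < previous:
        return "SUCCESS"

    # From 10% up to (but not including) the previous count: no rule given.
    return "UNSPECIFIED"


if __name__ == "__main__":
    for pair in [(101, 51), (6, 101), (51, 101), (0, 51), (11, 0)]:
        print(pair, set_flag(*pair))
```

Every input that comes back as "UNSPECIFIED" is a case your description does not yet cover.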
In your first post in this thread you showed us an awk script that had been given to you earlier; it calculated the total number of lines (not counting header lines and assuming that every file contained at least a header line) in a group of one or more files. The awk code you showed us won't work with empty files, so even if you were willing to accept an awk solution to your problem, the code you provided couldn't be used as a template. But you now say that a shell script or an awk script is not allowed and that anyone wanting to help you must instead write it in python. I'm not proficient in python, so I won't be able to help you, but I think you will still need to answer the questions I've posed above for anyone else to be able to help you.