The UNIX and Linux Forums  

  #1  
04-06-2006, mph
Building a better mouse trap, or How many lines of code does it take to trap a mouse?

Hello all,

I'm hoping to get a little insight from some of the wily veterans amongst you.

I've written a script that checks our SSL server for new outgoing files for our vendors. It seems to be working OK, but my question is about the logic, and whether there's a better way to do it.

First, a little background: the program is run every 5 minutes from cron. The files are uploaded via NFS or CIFS, so file dates can't be fully trusted, and I use find -cmin for the dates instead. Files remain on the server for 10 days.

Process:
1) Check for a PID file. If the PID file exists, exit (the program is still running). If not, generate the PID file.

2) Check the filesystem size for changes since the last run. If there are no changes, clean up the PID file and exit (no new files). If it changed, sleep 1 minute (files may still be transferring) and loop until the changes stop. Add the total sleep time to the find time, then continue to step 3 (transfer done).

3) Using the find command, build a file containing the list of files in the ftp directory newer than the specified -cmin time.

4) Filter through the file built in step 3. Generate an email for each vendor with the file names and send it to the contact for that vendor.

5) Clean up the PID file. Copy the stat files to backups for comparison on the next program run. Exit.
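
Steps 1 and 5 above could be sketched roughly as follows. This is only an illustration, not the script from this post: the PIDFILE path and the function names acquire_lock and release_lock are made up, and the kill -0 test for a stale PID file is an addition that the steps above do not mention.

```shell
#!/bin/sh
# Hypothetical location; the post does not say where the PID file lives.
PIDFILE="${PIDFILE:-/tmp/outgoing_check.pid}"

# Step 1: take the lock, or return 1 if a live run already holds it.
acquire_lock() {
    if [ -f "$PIDFILE" ]; then
        old_pid=$(cat "$PIDFILE" 2>/dev/null)
        # kill -0 sends no signal; it only tests whether the process exists.
        if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
            return 1
        fi
        # Stale file left by a crashed run: fall through and overwrite it.
    fi
    echo $$ > "$PIDFILE"
}

# Step 5: remove the PID file.
release_lock() {
    rm -f "$PIDFILE"
}

# Typical use near the top of the cron script:
#   acquire_lock || exit 0
#   trap release_lock EXIT
```

The check and the write are two separate steps, so two runs starting at the same instant could both get through. With a 5 minute cron interval that is unlikely, but it is a known limit of this approach.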

Like I said, this is working, but a few files slip through the cracks.

What I would like to know is whether you have any thoughts on better ways to do this.

One idea I've been looking into is:
Generate a full file list every 5 minutes and use diff to generate the outgoing file list.
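
A minimal sketch of that list-comparison idea, under a few assumptions: the function names list_files and new_files are hypothetical, file names are assumed to contain no newlines, and comm is used here in place of diff because it prints the added lines directly when both lists are sorted.

```shell
# list_files DIR: print every regular file under DIR, sorted so that the
# output can be fed to comm. LC_ALL=C keeps the sort order byte-wise.
list_files() {
    find "$1" -type f | LC_ALL=C sort
}

# new_files PREV CURR: print the lines that are in list CURR but not in
# list PREV. comm -13 suppresses column 1 (only in PREV) and column 3
# (in both), leaving only the additions.
new_files() {
    LC_ALL=C comm -13 "$1" "$2"
}
```

Each run would write the current list to a file, compare it with the list saved by the previous run, and then keep the current list for next time. Since no timestamps are involved, clock differences between clients and the server do not matter.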

Also, this started out as a small server, so checking for filesystem changes was no problem. Now I have roughly 180 vendors accessing the site, and with all the changes to the filesystem size the program will sometimes run for 15 to 20 minutes. Regardless of how the list is built, I would think that once it is generated I could just check the sizes of those files for changes. Once they finish transferring, generate the mail, and wait for the next go-round to pick up additional files.
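
Checking only the listed files for size changes might look something like this. It is a sketch under assumptions: file_size and wait_until_stable are hypothetical names, the list file holds one path per line, and "finished transferring" is taken to mean "same size on two consecutive checks".

```shell
# file_size FILE: print the size in bytes. wc -c is used because it is
# POSIX; stat(1) options differ between systems. Some wc implementations
# may read the whole file to count it.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# wait_until_stable LIST [SECONDS]: return once every file named in LIST
# reports the same size on two consecutive checks, SECONDS apart
# (default 60). Files that have disappeared are skipped.
wait_until_stable() {
    list=$1
    interval=${2:-60}
    prev=""
    first=1
    while :; do
        curr=$(while IFS= read -r f; do
                   [ -f "$f" ] && printf '%s %s\n' "$(file_size "$f")" "$f"
               done < "$list")
        if [ "$first" -eq 0 ] && [ "$curr" = "$prev" ]; then
            return 0
        fi
        first=0
        prev=$curr
        sleep "$interval"
    done
}
```

The loop has no upper limit, so a file that never stops growing would hold up the run. A real script would probably want a maximum number of checks.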

So what's the general consensus? Thoughts, Ideas, Opinions?

Thanks in Advance,
MPH

I'd rather have a bottle in front of me than a frontal lobotomy.
  #2  
04-06-2006, Perderabo
Not sure that I understand. Is this one directory or a directory tree? How do the files get removed? Anyway...

I would loop through all the files getting name and size (if date cannot be trusted, ignore it). Add name and size to a little database somewhere, timestamping this addition. Or if the entry is present, update size and timestamp. Then loop through database and find entries with old timestamps; process these; remove from database and directory (removal not possible? --- mark as processed in the database.)
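
One possible reading of that suggestion, sketched with a tab-separated flat file as the "little database". Everything named here is an assumption for illustration: the function names update_db, ready_entries and mark_done, the four-column layout, and the choice to refresh the timestamp only when the size changes. Paths are assumed to contain no tabs, newlines or backslashes.

```shell
# Database line format (tab-separated): path, size, last-change time in
# epoch seconds, state. state is "new" until the file has been reported.

# update_db DB NOW < list-of-paths
# Rewrites DB with one entry per listed file. An entry keeps its timestamp
# and state while the size is unchanged; a new or resized file is stamped
# NOW and marked "new". Files no longer listed drop out of the database.
update_db() {
    db=$1 now=$2
    while IFS= read -r f; do
        [ -f "$f" ] && printf '%s\t%s\n' "$f" "$(wc -c < "$f" | tr -d ' ')"
    done |
    awk -F '\t' -v OFS='\t' -v now="$now" -v db="$db" '
        BEGIN {
            # Load the previous database, if there is one.
            while ((getline line < db) > 0) {
                split(line, a, "\t")
                size[a[1]] = a[2]; ts[a[1]] = a[3]; state[a[1]] = a[4]
            }
        }
        ($1 in size) && size[$1] == $2 { print $1, $2, ts[$1], state[$1]; next }
        { print $1, $2, now, "new" }
    ' > "$db.tmp" && mv "$db.tmp" "$db"
}

# ready_entries DB NOW SETTLE: print unreported paths whose size has not
# changed for at least SETTLE seconds.
ready_entries() {
    awk -F '\t' -v now="$2" -v settle="$3" \
        '$4 == "new" && now - $3 >= settle { print $1 }' "$1"
}

# mark_done DB PATH: flag an entry as processed instead of deleting the file.
mark_done() {
    awk -F '\t' -v OFS='\t' -v p="$2" \
        '$1 == p { $4 = "done" } { print }' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

Each cron run would feed the current file list to update_db, mail the vendors for whatever ready_entries prints, and call mark_done on those paths. No run ever has to sleep waiting for a transfer, because a file that is still growing simply is not ready until a later run.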
  #3  
04-06-2006, mph
Perderabo,

Quote:
Not sure that I understand. Is this one directory or a directory tree? How do the files get removed? Anyway...
This is a directory tree, /ftp. Under it are the users and their incoming and outgoing directories. Each user has their own directory for security reasons. Our customers don't want their data available to the wrong vendors.
Files get removed by another daily cron job that finds files older than 10 days. The dates can't be trusted down to the minute, but they are close enough in days, so find works fine for removing the old files. If a file is transferred via CIFS, it keeps the date it had before the transfer. That's why I use -cmin: it goes by the inode change time, which is set when the file arrives on the server, and it seems to work well. But I think that's where some files fall through. I had to set up ntp on the server because clock variations between the server and the clients were causing problems with file times. That's another reason to use the "find all files and diff them" logic.
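
To illustrate the difference between the two kinds of date, here is a small sketch. The function names are hypothetical, and the use of -mtime for the cleanup job is an assumption, since the post does not say which test that job uses. Note that -cmin is a GNU and BSD find extension, not POSIX.

```shell
# recent_by_ctime DIR MINUTES: files whose inode change time (ctime) is
# less than MINUTES minutes old. ctime is set by the server when the file
# is written, whatever modification date the client preserved.
recent_by_ctime() {
    find "$1" -type f -cmin "-$2"
}

# older_than_days DIR DAYS: files whose modification time (mtime) is more
# than DAYS days old, as a daily cleanup job might select them.
older_than_days() {
    find "$1" -type f -mtime "+$2"
}
```

A file copied with its original modification date preserved can match both tests at once: old by mtime, recent by ctime.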
Quote:
I would loop through all the files getting name and size (if date cannot be trusted, ignore it). Add name and size to a little database somewhere, timestamping this addition. Or if the entry is present, update size and timestamp. Then loop through database and find entries with old timestamps; process these; remove from database and directory (removal not possible? --- mark as processed in the database.)
This is similar to what (I guess) I was trying to say with the idea I was looking into. That is to say, find all the files under /ftp/*/outgoing and diff the list against the one built 5 minutes ago to find additions. Using the diffed file names, the "database" would simply be a temp file containing the name and size. Grep for the file, awk the $NF for the size, and compare until they're the same, sleeping for a bit between checks to avoid frantic looping. When the run is finished, delete the temp database. Removed files won't be an issue, since I'm only looking for files added between runs. If a file reappears, there's usually a good reason for it (corrupted IGES files, etc.) and the vendor should be re-notified.

I hope this makes sense. My fingers are too well connected to my brain.
  #4  
04-06-2006, Perderabo
Hmmmm... I gotta learn to leave these hyper-abstract problems alone.
  #5  
04-06-2006, mph
I knew I shouldn't have gone to the Picasso school of communication.