The UNIX and Linux Forums  

  #1  
Old 02-13-2008
Registered User
 

Join Date: Nov 2006
Posts: 11
How to ignore incomplete files

On Solaris & AIX, suppose there is a directory 'dir'.
Log files of size approx 1MB are continuously being
deposited here by scp command. I have a script that scans
this dir every 5 mins and moves away the log files that
have been deposited so far.

How do I design my script so that it picks up *only* those
files that have been completely deposited? For example,

[/mylogs] $ ls -l
-rwxr-xr-x 1 root bin 1124124 Jan 9 02:26 log3225
-rwxr-xr-x 1 root bin 1092534 Jan 9 02:33 log3228
-rwxr-xr-x 1 root bin 1130932 Jan 9 02:39 log3230
-rwxr-xr-x 1 root bin 369644 Jan 9 02:46 log3235

the file 'log3235' has not completely been deposited yet.


- We are using rsync to synchronise this directory to 4 other servers, and we don't want to copy the incomplete files. Is there any way to ignore them?

Any help will be much appreciated.

Kind Regards
  #2  
Old 02-14-2008
robotronic
Can I play with madness?
 

Join Date: Apr 2002
Location: Italy
Posts: 370
For solving your problem, 3 solutions come to my mind:

1) If you are ABSOLUTELY sure that all the transferred files are larger than, for example, 1000000 bytes, you can select only the complete files with a simple ls/awk script that checks the file size.

2) You can check whether a file is still being transferred by running the "fuser" command on it and checking whether one or more processes are accessing it. If so, the file is incomplete.

3) Transfer an empty "flag" file after the real data file has been transferred to the destination. That way you can pick up only the files that have a corresponding flag file and ignore all the others. I think this is the best and most reliable solution (or at least the one I prefer and regularly use for things like this).
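The receiving side of option 3 could be sketched roughly as below. This is only an illustration, assuming the markers are named "<name>.flag" as in the examples later in this thread; the function name list_complete is made up for the example.

```shell
#!/bin/sh
# Sketch: print the names of data files in a directory that have a
# matching "<name>.flag" marker, i.e. files the sender has declared
# complete. The ".flag" suffix is an assumed naming convention and
# list_complete is a hypothetical helper name.
list_complete() {
    dir=$1
    for flag in "$dir"/*.flag; do
        [ -e "$flag" ] || continue      # the glob matched nothing
        data=${flag%.flag}              # log3225.flag -> log3225
        if [ -f "$data" ]; then
            basename "$data"            # report only the data file name
        fi
    done
}
```

The printed names can drive whatever does the pick-up. For example, `list_complete /mylogs | rsync -a --files-from=- /mylogs/ otherhost:/mylogs/` would restrict rsync to the flagged files (the host name is a placeholder; `--files-from=-` makes rsync read the file list from standard input).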
  #3  
Old 02-14-2008
Registered User
 

Join Date: Nov 2006
Posts: 11
Thanks for the details. We have thought about all these options. Since we run scp with a wildcard, option 3 (the flag file) is not possible.

Anyway, do you know of any rsync option to ignore the incomplete files when you rsync from one server to multiple servers?
  #4  
Old 02-14-2008
robotronic
Can I play with madness?
 

Join Date: Apr 2002
Location: Italy
Posts: 370
Unfortunately I've never used rsync, but I think no program could determine if a file is incomplete or not. An "incomplete file" is a concept which implies knowing the contents and the meaning of the files involved in the transfer.

I think you could easily implement the third solution even if you use wildcards. Simply "expand" the wildcards before sending and generate a flag file for every entry. Then, after sending the data files (with wildcards), copy all the flag files.

You could for example generate empty files called:

log3225.flag
log3228.flag
log3230.flag
...

and so on, and then transfer all the *.flag files to the remote site.
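On the sending side, generating the flag files might look something like this. It is a minimal sketch, not a tested production script: make_flags is a hypothetical helper, and the host and paths in the comments are placeholders.

```shell
#!/bin/sh
# Sketch of the sending side: create an empty "<name>.flag" marker for
# each file passed as an argument. make_flags is a hypothetical name.
make_flags() {
    for f in "$@"; do
        case $f in
            *.flag) continue ;;     # never flag a flag file
        esac
        [ -f "$f" ] || continue     # skip unmatched globs and non-files
        : > "$f.flag"               # create (or truncate) an empty marker
    done
}

# Possible order of operations (host and paths are placeholders):
#   make_flags /outgoing/log*
#   scp /outgoing/log*[0-9] remote:/mylogs/    # data files first
#   scp /outgoing/*.flag    remote:/mylogs/    # flag files last
```

Sending the flags only after the data files have finished is what makes the marker meaningful: a data file that arrives without its flag is simply left for the next run.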
  #5  
Old 02-14-2008
drl
Registered User
 

Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 546
Hi.

I would let rsync handle the details. If a file is "incomplete" in one period, then rsync will copy as much as it can. Then chances are good that it will be complete in the next, and rsync will finish it.

My understanding of the design of rsync is that it transfers a minimum of data, so that you'll be transferring about the same amount of data regardless of what rsync does - transfers it all at once or in pieces ... cheers, drl
  #6  
Old 02-14-2008
Registered User
 

Join Date: Nov 2006
Posts: 11
That's the problem. When rsync copies an incomplete file, the target server picks it up and processes it, which we don't want. That's why we need some way to stop incomplete files from being picked up by rsync!
  #7  
Old 02-14-2008
drl
Registered User
 

Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 546
Hi.

Unless there is a prior agreement between the creating process and the rest of the universe, I don't think there is any method to guarantee that a file is "complete".

I would attempt to address this by having the creating process set some completion flag, create an additional "unlocked-now" file, etc. -- perhaps similar to what robotronic suggested. You might investigate ownership of the files, placing the files in a holding area until the next one begins to be created and then moving the assumed-to-be-complete file to the transfer directory, writing a wrapper script around the creating process to create the complete signal, and methods along that line.
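The holding-area idea could be approximated with a heuristic like the one below: treat everything except the most recently modified file as complete. This is only a sketch under strong assumptions: a single scp session deposits files one at a time, modification times differ between files, and file names contain no whitespace. list_settled is a hypothetical name.

```shell
#!/bin/sh
# Heuristic sketch: print every entry in a directory except the most
# recently modified one, assuming only the newest file can still be in
# transit. list_settled is a hypothetical name; file names are assumed
# to contain no whitespace or newlines (true for names like log3235).
list_settled() {
    # ls -t sorts newest first; sed drops that first (newest) line.
    ls -t "$1" | sed '1d'
}
```

With the listing from the first post, this would report log3225, log3228 and log3230 and hold back log3235. It is not a guarantee: if the last transfer has already finished, the newest file is held back one extra cycle, and parallel transfers would break the assumption entirely.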

I'll be interested in additional comments ... cheers, drl
Tags
solaris
