How to ignore incomplete files


 
# 1  
Old 02-13-2008

On Solaris & AIX, suppose there is a directory 'dir'.
Log files of size approx 1MB are continuously being
deposited here by scp command. I have a script that scans
this dir every 5 mins and moves away the log files that
have been deposited so far.

How do I design my script so that it picks up *only* those
files that have been completely deposited? For example,

[/mylogs] $ ls -l
-rwxr-xr-x 1 root bin 1124124 Jan 9 02:26 log3225
-rwxr-xr-x 1 root bin 1092534 Jan 9 02:33 log3228
-rwxr-xr-x 1 root bin 1130932 Jan 9 02:39 log3230
-rwxr-xr-x 1 root bin 369644 Jan 9 02:46 log3235

the file 'log3235' has not completely been deposited yet.


- We are using rsync to synchronise this directory to 4 other servers, and we don't want to copy the incomplete files. Is there any way to ignore those?

Any help will be much appreciated.

Kind Regards
# 2  
Old 02-14-2008
Three solutions to your problem come to mind:

1) If you are ABSOLUTELY sure that every completely transferred file is larger than, for example, 1000000 bytes, you can select only the files you're interested in with a simple ls/awk script that checks the file size.

2) You can check whether a file is still being transferred by running the "fuser" command on it and checking whether one or more processes are accessing it. If so, the file is incomplete.

3) You transfer an empty "flag" file after the real data file has been transferred to the destination. That way you pick up only the files that have a corresponding flag file and ignore all the others. I think this is the best and most reliable solution (or at least the one I prefer and regularly use for things like this).
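A minimal sketch of options 1 and 2 in POSIX sh, not a definitive implementation. The function names and the MIN_SIZE variable are made up for illustration, and the 1000000-byte threshold is just the example figure from option 1. It assumes fuser writes the PIDs of the processes using the file to standard output, as POSIX specifies, and that the script has enough privilege to see the receiving scp process.

```shell
#!/bin/sh
# Hypothetical helpers for options 1 and 2; all names are illustrative.

# Assumed lower bound (bytes) for a completely transferred log file.
MIN_SIZE=${MIN_SIZE:-1000000}

# Option 1: succeed if the file is at least MIN_SIZE bytes long.
is_big_enough() {
    [ -f "$1" ] || return 1
    # Some wc implementations pad the count with blanks; strip them.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -ge "$MIN_SIZE" ]
}

# Option 2: return 0 if some process has the file open,
# 1 if none does, 2 if fuser is not available (cannot tell).
is_in_use() {
    command -v fuser >/dev/null 2>&1 || return 2
    # PIDs go to stdout; file names and access codes go to stderr.
    pids=$(fuser "$1" 2>/dev/null | tr -d ' \n\t')
    [ -n "$pids" ]
}

# Print the files in directory $1 that pass both checks.
list_complete() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        is_big_enough "$f" || continue
        is_in_use "$f"
        # Keep the file only when fuser positively reported "not open".
        [ $? -eq 1 ] || continue
        printf '%s\n' "$f"
    done
}
```

Something like `list_complete /mylogs` would then print only the candidates. Note that a stalled or aborted transfer leaves a file that no process has open, so the fuser check alone can't tell it apart from a complete one; adding the size check narrows that gap but doesn't close it.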
# 3  
Old 02-14-2008
Thanks for the details. We have thought about all these options. Since we do the scp using a wildcard, option 3 (the flag file) is not possible.

Anyway, do you know of any rsync option to ignore the incomplete files when syncing from one server to multiple ones?
# 4  
Old 02-14-2008
Unfortunately I've never used rsync, but I think no program could determine if a file is incomplete or not. An "incomplete file" is a concept which implies knowing the contents and the meaning of the files involved in the transfer.

I think you could easily implement the third solution even if you use wildcards. Simply "expand" the wildcards before sending and generate a flag file for every entry. Then, after transferring the data files (with wildcards), copy all the flag files.

You could for example generate empty files called:

log3225.flag
log3228.flag
log3230.flag
...

and so on, and then transfer all the *.flag files to the remote site.
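Here is a rough sketch of the flag-file idea in POSIX sh, offered as an illustration only. The function names make_flags and list_flagged, the host name, and every path in the comments are hypothetical. The key assumption is that the wildcard is expanded once, so the set of flag files matches the set of data files actually sent.

```shell
#!/bin/sh
# Hypothetical helpers for the flag-file approach; names are illustrative.

# Sender side: create an empty NAME.flag in flag_dir for each data file.
# Usage: make_flags FLAG_DIR FILE...
make_flags() {
    flag_dir=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue
        : > "$flag_dir/$(basename "$f").flag"
    done
}

# Receiver side: print every data file in directory $1 whose
# NAME.flag is also present; files without a flag are ignored.
list_flagged() {
    for flag in "$1"/*.flag; do
        [ -f "$flag" ] || continue
        data=${flag%.flag}
        if [ -f "$data" ]; then
            printf '%s\n' "$data"
        fi
    done
    return 0
}

# Possible sender sequence (hypothetical host and paths):
#   set -- /var/out/log*                  # expand the wildcard once
#   scp "$@" user@host:/mylogs/           # data files first
#   make_flags /var/out/flags "$@"
#   scp /var/out/flags/*.flag user@host:/mylogs/   # flags last
```

On the receiving side the scanning script would loop over the output of `list_flagged /mylogs` and remove each flag together with its data file once the file has been moved away.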
# 5  
Old 02-14-2008
Hi.

I would let rsync handle the details. If a file is "incomplete" in one period, then rsync will copy as much as it can. Then chances are good that it will be complete in the next, and rsync will finish it.

My understanding of the design of rsync is that it transfers a minimum of data, so that you'll be transferring about the same amount of data regardless of what rsync does - transfers it all at once or in pieces ... cheers, drl
# 6  
Old 02-14-2008
That's the problem. When rsync copies an incomplete file, the target server picks up the incomplete file and processes it, which we don't want. That's why we need some way to stop the incomplete files from being picked up by rsync!
# 7  
Old 02-14-2008
Hi.

Unless there is a prior agreement between the creating process and the rest of the universe, I don't think there is any method to guarantee that a file is "complete".

I would attempt to address this by having the creating process set some completion flag, create an additional "unlocked-now" file, etc., perhaps similar to what robotronic suggested. You might investigate ownership of the files; placing the files in a holding area until the next one begins to be created, and then moving the assumed-to-be-complete file to the transfer directory; writing a wrapper script around the creating process to create the completion signal; and methods along those lines.
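The holding-area idea could be sketched roughly like this in POSIX sh. This is an illustration under assumptions, not a tested production script: the function name promote_older and the directory names are made up, files are assumed to arrive one at a time, the holding directory is assumed to contain only regular files with ordinary names (no newlines), and both directories are assumed to be on the same filesystem so that mv is a rename rather than a copy.

```shell
#!/bin/sh
# Hypothetical helper for the holding-area approach; names are illustrative.

# Move every file except the most recently modified one from the
# holding directory to the ready directory. The newest file is
# assumed to be the one still being written.
# Usage: promote_older HOLDING_DIR READY_DIR
promote_older() {
    holding=$1
    ready=$2
    # ls -t sorts newest first; take the first name only.
    newest=$(ls -t "$holding" | head -n 1)
    [ -n "$newest" ] || return 0          # empty directory, nothing to do
    for f in "$holding"/*; do
        [ -f "$f" ] || continue
        if [ "$(basename "$f")" = "$newest" ]; then
            continue                      # leave the newest file alone
        fi
        mv "$f" "$ready"/
    done
    return 0
}
```

The scanning script would call something like `promote_older /mylogs/holding /mylogs/ready` every 5 minutes, and rsync would be pointed at the ready directory only. The catch is that the last file of a batch stays in the holding area until another file arrives after it.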

I'll be interested in additional comments ... cheers, drl