Thanks cfajohnson. I will try to incorporate your recommendations.
However, in my real problem, the Oracle export utility (the process that creates the binary files) creates a file and writes data to it until it reaches 1 GB in size, then creates a new file. If we use parallelisation, it creates n files (one for each parallel process) and fills them. The final binary files could be, and probably would be, smaller than 1 GB.
My thought was to start the gzip function before the export utility and have it wait for files to gzip, i.e. only gzip files when there are more than n of them, where n is the parallel setting. So if parallel was set to 4, gzipping would only begin once a 5th file appears.
Thinking it through, I find it hard to identify which file the gzip program should compress. We can't simply gzip any file that is 1 GB in size, because the export could still be finishing its writes to that file. Could I use something like fuser to check whether the export tool has finished with a file? Perhaps some form of gzip loop that waits until fuser returns no PID for an export file and then gzips it? I have watched an export and can see that once the utility has finished writing a file it no longer locks it, so this could be feasible.
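Something like the following is a rough sketch of that looping idea, not a tested solution. The `*.dmp` file pattern, the directory argument and the function names are my own assumptions for illustration. It relies only on the exit status of fuser, which is zero when at least one process has the file open.

```shell
#!/bin/sh
# Sketch only: the *.dmp pattern and the function names are assumptions.

# Succeeds (status 0) if some process still has the file open.
# All fuser output is discarded; only the exit status is used.
file_in_use() {
    fuser "$1" >/dev/null 2>&1
}

# One pass over a directory: gzip every dump file that no process has open.
gzip_idle_files() {
    dir=$1
    for f in "$dir"/*.dmp; do
        [ -f "$f" ] || continue     # the pattern matched nothing
        if file_in_use "$f"; then
            continue                # still being written; retry on the next pass
        fi
        gzip "$f"                   # replaces $f with $f.gz
    done
}

# Hypothetical driver, left commented out:
#   your_export_command &                  # placeholder for the real export
#   export_pid=$!
#   while kill -0 "$export_pid" 2>/dev/null; do
#       gzip_idle_files /path/to/export_dir
#       sleep 30
#   done
#   gzip_idle_files /path/to/export_dir    # final pass after the export exits
```

Two caveats. fuser normally only sees processes it is allowed to inspect, so the script should run as the same user as the export (or as root). Also, if fuser is not installed the call fails with a non-zero status, which this sketch would wrongly read as "file is idle", so it is worth checking with `command -v fuser` before starting the loop.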
I would welcome your ideas.
Best Regards.