Hello. I have a scripting query that I am stumped on which I hope you can help with.
Basically, I have a ksh script that calls a process to create n binary files, each with a maximum size of 1GB. The process can write n files at once (in parallel), based on the parallelisation parameter fed to the script at the start. Normally we would wait for this process to complete and then gzip all the files individually (gzip *.dmp, for example). However, on some systems we don't have enough disk space to wait until all of the 1GB files have been produced.
I have previously written some code to gzip the files in parallel (see below); however, I now need to gzip them in parallel whilst the first process runs. I need to be careful not to attempt to gzip any files currently being written (up to n, from the parallelisation parameter), so some sort of looping will be required. And I want to keep the option of parallel gzip if possible.
Can anyone help me write some code for this using standard Solaris 8/9/10 tools and the Korn shell? Perl commands should also be possible (version 5.6.1 is installed).
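For reference, a simplified sketch of the batch approach I've been using after the export finishes (the directory, file names, and throttle value here are invented placeholders, not the production script):

```shell
#!/bin/sh
# Simplified sketch (ksh-compatible): gzip dump files with at most N
# compressions running at once. N, the directory, and the demo files
# are placeholders.
N=2
DUMPDIR=/tmp/pargzip_demo
rm -rf "$DUMPDIR" && mkdir -p "$DUMPDIR"
for i in 1 2 3 4 5; do printf 'demo data' > "$DUMPDIR/exp0$i.dmp"; done

count=0
for f in "$DUMPDIR"/*.dmp
do
    gzip "$f" &                 # compress in the background
    count=$((count + 1))
    if [ "$count" -ge "$N" ]
    then
        wait                    # crude throttle: let the batch finish
        count=0
    fi
done
wait                            # catch any remaining background jobs
```

The limitation is obvious: this only runs once everything has been written, which is exactly what the disk space problem rules out.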
Not only is -1 unnecessary, but so is ls itself. Also, ls will break your script if there are any spaces in the filenames.
Use standard syntax:
Quote the variable, or your script will break if there are spaces in the filename (and there's no need for the parentheses):
What's wrong with:
[code]
for file in *.dmp
do
  gzip "$file"
done
[/code]
Quote:
Can anyone help me write some code for this using standard Solaris 8/9/10 tools and the Korn shell? Perl commands should also be possible (version 5.6.1 is installed).
Your code looks far more complicated than it needs to be.
It's not clear from your code how you tell whether a file is finished being written to so that you can compress it.
Do you have any control over the process that is writing the binary files?
Thanks cfajohnson. I will try to incorporate your recommendations.
However, for my real problem, the Oracle export utility (the process that creates the binary files) will create a file and then populate it with data until it reaches 1GB in size, then it will create a new file. If we use parallelisation, it will create n files (one for each parallel process) and fill them. The final binary files created could, and probably would, be less than 1GB.
My thought was to call the gzip function before the export utility and then have it wait for files to gzip, i.e. only gzip files once there are more of them than the parallel number n. So if parallel were set to 4, only start gzipping once a 5th file exists.
Thinking it through, I find it hard to identify which file the gzip program should compress: we can't simply zip files that have reached 1GB, as the utility could still be finishing its write to them. Could I use something like fuser to identify whether the export tool has finished with a file? Perhaps some form of looping gzip that waits for fuser to return no PID for an export file and then zips it? I have watched an export and can see that when the utility has finished writing a file it no longer locks it, so this should be feasible.
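Something along these lines is what I mean by the fuser check (the directory and file name are invented for illustration; on Solaris, fuser lives in /usr/sbin):

```shell
#!/bin/sh
# Illustrative sketch of the fuser idea: skip any dump file that some
# process still has open. Directory and file names are demo values.
DUMPDIR=/tmp/fuser_demo
rm -rf "$DUMPDIR" && mkdir -p "$DUMPDIR"
printf 'finished dump' > "$DUMPDIR/exp01.dmp"

in_use() {
    # fuser exits 0 when at least one process has the file open
    fuser "$1" >/dev/null 2>&1
}

for f in "$DUMPDIR"/*.dmp
do
    if in_use "$f"
    then
        echo "skipping $f: still being written"
    else
        gzip "$f"
    fi
done
```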
As soon as a new file is created, you can gzip the previous one.
Not the most enlightening statement but I understand what you mean.
I did some playing and found that the export program will initially create a file of 4KB, then stop using it while it builds up a list of objects to export. It then re-acquires a lock on the file and fills it up to the 1GB size.
I modified my code to use a du and fuser test to check that a file was bigger than 4KB and was not being used by any process. I find that if I use parallelism on the export, the program will create n files and then populate them; it may start with files 1, 2, 3 and 4, but when 1, 3 and 4 reach 1GB it creates 5, 6 and 7 to continue the parallel tasks, while file 2 is still not full (for whatever reason).
Files 1, 3 and 4 are now unused, but the gzip_func does not seem to want to gzip them until file 2 is also unused, which often might not be until the end of the export. Can you please have a look at the code below and see if you can spot an obvious error? I want the code to start gzipping as soon as the two tests pass, whether it can gzip only one file or up to n at a time. Any ideas?
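For what it's worth, the per-file test I'm aiming for looks something like this (the 4KB threshold comes from my observations above; the directory, file names, and sizes are faked for the demo, and the key point is that each file is judged independently, so a laggard like file 2 can't block the others):

```shell
#!/bin/sh
# Sketch: test each dump file on its own (size > 4KB and not in use),
# so one laggard file cannot hold up compression of the others.
# The demo fakes one finished 8KB file and one 4KB "header-only" file.
DUMPDIR=/tmp/percheck_demo
rm -rf "$DUMPDIR" && mkdir -p "$DUMPDIR"
dd if=/dev/zero of="$DUMPDIR/exp01.dmp" bs=1024 count=8 2>/dev/null
dd if=/dev/zero of="$DUMPDIR/exp02.dmp" bs=1024 count=4 2>/dev/null

eligible() {
    kb=$(du -k "$1" | awk '{print $1}')
    [ "$kb" -gt 4 ] || return 1              # still just the 4KB header
    fuser "$1" >/dev/null 2>&1 && return 1   # still open by the export
    return 0
}

for f in "$DUMPDIR"/*.dmp
do
    if eligible "$f"
    then
        gzip "$f" &     # background gzip keeps the parallel option
    fi
done
wait
```

In the real script this loop would itself repeat (with a sleep) until the export finishes, but the per-file test is the part my gzip_func seems to be getting wrong.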