Help finding/adding up file size...
OK, I am new to UNIX and shell scripting, and I am trying to do the following in a Korn shell script:
I have a large data file that is loaded into a database. To make it more manageable, the file is split into parts based on a size parameter. The file contains many account records; the account number is in the 4th field, if that is of any help.

Each new account record set begins with an identifying record number in the first field. To split the file, I grep for that identifier and send the matching lines to a temporary file. Then I count the number of accounts in the temp file and divide it by the number of splits to determine how many accounts will be in each file. Then my script continues with its loading process.

My problem is that there is a maxsize for the splits. Sometimes a single account has thousands of records, which makes that one account very large. Since I am only splitting by the number of accounts, the script does not know that a large account may push a split over the limit and cause it to fail. Somehow, I need to add up the size of the records for each account and use that total to decide whether I need to create an extra split (if it hits the maxsize).

I'm not sure whether I should try to get an "average record size" and somehow use that, or whether there is something right in front of me, like wc, that will work. Any help is greatly appreciated. If this helps, here is a piece of the code:
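# Size of the data file in bytes (5th column of ls -l)
datafilesz=`ls -l "$datafile" | awk '{print $5}'`
# Total number of records (lines) in the data file
recsinfile=`grep -c '^' "$datafile"`
# Number of output files needed for the size limit
datasplits=$(( datafilesz / maxsize ))
datafiles=$(( datasplits + 1 ))
if (( datafiles > 1 )); then
    # Each account starts with a record beginning with 01
    grep -n '^01' "$datafile" > /tmp/datarecs.$$
    accounts=`grep -c '^' /tmp/datarecs.$$`
    # Despite the name, this is the number of accounts per output file
    recsperfile=$(( accounts / datafiles ))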
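One possible way to account for size rather than account count is to let awk add up the bytes of each account and start a new split whenever the next account would push the current one past the limit. The following is only a rough sketch under several assumptions that are not in my script: the function name plan_splits is made up, every record is one line, an account starts at a line beginning with 01, the data is single-byte text (so a record's size is its line length plus one for the newline), and the awk supports -v and user-defined functions (on some older systems that means nawk rather than awk).

```shell
# Hypothetical helper: print one line per split as
#   part_number first_line last_line bytes
# $1 = data file, $2 = maximum split size in bytes
plan_splits() {
    awk -v max="$2" '
        # Fold the account just finished into the current part,
        # opening a new part first if it would not fit.
        function close_account() {
            if (acct_bytes == 0) return
            if (part_bytes > 0 && part_bytes + acct_bytes > max) {
                print part, part_first, acct_first - 1, part_bytes
                part++
                part_first = acct_first
                part_bytes = 0
            }
            part_bytes += acct_bytes
            acct_bytes = 0
        }
        BEGIN { part = 1; part_first = 1; acct_first = 1 }
        # A line starting with 01 begins a new account (assumption)
        /^01/ { close_account(); acct_first = NR }
        # Record size = characters on the line + 1 for the newline
        { acct_bytes += length($0) + 1 }
        END {
            close_account()
            if (part_bytes > 0) print part, part_first, NR, part_bytes
        }
    ' "$1"
}
```

Called as plan_splits "$datafile" "$maxsize", each output line gives a line range that could then be extracted with something like sed -n "${first},${last}p". Accounts are never broken across parts in this sketch, so a single account that is larger than maxsize on its own still ends up in one oversized part and would need separate handling.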