Help finding/adding up file size...

# 1  
Old 02-13-2008
OK, I am new to UNIX and shell scripting, and I am trying to do the following in a Korn shell script.

I have a large data file that is loaded into a database. To make it more manageable, the file is split into parts based on a size parameter. The file contains many account records; the account number is in the 4th field, if that is of any help.

To split it, I grep to a temporary file. Each new account record set begins with an identifying record number in the first field, so the grep looks for that and sends those lines to a temp file. I then count the number of accounts in the temp file and divide that by the number of splits to determine how many accounts go in each file. Then my script continues with its loading process.

My problem is that there is a maxsize for the splits. Sometimes an account has thousands of records, which makes that one account very large. Since I am only splitting by the number of accounts, the script does not know that a large account may push a split over the limit and cause it to fail. Somehow I need to add up the size of the records for each account and use that to decide whether I need to create an extra split (if it hits the maxsize).

I'm not sure whether I should try to get an "average record size" and use that, or whether there is something right in front of me, like wc, that will work. Any help is greatly appreciated. If this helps, here is a piece of the code:

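# Size of the data file in bytes, and its total record (line) count
datafilesz=$(ls -l "$datafile" | awk '{print $5}')
recsinfile=$(grep -c '^' "$datafile")

# Number of files needed to keep each one under maxsize
datasplits=$(( datafilesz / maxsize ))
datafiles=$(( datasplits + 1 ))

if (( datafiles > 1 )); then
    # Records starting with 01 begin a new account; -n keeps their line numbers
    grep -n '^01' "$datafile" > /tmp/datarecs.$$
    accounts=$(grep -c '^' /tmp/datarecs.$$)

    # Accounts per split file (ignores how large each account is)
    recsperfile=$(( accounts / datafiles ))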
# 2  
Old 02-13-2008
I read this and think "oh, how I have been there"... It's very hard to help without seeing some data, obviously. But firstly, when I used to do this type of thing, I found it somewhat easy on the stomach to transform the data into XML, validate it against a DTD, and then export it in a loader-friendly format in whatever discrete manner you need (multiple batches for large accounts).

But sure, you could loop through the file, make a file for each account, then loop through each account file and make them conform to your loader-friendly size. Perl to the rescue, IMO.
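
For what it's worth, the same looping idea can probably be done in one pass with awk instead of Perl, without creating a file per account. What follows is only a rough sketch under some assumptions that are not confirmed by the original post: every account starts with a line beginning with 01 (as the grep in the question suggests), records are newline-terminated, and a POSIX awk (nawk on older systems) is available. The function name split_by_size and the output naming prefix.N are made up for illustration.

```shell
# Sketch: split a data file into parts of at most maxsize bytes without
# ever breaking an account across two parts.
# Hypothetical names: split_by_size, and the <prefix>.<n> output files.
# Assumption: a line starting with "01" marks the start of a new account.

split_by_size() {
    infile=$1 maxsize=$2 prefix=$3

    # LC_ALL=C makes awk's length() count bytes rather than characters.
    LC_ALL=C awk -v max="$maxsize" -v prefix="$prefix" '
        # Write the buffered account to the current part, opening a new
        # part first if this account would push the part past max.
        function flush_account() {
            if (acctsize == 0) return
            if (part == 0 || partsize + acctsize > max) {
                if (part > 0) close(out)
                part++
                out = prefix "." part
                partsize = 0
            }
            printf "%s", buf > out
            partsize += acctsize
            buf = ""
            acctsize = 0
        }

        # A new account begins: emit the one collected so far.
        /^01/ { flush_account() }

        # Buffer every record and add its size (+1 for the newline).
        { buf = buf $0 "\n"; acctsize += length($0) + 1 }

        # Emit the last account and close the final part.
        END { flush_account(); if (part > 0) close(out) }
    ' "$infile"
}
```

It could be called as something like split_by_size "$datafile" "$maxsize" /tmp/datapart.$$ and the resulting parts fed to the loader one by one. Note that an account which is larger than maxsize on its own still ends up in a single oversized part, so that case would need separate handling.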