Thank you so much for your time. I will give this a try this evening after business hours. I will also cut the mtime down to 1 so it only processes 24 zipped files at once for the test run.
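For the record, this is roughly what I mean, assuming the script picks up the logs with find; the path and filename pattern below are just placeholders, not the real ones:

    # -mtime -1 restricts find to files modified within the last 24 hours,
    # i.e. roughly one day's worth of hourly log files
    find /var/log/firewall -name '*.gz' -mtime -1 -print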
Sorry for the small snippet of sample data. This is a production firewall that is generating the raw data so I was trying to be careful to only include non-identifying data in the sample.
As for the output to the files, I can tweak that to get exactly what I want if any of the fields are incorrect.
I believe I am beginning to understand, at a basic level, how this script interacts with the files. I do think you're correct that the majority of the time is spent repeatedly uncompressing and recompressing data unnecessarily.
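If I follow correctly, the gain comes from streaming each compressed file through the tools once with zcat (gzcat on some systems), instead of the gunzip/search/gzip round trip my version did. A rough sketch of the difference (the filename and search pattern are made up for illustration):

    # old approach: decompress to disk, search, then recompress -- two extra passes per file
    gunzip fw-hour-00.log.gz
    grep 'DENY' fw-hour-00.log >> results.txt
    gzip fw-hour-00.log

    # single-pass approach: stream the decompressed data straight into the search,
    # leaving the .gz file untouched on disk
    zcat fw-hour-00.log.gz | grep 'DENY' >> results.txt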
While the script runs, I will have a second connection open to the box running the top command to watch processor and memory usage and gauge the load being placed on the system before expanding the script to additional files.
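If it turns out to be useful, I may also log the readings rather than just watch them; something like this, assuming the box has a procps-style top (the interval and sample count are arbitrary, and the flags differ on BSD/Solaris):

    # batch mode: write a full snapshot every 10 seconds for about an hour
    top -b -d 10 -n 360 > ~/top_during_run.log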
I will let you know how it runs. Your script is far more elegant than my cobbled-together one. It just goes to show that just because "my" way works doesn't mean it is the best way to get things done.
---------- Post updated at 11:00 PM ---------- Previous update was at 02:26 PM ----------
The script you provided cuts the time by over 50%!
I do have a few tweaks to do to get the output correct but that is something I can easily handle.
I did run a few tests:
1. Copied the files down to a local directory (/home/kenneth.cramer/temp) and changed the script to search that directory, to see whether the files sitting on ZFS was having an impact on speed. I did not see any improvement in how quickly the script decompressed the files.
2. Copied the files to a local directory and unzipped them first. The search itself was faster, but the time taken to copy and uncompress the files balanced it out. There is no real gain unless I set up a timed script to copy down and uncompress the files before I need to run the search, so that approach is impractical.
3. Compared the sizes of the compressed vs. uncompressed files. Each file represents an hour's worth of data. Compressed, each file averages 54 MB; uncompressed, they average about 1 GB. Seven days at 24 files per day is 168 files, so taking an hour or so to sift through roughly 168 GB of data is not bad. The sheer size also makes it impractical to copy the files down and uncompress them just to run these few operations on them (a quick way to check the sizes is shown below).
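For anyone repeating the comparison, gzip can report both sizes without actually extracting anything; the path below is just an example:

    # -l lists compressed size, uncompressed size and ratio for each archive
    gzip -l /var/log/firewall/*.gz

    # rough weekly total from my averages:
    # 24 files/day x 7 days = 168 files; 168 x ~1 GB uncompressed = ~168 GB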
Thank you for all the help. I believe I can manage the last few tweaks from here and get the output into the format I need.
Thank you again for all your help.