Best compression for log files?


 
# 1  
Old 03-06-2008

I have been investigating a log file from one of my systems, and the way I currently compress and rotate it. I am looking for something smarter than gzip, faster than bzip2, and able to match or beat "my script" (which is slow as heck, but gives WAY better compression ratios).

If 23 bytes per line are timestamp, that is roughly 33% of the file. If another 45% can be saved by a dictionary map/replace of the 25 most common phrases (I wrote a Python script to do the mapping), then there must be a compression program out there that can compress my logs without my needing to do this before a gzip, right? The bonus is that I wouldn't have to undo my changes on decompression either.
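The Python mapping script itself is not shown in the post, so the following is only a minimal sketch of what a reversible phrase map/replace could look like. The names `MARK`, `token`, `substitute` and `restore` are hypothetical, the sample phrases are made up, and the sketch assumes the control character `\x01` never occurs in the log text.

```python
MARK = "\x01"  # assumption: this control character never occurs in the log text


def token(i):
    """Two-character placeholder for phrase number i (enough for 25 phrases)."""
    return MARK + chr(ord("A") + i)


def substitute(text, phrases):
    """Replace each phrase with its short placeholder token."""
    for i, phrase in enumerate(phrases):
        text = text.replace(phrase, token(i))
    return text


def restore(text, phrases):
    """Undo substitute(); the same phrase list must be kept with the archive."""
    # Undo the replacements in reverse order, so each step sees exactly the
    # text that its counterpart in substitute() produced.
    for i in reversed(range(len(phrases))):
        text = text.replace(token(i), phrases[i])
    return text


# Made-up phrases and log text, purely for illustration.
phrases = ["connection accepted from ", "heartbeat ok"]
log = "connection accepted from 10.0.0.1\nheartbeat ok\nheartbeat ok\n"

packed = substitute(log, phrases)
assert restore(packed, phrases) == log
assert len(packed) < len(log)
```

Because the phrase list is needed to reverse the mapping, it would have to be stored alongside (or inside) the compressed file.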

"My Script" does the following:
  1. Find the first date/timestamped line and convert the timestamp to a number (e.g. 2008-03-06 11:24:36.123 becomes 20080306112436123).
  2. Replace each subsequent timestamp with the difference between it and the previous stamped line (resulting in small numbers), converted into a "base 72" number string.
  3. Looking at everything on the line BEYOND the stamp, check whether the message repeats the one from the line above. If it does, replace the entire message with a hyphen (so only the first occurrence is actually written out).
  4. Finally, compress using "gzip --best", because it is 1000x faster than bzip2 (although bzip2 gives me a better ratio).
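The four steps above can be sketched roughly as follows. This is not the original script: the 72-character alphabet is a guess (the post does not say which characters were used), every line is assumed to start with a 23-character timestamp followed by one space and a non-empty message, timestamps are assumed never to go backwards, and a message that is literally "-" would be ambiguous. All function names are hypothetical.

```python
import gzip
import re
import string

# Hypothetical 72-character alphabet; the post does not specify one.
ALPHABET = (string.digits + string.ascii_lowercase
            + string.ascii_uppercase + "!#$%&*+:;=")
STAMP_LEN = 23  # len("2008-03-06 11:24:36.123")


def to_base72(n):
    """Encode a non-negative integer with the 72-character alphabet."""
    if n < 0:
        raise ValueError("timestamps must not go backwards")
    digits = [ALPHABET[0]] if n == 0 else []
    while n:
        n, rem = divmod(n, 72)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def from_base72(text):
    n = 0
    for ch in text:
        n = n * 72 + ALPHABET.index(ch)
    return n


def stamp_to_int(stamp):
    """'2008-03-06 11:24:36.123' -> 20080306112436123 (step 1)."""
    return int(re.sub(r"\D", "", stamp))


def int_to_stamp(n):
    s = str(n).zfill(17)
    return f"{s[0:4]}-{s[4:6]}-{s[6:8]} {s[8:10]}:{s[10:12]}:{s[12:14]}.{s[14:17]}"


def encode(lines):
    out, prev_num, prev_msg = [], None, None
    for line in lines:
        num, msg = stamp_to_int(line[:STAMP_LEN]), line[STAMP_LEN + 1:]
        # Step 2: the first stamp is kept whole, later ones become deltas.
        field = str(num) if prev_num is None else to_base72(num - prev_num)
        # Step 3: a repeated message collapses to a hyphen.
        out.append(f"{field} {'-' if msg == prev_msg else msg}")
        prev_num, prev_msg = num, msg
    return out


def decode(lines):
    out, prev_num, prev_msg = [], None, None
    for line in lines:
        field, _, msg = line.partition(" ")
        num = int(field) if prev_num is None else prev_num + from_base72(field)
        if msg == "-":
            msg = prev_msg
        out.append(f"{int_to_stamp(num)} {msg}")
        prev_num, prev_msg = num, msg
    return out


# Made-up sample lines.
sample = [
    "2008-03-06 11:24:36.123 connection opened",
    "2008-03-06 11:24:36.200 heartbeat ok",
    "2008-03-06 11:24:37.200 heartbeat ok",
]
encoded = encode(sample)
# Step 4: compresslevel=9 corresponds to "gzip --best".
blob = gzip.compress("\n".join(encoded).encode("utf-8"), compresslevel=9)
restored = decode(gzip.decompress(blob).decode("utf-8").split("\n"))
assert restored == sample
```

Since `decode` rebuilds the original lines exactly under the stated assumptions, the transformation stays lossless; the decoder only needs to know the alphabet and the timestamp layout.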

Any ideas?
# 2  
Old 03-06-2008
Compressing means keeping all the data as it was.

If you decrease file size a lot simply by removing stuff or by using predetermined methods for replacing redundancy, you are kind of creating Huffman encoding on your own, but without a table, so it can't be reversed unless a human knows the drill.

Why don't you just write these files off to tape and delete them from disk? That would give you the ultimate space savings. It will always take human intervention to expand and then interpret your hashed files anyway. So why not add a little more time on the restore side and save time and lots of disk on the compression side? Or get really good archiving software. :smile:
# 3  
Old 03-07-2008
You might try rewriting the logs and converting the date to a binary value yourself, then gzip the result.
However, gzip already does quite a nice job with compression...
Whenever there is a chance to optimize something, think twice about whether it is worth it before doing anything.
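As a rough illustration of the "convert date to a binary value" suggestion, here is a minimal sketch that packs the 23-character text timestamp from the first post into 8 bytes. The choice of milliseconds since 1970-01-01 and the names `pack_stamp` and `unpack_stamp` are assumptions for illustration only; the reply does not specify a layout.

```python
import struct
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"   # matches "2008-03-06 11:24:36.123"
EPOCH = datetime(1970, 1, 1)   # assumed reference point
MS = timedelta(milliseconds=1)


def pack_stamp(stamp):
    """Pack a text timestamp into 8 bytes (milliseconds since EPOCH)."""
    millis = (datetime.strptime(stamp, FMT) - EPOCH) // MS
    return struct.pack(">Q", millis)  # big-endian unsigned 64-bit


def unpack_stamp(blob):
    """Reverse pack_stamp(), restoring the millisecond-precision text form."""
    (millis,) = struct.unpack(">Q", blob)
    # strftime's %f prints microseconds (6 digits); drop the last 3.
    return (EPOCH + millis * MS).strftime(FMT)[:-3]


blob = pack_stamp("2008-03-06 11:24:36.123")
assert len(blob) == 8
assert unpack_stamp(blob) == "2008-03-06 11:24:36.123"
```

That takes each stamp from 23 bytes down to 8 before gzip sees it; whether the final compressed size improves enough to justify the extra step is exactly the "is it worth it" question above and would need measuring on real logs.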