UNIX for Advanced & Expert Users: Post 302173453 by jjinno on Thursday 6th of March 2008 05:21:09 PM
Best compression for log files?

I have been investigating a log file from one of my systems and the way I currently compress and rotate it. I am looking for something smarter than gzip and faster than bzip2 that can match or beat "my script" (which is slow as heck, but gives WAY better compression ratios).
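A rough way to put numbers on "smarter than gzip, faster than bzip2" is to time both on the same data. The sketch below uses Python's standard `gzip` and `bz2` modules as stand-ins for the command-line tools; the sample log lines and the helper names `measure` and `compare` are made up for illustration, so real logs will give different figures.

```python
import bz2
import gzip
import time


def measure(name, compress, data):
    """Return (name, compressed size / original size, seconds taken)."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(packed) / len(data), elapsed


def compare(data):
    """Run the library equivalents of gzip -9 and bzip2 -9 over the same bytes."""
    return [
        measure("gzip -9", lambda d: gzip.compress(d, compresslevel=9), data),
        measure("bzip2 -9", lambda d: bz2.compress(d, compresslevel=9), data),
    ]


if __name__ == "__main__":
    # Hypothetical sample: repetitive, timestamped lines standing in for a real log.
    sample = b"".join(
        b"2008-03-06 11:24:%02d.123 connection accepted from host %d\n" % (i % 60, i % 7)
        for i in range(20000)
    )
    for name, ratio, seconds in compare(sample):
        print(f"{name:9s} ratio={ratio:.3f} time={seconds:.3f}s")
```

Feeding it a slice of the actual log instead of the synthetic sample gives a baseline to compare any other tool or preprocessing step against.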

If 23 bytes per line are timestamp, that is roughly 33% of the file. If another 45% can be saved by doing a dictionary map/replace of the 25 most common phrases (I wrote a Python script to do my mapping), then there must be a compression program out there that can compress my logs without me needing to do this stuff prior to a gzip... right? As a bonus, I wouldn't have to undo my changes on decompress.
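The mapping script itself is not shown, so the following is only a guess at its shape: it measures how much of the file the 23-byte stamps take up, then swaps the most common messages for short tokens. It assumes every line starts with a stamp followed by one space, treats a "phrase" as the whole message after the stamp, and assumes the byte `\x01` never occurs in the log. All function names are hypothetical.

```python
from collections import Counter

STAMP_LEN = 23  # length of "2008-03-06 11:24:36.123"


def timestamp_share(lines):
    """Fraction of all bytes taken up by the leading timestamps."""
    total = sum(len(line) + 1 for line in lines)  # +1 for each newline
    return STAMP_LEN * len(lines) / total


def build_phrase_table(lines, size=25):
    """Map the `size` most common messages to two-character tokens."""
    counts = Counter(line[STAMP_LEN + 1:] for line in lines)
    # Assumption: "\x01" never appears in the log, so tokens cannot
    # be confused with real message text.
    return {
        msg: "\x01" + chr(ord("A") + i)
        for i, (msg, _) in enumerate(counts.most_common(size))
    }


def encode(lines, table):
    """Replace each message found in the table with its token."""
    out = []
    for line in lines:
        stamp, msg = line[:STAMP_LEN], line[STAMP_LEN + 1:]
        out.append(stamp + " " + table.get(msg, msg))
    return out


def decode(lines, table):
    """Undo encode() by mapping tokens back to their messages."""
    reverse = {token: msg for msg, token in table.items()}
    out = []
    for line in lines:
        stamp, msg = line[:STAMP_LEN], line[STAMP_LEN + 1:]
        out.append(stamp + " " + reverse.get(msg, msg))
    return out
```

For scale: a single 69-character line plus its newline gives `timestamp_share` a value of 23/70, about 0.33, which matches the one-third figure above.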

"My Script" does the following:
  1. Find the first date/timestamped line and convert the timestamp to a number (e.g. 2008-03-06 11:24:36.123 becomes 20080306112436123).
  2. Replace each subsequent timestamp with the difference between it and the previous stamped line (resulting in small numbers), converted into a "base 72" number string.
  3. Looking at everything on the line beyond the stamp, check whether the message repeats the one on the line above. If it does, replace the entire message with a hyphen (so only the first occurrence is actually seen).
  4. Finally, compress using "gzip --best", because it is 1000x faster than bzip2 (although bzip2 gives me a better ratio).
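The four steps above can be sketched roughly as follows. This is not the original script, which is not shown. It assumes every line starts with a 23-character stamp and one space, that stamps never go backwards, that no message is a lone hyphen, and that the "difference" is taken between the digit-only numbers from step 1. The 72-symbol alphabet is invented here because the post does not say which symbols were used.

```python
import gzip

STAMP_LEN = 23  # "2008-03-06 11:24:36.123"
# Assumed alphabet (72 symbols). It avoids "-" and " " because those
# serve as the repeat marker and the field separator.
ALPHABET = ("0123456789abcdefghijklmnopqrstuvwxyz"
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ!#$%&()*+,")


def to_base72(n):
    """Encode a non-negative integer using ALPHABET."""
    if n < 0:
        raise ValueError("timestamps must not go backwards")
    digits = ""
    while True:
        n, rem = divmod(n, 72)
        digits = ALPHABET[rem] + digits
        if n == 0:
            return digits


def from_base72(text):
    """Decode a string produced by to_base72()."""
    n = 0
    for ch in text:
        n = n * 72 + ALPHABET.index(ch)
    return n


def stamp_to_int(stamp):
    """Keep only the digits: '2008-03-06 11:24:36.123' -> 20080306112436123."""
    return int("".join(ch for ch in stamp if ch.isdigit()))


def int_to_stamp(n):
    """Inverse of stamp_to_int() for 17-digit values."""
    s = f"{n:017d}"
    return f"{s[:4]}-{s[4:6]}-{s[6:8]} {s[8:10]}:{s[10:12]}:{s[12:14]}.{s[14:]}"


def encode(lines):
    out, prev_num, prev_msg = [], None, None
    for line in lines:
        num, msg = stamp_to_int(line[:STAMP_LEN]), line[STAMP_LEN + 1:]
        # Step 1: first stamp as a plain number.
        # Step 2: later stamps as base-72 deltas from the previous line.
        head = str(num) if prev_num is None else to_base72(num - prev_num)
        # Step 3: a message identical to the previous line's becomes "-".
        out.append(head + " " + ("-" if msg == prev_msg else msg))
        prev_num, prev_msg = num, msg
    return out


def decode(lines):
    out, prev_num, prev_msg = [], None, None
    for line in lines:
        head, _, msg = line.partition(" ")
        num = int(head) if prev_num is None else prev_num + from_base72(head)
        msg = prev_msg if msg == "-" else msg
        out.append(int_to_stamp(num) + " " + msg)
        prev_num, prev_msg = num, msg
    return out


def pack(lines):
    """Step 4: gzip at the highest level, as `gzip --best` would."""
    return gzip.compress("\n".join(encode(lines)).encode("utf-8"), compresslevel=9)
```

Because each delta depends on the line before it, decoding has to walk the file from the top, which is the "undo" cost mentioned earlier in the post.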

Any ideas?
 

GZIP(1)                   BSD General Commands Manual                  GZIP(1)

NAME
     gzip -- compression/decompression tool using Lempel-Ziv coding (LZ77)

SYNOPSIS
     gzip [-cdfhlNnqrtVv] [-S suffix] file [file [...]]
     gunzip [-cfhNqrtVv] [-S suffix] file [file [...]]
     zcat [-fhV] file [file [...]]

DESCRIPTION
     The gzip program compresses and decompresses files using Lempel-Ziv
     coding (LZ77). If no files are specified, gzip will compress from
     standard input, or decompress to standard output. When in compression
     mode, each file will be replaced with another file with the suffix, set
     by the -S suffix option, added, if possible. In decompression mode, each
     file will be checked for existence, as will the file with the suffix
     added.

     If invoked as gunzip then the -d option is enabled. If invoked as zcat
     or gzcat then both the -c and -d options are enabled.

     This version of gzip is also capable of decompressing files compressed
     using compress(1) or bzip2(1).

OPTIONS
     The following options are available:

     -1, --fast
     -2 -3 -4 -5 -6 -7 -8
     -9, --best
             These options change the compression level used, with the -1
             option being the fastest, with less compression, and the -9
             option being the slowest, with optimal compression. The default
             compression level is 6.

     -c, --stdout, --to-stdout
             This option specifies that output will go to the standard output
             stream, leaving files intact.

     -d, --decompress, --uncompress
             This option selects decompression rather than compression.

     -f, --force
             This option turns on force mode. This allows files with multiple
             links, overwriting of pre-existing files, reading from or writing
             to a terminal, and when combined with the -c option, allowing
             non-compressed data to pass through unchanged.

     -h, --help
             This option prints a usage summary and exits.

     -l, --list
             This option displays information about the file's compressed and
             uncompressed size, ratio, uncompressed name. With the -v option,
             it also displays the compression method, CRC, date and time
             embedded in the file.

     -N, --name
             This option causes the stored filename in the input file to be
             used as the output file.

     -n, --no-name
             This option stops the filename and timestamp from being stored in
             the output file.

     -q, --quiet
             With this option, no warnings or errors are printed.

     -r, --recursive
             This option is used to gzip the files in a directory tree
             individually, using the fts(3) library.

     -S suffix, --suffix suffix
             This option changes the default suffix from .gz to suffix.

     -t, --test
             This option will test compressed files for integrity.

     -V, --version
             This option prints the version of the gzip program.

     -v, --verbose
             This option turns on verbose mode, which prints the compression
             ratio for each file compressed.

ENVIRONMENT
     If the environment variable GZIP is set, it is parsed as a white-space
     separated list of options handled before any options on the command
     line. Options on the command line will override anything in GZIP.

SEE ALSO
     bzip2(1), compress(1), xz(1), fts(3), zlib(3)

HISTORY
     The gzip program was originally written by Jean-loup Gailly, licensed
     under the GNU Public Licence. Matthew R. Green wrote a simple front end
     for NetBSD 1.3 distribution media, based on the freely re-distributable
     zlib library. It was enhanced to be mostly feature-compatible with the
     original GNU gzip program for NetBSD 2.0.

     This manual documents NetBSD gzip version 20040427.

AUTHORS
     This implementation of gzip was written by Matthew R. Green
     <mrg@eterna.com.au>.

BSD                              June 18, 2011                             BSD