How to make awk command faster for large amount of data?
Post 303024250 by Don Cragun on Thursday 4th of October 2018 02:06:13 PM
Expanding on what Corona688 said, your two commands:
Code:
zcat file1.gz | head -n 1
zcat file1.gz | tail -n 1

decompress the file twice (although the first decompression may stop early once head has read its one line), and if you find that the file contains data you need, you'll then decompress it a third time for your awk script to process.
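As a rough sketch of how the two commands could be folded into one decompression, a single awk pass can print the first record and remember the last. The function name first_last is made up for illustration, and gzip -dc is used here on the assumption that the files are gzip-compressed (on some systems zcat only handles .Z files). This still reads the whole file once; it only avoids reading it twice.

```shell
# first_last FILE: print the first and the last line of a gzip-compressed
# file using one decompression pass (hypothetical helper, not from the thread).
first_last() {
    gzip -dc "$1" | awk '
        NR == 1 { print }          # first record
        { last = $0 }              # remember the most recent record
        END { if (NR) print last } # last record (nothing for an empty file)
    '
}
```

For a file with a single record the same line is printed twice, which matches what the separate head and tail commands would show.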

I would strongly suggest creating a separate text file that contains the timestamp of the first record in each compressed file and the name of that compressed file. (And add a new entry to the end of that file each time you create a new compressed log file.) Then you can look at that (uncompressed) text file to quickly determine which compressed file(s) you need to uncompress and feed to your awk script to get the records you want for any particular timestamp range.
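A minimal sketch of such an index follows. Everything in it is an assumption made for illustration: the function names, the one-line-per-file layout ("first-timestamp file-name"), that the timestamp is the first whitespace-separated field of each record, that timestamps sort correctly as plain strings (for example 2018-10-04T14:06:13), that file names contain no whitespace, and that the files are gzip-compressed.

```shell
# Hypothetical index layout: one line per compressed file,
#   <timestamp-of-first-record> <file-name>
# appended in the order the log files are created.

# add_index_entry INDEX FILE: append the first timestamp of FILE to INDEX.
add_index_entry() {
    ts=$(gzip -dc "$2" | head -n 1 | awk '{ print $1 }')
    printf '%s %s\n' "$ts" "$2" >> "$1"
}

# files_for_range INDEX LO HI: list the files that may hold records
# with timestamps between LO and HI (inclusive).
files_for_range() {
    awk -v lo="$2" -v hi="$3" '
        # The previous file spans from its first timestamp up to the
        # first timestamp of the current file.
        NR > 1 && pts <= hi && $1 >= lo { print pname }
        { pts = $1; pname = $2 }
        # The end of the newest file is unknown, so keep it if it starts in time.
        END { if (NR && pts <= hi) print pname }
    ' "$1"
}
```

Only the files printed by files_for_range then need to be decompressed and piped to the awk script; the others are never opened.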
compress(1)                      User Commands                      compress(1)

NAME
     compress, uncompress, zcat - compress, uncompress files or display
     expanded files

SYNOPSIS
     compress [-fv] [-b bits] [file...]
     compress [-cfv] [-b bits] [file]
     uncompress [-cfv] [file...]
     zcat [file...]

DESCRIPTION
  compress
     The compress utility will attempt to reduce the size of the named files
     by using adaptive Lempel-Ziv coding. Except when the output is to the
     standard output, each file will be replaced by one with the extension
     .Z, while keeping the same ownership modes, change times and
     modification times. If appending the .Z to the file pathname would make
     the pathname exceed 1023 bytes, the command will fail. If no files are
     specified, the standard input will be compressed to the standard output.

     The amount of compression obtained depends on the size of the input, the
     number of bits per code, and the distribution of common substrings.
     Typically, text such as source code or English is reduced by 50-60%.
     Compression is generally much better than that achieved by Huffman
     coding (as used in pack(1)) and it takes less time to compute.

     The bits parameter specified during compression is encoded within the
     compressed file, along with a magic number to ensure that neither
     decompression of random data nor recompression of compressed data is
     subsequently allowed.

  uncompress
     The uncompress utility will restore files to their original state after
     they have been compressed using the compress utility. If no files are
     specified, the standard input will be uncompressed to the standard
     output.

     This utility supports the uncompressing of any files produced by
     compress. For files produced by compress on other systems, uncompress
     supports 9- to 16-bit compression (see -b).

  zcat
     The zcat utility will write to standard output the uncompressed form of
     files that have been compressed using compress. It is the equivalent of
     uncompress -c. Input files are not affected.

OPTIONS
     The following options are supported:

     -c        Writes to the standard output; no files are changed and no .Z
               files are created. The behavior of zcat is identical to that
               of `uncompress -c'.

     -f        When compressing, forces compression of file, even if it does
               not actually reduce the size of the file, or if the
               corresponding file.Z file already exists. If the -f option is
               not given, and the process is not running in the background,
               prompts to verify whether an existing file.Z file should be
               overwritten.

               When uncompressing, does not prompt for overwriting files. If
               the -f option is not given, and the process is not running in
               the background, prompts to verify whether an existing file
               should be overwritten. If the standard input is not a terminal
               and -f is not given, writes a diagnostic message to standard
               error and exits with a status greater than 0.

     -v        Verbose. Writes to standard error messages concerning the
               percentage reduction or expansion of each file.

     -b bits   Sets the upper limit (in bits) for common substring codes.
               bits must be between 9 and 16 (16 is the default). Lowering
               the number of bits will result in larger, less compressed
               files.

OPERANDS
     The following operand is supported:

     file      A path name of a file to be compressed by compress,
               uncompressed by uncompress, or whose uncompressed form is
               written to standard out by zcat. If file is -, or if no file
               is specified, the standard input will be used.

USAGE
     See largefile(5) for the description of the behavior of compress,
     uncompress, and zcat when encountering files greater than or equal to
     2 Gbyte (2**31 bytes).

ENVIRONMENT VARIABLES
     See environ(5) for descriptions of the following environment variables
     that affect the execution of compress, uncompress, and zcat: LANG,
     LC_ALL, LC_CTYPE, LC_MESSAGES, and NLSPATH.

EXIT STATUS
     The following error values are returned:

     0         Successful completion.
     1         An error occurred.
     2         One or more files were not compressed because they would have
               increased in size (and the -f option was not specified).
     >2        An error occurred.

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

     +-----------------------------+-----------------------------+
     | ATTRIBUTE TYPE              | ATTRIBUTE VALUE             |
     +-----------------------------+-----------------------------+
     | Availability                | SUNWesu                     |
     +-----------------------------+-----------------------------+
     | CSI                         | Enabled                     |
     +-----------------------------+-----------------------------+
     | Interface Stability         | Standard                    |
     +-----------------------------+-----------------------------+

SEE ALSO
     ln(1), pack(1), attributes(5), environ(5), largefile(5), standards(5)

DIAGNOSTICS
     Usage: compress [-fvc] [-b maxbits] [file... ]
          Invalid options were specified on the command line.

     Missing maxbits
          Maxbits must follow -b, or invalid maxbits, not a numeric value.

     file: not in compressed format
          The file specified to uncompress has not been compressed.

     file: compressed with xxbits, can only handle yybits
          file was compressed by a program that could deal with more bits
          than the compress code on this machine. Recompress the file with
          smaller bits.

     file: already has .Z suffix -- no change
          The file is assumed to be already compressed. Rename the file and
          try again.

     file: already exists; do you wish to overwrite (y or n)?
          Respond y if you want the output file to be replaced; n if not.

     uncompress: corrupt input
          A SIGSEGV violation was detected, which usually means that the
          input file is corrupted.

     Compression: xx.xx%
          Percentage of the input saved by compression. (Relevant only for
          -v.)

     - - not a regular file: unchanged
          When the input file is not a regular file (such as a directory),
          it is left unaltered.

     - - has xx other links: unchanged
          The input file has links; it is left unchanged. See ln(1) for more
          information.

     - - file unchanged
          No savings are achieved by compression. The input remains
          uncompressed.

     filename too long to tack on .Z
          The path name is too long to append the .Z suffix.

NOTES
     Although compressed files are compatible between machines with large
     memory, -b 12 should be used for file transfer to architectures with a
     small process data space (64KB or less).

     compress should be more flexible about the existence of the .Z suffix.

SunOS 5.10                        9 Sep 1999                        compress(1)