Shell Programming and Scripting: How to make awk command faster for large amount of data?
Post 303024254 by Don Cragun, Thursday 4th of October 2018, 02:38:35 PM
The comparison is taking a trivial amount of time compared to the time required to:
  1. read your compressed data,
  2. uncompress your data,
  3. write your uncompressed data into a pipe, and
  4. read your uncompressed data from the pipe.
Anything you can do to avoid those four steps for data that cannot match your desired time range will yield huge reductions in run time.

Increasing the number of times you perform those four steps will increase your run times, not decrease them.
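
As a rough sketch of that idea: if the archive names carry a timestamp, test the name first and only pay for the uncompress and the pipe when the file can overlap the requested window. Everything here is assumed for illustration (hourly archives named data_YYYYMMDDHH.Z, a window given as two YYYYMMDDHH strings, a comma-separated timestamp in field 1), since the thread does not show the real file layout.

Code:
  #!/bin/ksh
  # usage: ./select_range.sh 2018100400 2018100413   (start and end, YYYYMMDDHH)
  start=$1 end=$2
  for f in data_??????????.Z; do
      [ -f "$f" ] || continue            # glob matched nothing
      stamp=${f#data_}                   # strip the prefix ...
      stamp=${stamp%.Z}                  # ... and the suffix, leaving YYYYMMDDHH
      # files wholly outside the window are never read, uncompressed, or piped
      [ "$stamp" -lt "$start" ] && continue
      [ "$stamp" -gt "$end" ] && continue
      uncompress -c "$f"
  done | awk -F, -v s="$start" -v e="$end" '$1 >= s && $1 <= e'

Only the archives whose names fall inside the window get decompressed; everything else is rejected by a name test that costs essentially nothing.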
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk help to make my work faster

Hi everyone, I have a file in which I have line numbers. The file name is file1.txt: aa bb cc "12" qw xx yy zz "23" we bb qw we "123249" jh. Here 12, 23, and 123249 are the line numbers. Now, according to these line numbers, we have to print lines from another file named... (11 Replies)
Discussion started by: kumar_amit
11 Replies
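
A common one-pass shape for a question like this, with a hypothetical second file name since it is truncated above: collect the quoted numbers from file1.txt, then print just those record numbers from the data file.

Code:
  # print from datafile.txt only the line numbers that appear quoted in file1.txt
  awk -F'"' '
      NR == FNR { want[$2]; next }   # first file: remember each quoted number
      FNR in want                    # second file: print records whose number was remembered
  ' file1.txt datafile.txt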

2. Programming

Read/Write a fairly large amount of data to a file as fast as possible

Hi, I'm trying to figure out the best solution to the following problem, and I'm not yet as experienced as you. :-) Basically I have to read a fairly large file, composed of "messages", in order to display all of them through a user interface (made with Qt). The messages that... (3 Replies)
Discussion started by: emitrax
3 Replies

3. AIX

amount of memory allocated to large page

We just set up a system to use large pages. I want to know if there is a command to see how much of the memory is being used for large pages. For example, if we have a system with 8GB of RAM assigned and it has been set to use 4GB for large pages, is there a command to show that 4GB of the 8GB is... (1 Reply)
Discussion started by: daveisme
1 Replies

4. Shell Programming and Scripting

How to tar large amount of files?

Hello, I have the following files: VOICE_hhhh, SUBSCR_llll, DEL_kkkk. Consider that there are 1000 VOICE files + 1000 SUBSCR files + 1000 DEL files. When I try to tar these files using tar -cvf backup.tar VOICE* SUBSCR* DEL*, I get the error: ksh: /usr/bin/tar: arg list too long. How can I... (9 Replies)
Discussion started by: chriss_58
9 Replies
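
The usual way around "arg list too long" is to keep the 3000 names off the command line altogether, for example by letting tar read them from a list file. Note that -T (and find's -maxdepth) are GNU extensions; with a strictly POSIX tar you would instead append in batches with -r.

Code:
  # find writes the names to a file, so the shell never expands them as arguments
  find . -maxdepth 1 \( -name 'VOICE*' -o -name 'SUBSCR*' -o -name 'DEL*' \) -print > filelist
  tar -cvf backup.tar -T filelist    # GNU tar reads the member names from filelist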

5. Emergency UNIX and Linux Support

Help to make awk script more efficient for large files

Hello, the error is: awk: Internal software error in the tostring function on TS1101?05044400?.0085498227?0?.0011041461?.0034752266?.00397045?0?0?0?0?0?0?11/02/10?09/23/10???10?no??0??no?sct_det3_10_20110516_143936.txt. What it is: it is a UNIX shell script that contains an awk program as well as... (4 Replies)
Discussion started by: script_op2a
4 Replies

6. Shell Programming and Scripting

Running rename command on large files and make it faster

Hi All, I have some 80,000 files in a directory which I need to rename. Below is the command I am currently running, and it seems to be taking forever; this command is too slow. Is there any way to speed it up? I have GNU Parallel installed on my... (6 Replies)
Discussion started by: shoaibjameel123
6 Replies
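
Since the rename command itself is truncated above, the sketch below only shows the batching pattern with a made-up rule (strip a .tmp suffix). The point is that printf is a shell builtin, so the 80,000 names never hit the exec argument limit, and GNU Parallel runs the renames several at a time.

Code:
  # hypothetical rule: rename every NAME.tmp to NAME, eight renames in flight at once
  printf '%s\n' *.tmp | parallel -j 8 'mv {} {.}'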

7. Shell Programming and Scripting

Faster way to use this awk command

awk "/May 23, 2012 /,0" /var/tmp/datafile the above command pulls out information in the datafile. the information it pulls is from the date specified to the end of the file. now, how can i make this faster if the datafile is huge? even if it wasn't huge, i feel there's a better/faster way to... (8 Replies)
Discussion started by: SkySmart
8 Replies
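
One alternative, hedged because whether it wins depends on where the date first appears in the file: let grep (which can stop at the first hit with -m, a GNU/BSD extension) find the starting line number, then let tail stream from that line to the end.

Code:
  # find the first line containing the date, then print from there to end of file
  start=$(grep -n -m 1 'May 23, 2012 ' /var/tmp/datafile | cut -d: -f1)
  [ -n "$start" ] && tail -n +"$start" /var/tmp/datafile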

8. Shell Programming and Scripting

awk changes to make it faster

I have a script like the one below, which picks a number from one file, searches for it in another file, and prints the output. But it is very slow when run on a huge file. Can we modify it with awk? #! /bin/ksh while read line1 do echo "$line1" a=`echo $line1` if then echo "$num" cat file1|nawk... (6 Replies)
Discussion started by: mirwasim
6 Replies
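
The usual fix for a loop like this is to stop launching one nawk per input line and make a single pass instead. The file names, and the assumption that the numbers are matched against field 1, are guesses, since the quoted script is truncated.

Code:
  # one nawk run: load every number from the list, then scan file1 once
  nawk '
      NR == FNR { want[$1]; next }   # first file: the list of numbers
      $1 in want                     # second file: print lines whose first field is listed
  ' numbers.txt file1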

9. Shell Programming and Scripting

Perl : Large amount of data put into an array

This basic code works. I have a very long list, almost 10000 lines, that I am building into the array. Each line has either 2 or 3 fields, as shown in the code snippet. The array elements are static (for a few reasons that are out of the scope of this question); the list has to be "built in". It... (5 Replies)
Discussion started by: sumguy
5 Replies

10. Shell Programming and Scripting

How to make awk command faster?

I have the command below, which reads a large file and takes 3 hours to run. Can something be done to make it faster? awk -F ',' '{OFS=","}{ if ($13 == "9999") print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12 }' ${NLAP_TEMP}/hist1.out|sort -T ${NLAP_TEMP} |uniq>... (13 Replies)
Discussion started by: Peu Mukherjee
13 Replies
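
A few changes usually shave time off a pipeline like this: keep the single awk pass, fold uniq into sort -u (one fewer process and pipe), and force the C locale so sort compares bytes instead of collating. The output file name is hypothetical because it is truncated above.

Code:
  awk -F',' -v OFS=',' '$13 == "9999" { print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12 }' \
      "${NLAP_TEMP}/hist1.out" | LC_ALL=C sort -T "${NLAP_TEMP}" -u > "${NLAP_TEMP}/hist1_filtered.out"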
COMPRESS(1)                 BSD General Commands Manual                 COMPRESS(1)

NAME
     compress, uncompress -- compress and expand data

SYNOPSIS
     compress [-fv] [-b bits] [file ...]
     compress -c [-b bits] [file ...]
     uncompress [-fv] [file ...]
     uncompress -c [file ...]

DESCRIPTION
     The compress utility reduces the size of files using adaptive Lempel-Ziv coding.
     Each file is renamed to the same name plus the extension .Z. A file argument with
     a .Z extension will be ignored except it will cause an error exit after other
     arguments are processed. If compression would not reduce the size of a file, the
     file is ignored.

     The uncompress utility restores compressed files to their original form, renaming
     the files by deleting the .Z extensions. A file specification need not include the
     file's .Z extension. If a file's name in its file system does not have a .Z
     extension, it will not be uncompressed and it will cause an error exit after other
     arguments are processed.

     If renaming the files would cause files to be overwritten and the standard input
     device is a terminal, the user is prompted (on the standard error output) for
     confirmation. If prompting is not possible or confirmation is not received, the
     files are not overwritten.

     As many of the modification time, access time, file flags, file mode, user ID, and
     group ID as allowed by permissions are retained in the new file.

     If no files are specified or a file argument is a single dash ('-'), the standard
     input is compressed or uncompressed to the standard output. If either the input
     and output files are not regular files, the checks for reduction in size and file
     overwriting are not performed, the input file is not removed, and the attributes
     of the input file are not retained in the output file.

     The options are as follows:

     -b bits
             The code size (see below) is limited to bits, which must be in the range
             9..16. The default is 16.

     -c      Compressed or uncompressed output is written to the standard output. No
             files are modified. The -v option is ignored. Compression is attempted
             even if the results will be larger than the original.

     -f      Files are overwritten without prompting for confirmation. Also, for
             compress, files are compressed even if they are not actually reduced in
             size.

     -v      Print the percentage reduction of each file. Ignored by uncompress or if
             the -c option is also used.

     The compress utility uses a modified Lempel-Ziv algorithm. Common substrings in
     the file are first replaced by 9-bit codes 257 and up. When code 512 is reached,
     the algorithm switches to 10-bit codes and continues to use more bits until the
     limit specified by the -b option or its default is reached. After the limit is
     reached, compress periodically checks the compression ratio. If it is increasing,
     compress continues to use the existing code dictionary. However, if the
     compression ratio decreases, compress discards the table of substrings and
     rebuilds it from scratch. This allows the algorithm to adapt to the next "block"
     of the file.

     The -b option is unavailable for uncompress since the bits parameter specified
     during compression is encoded within the output, along with a magic number to
     ensure that neither decompression of random data nor recompression of compressed
     data is attempted.

     The amount of compression obtained depends on the size of the input, the number of
     bits per code, and the distribution of common substrings. Typically, text such as
     source code or English is reduced by 50-60%. Compression is generally much better
     than that achieved by Huffman coding (as used in the historical command pack), or
     adaptive Huffman coding (as used in the historical command compact), and takes
     less time to compute.

EXIT STATUS
     The compress and uncompress utilities exit 0 on success, and >0 if an error
     occurs. The compress utility exits 2 if attempting to compress a file would not
     reduce its size and the -f option was not specified and if no other error occurs.

SEE ALSO
     gunzip(1), gzexe(1), gzip(1), zcat(1), zmore(1), znew(1)

     Welch, Terry A., "A Technique for High Performance Data Compression", IEEE
     Computer, 17:6, pp. 8-19, June, 1984.

STANDARDS
     The compress and uncompress utilities conform to IEEE Std 1003.1-2001 ("POSIX.1").

HISTORY
     The compress command appeared in 4.3BSD.

BUGS
     Some of these might be considered otherwise-undocumented features.

     compress: If the utility does not compress a file because doing so would not
     reduce its size, and a file of the same name except with an .Z extension exists,
     the named file is not really ignored as stated above; it causes a prompt to
     confirm the overwriting of the file with the extension. If the operation is
     confirmed, that file is deleted.

     uncompress: If an empty file is compressed (using -f), the resulting .Z file is
     also empty. That seems right, but if uncompress is then used on that file, an
     error will occur.

     Both utilities: If a '-' argument is used and the utility prompts the user, the
     standard input is taken as the user's reply to the prompt.

     Both utilities: If the specified file does not exist, but a similarly-named one
     with (for compress) or without (for uncompress) a .Z extension does exist, the
     utility will waste the user's time by not immediately emitting an error message
     about the missing file and continuing. Instead, it first asks for confirmation to
     overwrite the existing file and then does not overwrite it.

BSD                                May 17, 2002                                BSD
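
For the thread above, the relevant behaviour on this page is the -c flag: it sends the uncompressed stream to standard output and leaves the .Z file untouched, which is exactly what gets piped into awk. A small usage sketch with a made-up file name:

Code:
  compress -v data_2018100413            # creates data_2018100413.Z and prints the size reduction
  # -c decompresses to stdout without modifying any files, so the stream can feed awk directly
  uncompress -c data_2018100413.Z | awk -F, 'NR <= 5'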