How to make awk command faster for large amount of data? Post 303024191 by RudiC, 10-02-2018
Corona688's reasoning is absolutely correct. I created a large file and timed several awk selection methods:




Code:
# 1: one compound condition inside the action
time awk '{if ($9 == "(200)" && $3 > "[13/Jul/2018:17:00:00" && $3 < "[13/Jul/2018:21:00:00") print}' file > /dev/null

real    0m0,794s
user    0m0,725s
sys     0m0,068s

# 2: cheap status test as the pattern, time window in the action
time awk '$9 == "(200)" {if ($3 > "[13/Jul/2018:17:00:00" && $3 < "[13/Jul/2018:21:00:00") print}' file > /dev/null

real    0m0,787s
user    0m0,733s
sys     0m0,052s

# 3: compare only the time-of-day part, extracted with substr
time awk '$9 == "(200)" {X = substr ($3, 14); if (X > "17:00:00" && X < "21:00:00") print}' file > /dev/null

real    0m0,806s
user    0m0,732s
sys     0m0,072s

# 4: skip lines before the window, exit after it (assumes the log is time-sorted)
time awk '$9 == "(200)" {X = substr ($3, 14); if (X < "17:00:00") next; if (X > "21:00:00") exit; print}' file > /dev/null

real    0m0,775s
user    0m0,676s
sys     0m0,093s
# 5: regex match on the hour; the alternation needs grouping and anchoring
#    so that "1[7-9]" can't match anywhere else in the string
time awk '$9 == "(200)" {X = substr ($3, 14); if (X ~ /^(1[7-9]|2[01]):[0-5][0-9]:[0-5][0-9]$/) print}' file > /dev/null

real    0m0,827s
user    0m0,727s
sys     0m0,078s
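
For reference, a file of the right shape can be mocked up with awk itself. This is only a sketch, not the generator actually used for the timings above: it assumes log lines with the timestamp in $3 and the parenthesised status code in $9, matching the tests.

Code:
# Hypothetical generator: about one million log-like lines,
# timestamp in $3, "(200)" (or another status) in $9
awk 'BEGIN {
    srand()
    for (i = 0; i < 1000000; i++) {
        h = int(24 * i / 1000000)           # hour rises monotonically, so the early-exit variant stays meaningful
        m = int(rand() * 60); s = int(rand() * 60)
        st = (rand() < 0.7) ? 200 : 404     # mix of statuses so the $9 test does real work
        printf "10.0.0.1 - [13/Jul/2018:%02d:%02d:%02d +0200] \"GET /index.html HTTP/1.1\" - (%d) 512\n", h, m, s, st
    }
}' > file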

Across all five variants there is very little variation in execution time, so micro-optimising the awk program itself gains next to nothing. You'd better focus on the data supply / file access.
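
A quick way to confirm that is to time a pass that reads the file but selects nothing: if the do-nothing pass costs nearly as much as the full selection, the bottleneck is reading the data, not awk. The LC_ALL=C run below is a common, but not guaranteed, speed-up, since it sidesteps locale-aware string handling; measure it on your own system.

Code:
# Baseline: read every line, print nothing; the selection can hardly beat this
time awk '0' file

# Same selection as test 3, run under the C locale
time LC_ALL=C awk '$9 == "(200)" {X = substr($3, 14); if (X > "17:00:00" && X < "21:00:00") print}' file > /dev/null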

 
