awk slowing down -- why?


 
# 1  
Old 01-08-2014
show your code then
# 2  
Old 01-08-2014
Quote:
Originally Posted by kurumi
show your code then
As requested! Attached as .txt.
# 3  
Old 01-08-2014
As of this posting, my attachment is still pending approval. So, here's the script:

Code:
#! /bin/awk -f

BEGIN   {
        OFS = ","
        count = 1
        prevtime = 0
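        ## Load the subant list; the ID is the first comma-separated field.
        ## Compare getline's result against 0 so an error (-1) cannot loop forever.
        while ( ( "cat /root/scripts/billing/subant_list" | getline ) > 0 )
                {
                split($0, sublist, ",")
                subants[count] = sublist[1]
                count ++
                }
        close("cat /root/scripts/billing/subant_list")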
        }

## Let's chew on something now...
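## Progress report every 10000 records: record number, size of time_count,
## and seconds elapsed since the previous report.
NR % 10000 == 0 {
        print NR "\t" length(time_count) "\t" (systime() - prevtime)
        prevtime = systime()
        }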

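## start_time and end_time are not set anywhere in this script; they are
## presumably supplied on the command line (for example with -v).
$1 >= start_time && $1 < end_time       {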
linecount++

## Hourly operations count

hourlyOperationsCount[substr($2,2,14)]++

## Generate a per-subant count, including:
##      Count of HTTP status codes per subant
##      Count of HTTP tx types per subant

for ( i = 1 ; i <= length(subants) ; i++)
        {
        if ( $14 == subants[i] )
                {
                ##  Subtenant count:
                found["TotalTXCount," subants[i]]++

                ##  Count of HTTP status codes per subant:
                found["StatusCount," subants[i] "," $5] ++

                ##  Count of HTTP tx types per subant
                httptype = substr($9, 2, length($9) - 1)
                found["TXTypeCount," subants[i] "," httptype] ++

                ##  Cumulative size and time of tx by subant_list and HTTP tx type
                indexInSizeByType = "InSizeByType," subants[i] "," httptype
                found[indexInSizeByType] += $17

                indexOutSizeByType = "OutSizeByType," subants[i] "," httptype
                found[indexOutSizeByType] += $18

                indexTimeByType = "TimeByType," subants[i] "," httptype
                found[indexTimeByType] += $19
                }
        }
}


## Ok, these next two sections warrant a little 'splainin.  We track:
##      1)  Concurrent connections -- connections that are ongoing during
##              a given second, whether or not they were initiated in that
##              particular second.
##      2)  initiated connections -- connections that were started in a
##              given second.

{

## First, track concurrent connections.  This one doesn't have the time
##      filter that everything else has so that connections already in
##      progress when the time window starts are counted.

for ( i = 1 ; i <= length(subants) ; i++)
        {
        if ( $14 == subants[i] )
                {
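                stime = $1
                ## Transaction time in $19 (apparently microseconds) rounded
                ## to whole seconds; computed once instead of on every pass.
                duration = int(($19 + 500000) / 1000000)
                if ( duration >= 1 )
                        {
                        for ( j = stime ; j <= stime + duration ; j ++ )
                                {
                                time_count[subants[i] "," j] ++
                                }
                        }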
#               for (i in time_count) {print i "\t" time_count[i]}}
                }
        }
}

$1 >= start_time && $1 < end_time       {
## Finally, we track initiated connections.

for ( i = 1 ; i <= length(subants) ; i++)
        {
        if ( $14 == subants[i] )
                {
                txInitiated[subants[i] "," $1] ++
                }
        }
}


END     {
        print linecount
        for ( i in found )
                {
                print i "," found[i]
                }

        for ( i in time_count )
                {
                split(i,st,",")
                subant = st[1]
                subant_time = st[2]
                if ( time_count[i] > max_concurrency[subant] && subant_time >= start_time && subant_time < end_time )
                        {
                        max_concurrency[subant] = time_count[i]
                        max_concurr_time[subant] = subant_time
                        }
                }

        for ( i in txInitiated )
                {
                split(i,st,",")
                subant = st[1]
                subant_time = st[2]
                if (txInitiated[i] > max_initiated[subant])
                        {
                        max_initiated[subant] = txInitiated[i]
                        max_init_time[subant] = subant_time
                        }
                }

        for ( i in max_concurrency )
                {
                print "PeakCncrntConns," i "," max_concurrency[i] ",@" max_concurr_time[i]
                }

        for ( i in max_initiated )
                {
                print "PeakInitConns," i "," max_initiated[i] ",@" max_init_time[i]
                }

        for ( i in hourlyOperationsCount )
                {
                print "hourlyOperationsCount", i, hourlyOperationsCount[i]
                }

        }
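
The three "for ( i = 1 ; i <= length(subants) ; i++)" loops in the script above scan the whole subant list for every input record. Whether that explains the slowdown is not established in this thread, but it is cheap to rule out. A minimal sketch, assuming the subant ID is stored as an array index so that membership becomes a single "in" test. The array name is_subant, the temporary files, and the sample data are all made up for illustration:

```shell
# Sketch only: the subant list and log lines below are invented sample data.
tmpdir=$(mktemp -d)
printf 'acme,extra\nglobex,extra\n' > "$tmpdir/subant_list"
# Field 14 holds the subant, as in the script above.
printf '100 a b c 200 e f g "GET i j k l acme\n' > "$tmpdir/log"
printf '101 a b c 404 e f g "PUT i j k l other\n' >> "$tmpdir/log"
printf '102 a b c 200 e f g "GET i j k l acme\n' >> "$tmpdir/log"

lookup_demo=$(awk -v list="$tmpdir/subant_list" '
BEGIN {
        # Store each subant ID as an array index (hypothetical name is_subant).
        while ( (getline line < list) > 0 ) {
                split(line, sublist, ",")
                is_subant[sublist[1]] = 1
        }
        close(list)
}
# "in" tests the index directly: no loop, and no element is created.
$14 in is_subant {
        found["TotalTXCount," $14]++
}
END {
        for ( k in found ) print k "," found[k]
}' "$tmpdir/log")

echo "$lookup_demo"     # prints: TotalTXCount,acme,2
rm -rf "$tmpdir"
```

The same test could replace the loop-plus-if pair in each of the three blocks, with $14 used wherever subants[i] appears in the loop body.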

# 4  
Old 01-08-2014
Attachment approved.
You may try to run your script like this:
Code:
export LC_ALL=C
./your_awk_script args...

and see if the elapsed time decreases.
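
To compare elapsed times, something like the following sketch may help. It is only an illustration: run_timed is a hypothetical helper, and the generated sample file and the one-line awk program stand in for the real data and script.

```shell
# Sketch only: generate a throwaway input file (made-up data).
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20000; i++) print i, "GET", "subant" i % 7 }' > "$sample"

# Hypothetical helper: $1 is a label, the remaining arguments are the command.
run_timed() {
        label=$1; shift
        start=$(date +%s)
        result=$("$@")
        end=$(date +%s)
        echo "$label: result=$result elapsed=$((end - start))s"
}

# Same awk program, once in the current locale and once with LC_ALL=C.
run_timed "current locale" awk '{ n[$3]++ } END { c = 0; for (k in n) c++; print c }' "$sample"
run_timed "LC_ALL=C" env LC_ALL=C awk '{ n[$3]++ } END { c = 0; for (k in n) c++; print c }' "$sample"
rm -f "$sample"
```

Both runs should report the same result; only the elapsed figure is of interest. On a file this small the difference may well be zero seconds, so the real log is the meaningful test.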

If you post sample data files, the analysis will be easier.

Last edited by radoulov; 01-08-2014 at 06:36 PM..
# 5  
Old 01-08-2014
Thanks for the reply. I'm trying the variable export to see how the completion time compares.
# 6  
Old 01-09-2014
Answering your previous question (now deleted): yes, you can limit the scope
of LC_ALL to a single command by putting the assignment in front of it, like this:
Code:
LC_ALL=C <your_script> <args> ...
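
A small sketch of what "limiting the scope" means here, with sh -c standing in for the script (an assumption for illustration only):

```shell
# Start from a known state for the demonstration.
unset LC_ALL
# The assignment prefix goes into the environment of this one command only.
inside=$(LC_ALL=C sh -c 'echo "${LC_ALL:-unset}"')
# The calling shell's own LC_ALL is untouched afterwards.
outside=${LC_ALL:-unset}
echo "inside=$inside outside=$outside"   # prints: inside=C outside=unset
```

Unlike "export LC_ALL=C", nothing needs to be undone afterwards.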
