Making script run faster


 
# 8  
Old 05-12-2012
If you want to put it in a shell script, you have to use single quotes:
Code:
#!/bin/bash

awk 'BEGIN { FS="\n"; RS="}\n"; ORS="}\n\n"; }

{
        for(X in D) delete D[X];

        for(N=2; N<=NF; N++)
        {
                split($N, A, "=");
                D[A[1]]=A[2];
        }
}
D["service_description"]==pat' pat="$2" "$1"

Then you can call it with
Code:
./script.sh status.log MEMORY_CHECK

The last line,
Code:
D["service_description"]==pat

filters the chunk and prints only the chunk that contains
Code:
service_description=<pattern>

So you could make a compound statement, with logical AND (&&) or OR (||) operators:
Code:
...
D["service_description"]==pat && D["host"]==h ' h="$3" pat="$2" "$1"

You would then invoke the script like this:
Code:
./script.sh status.log MEMORY_CHECK <hostname>

It will still print the whole chunk though. If you just want to see the line that contains "host", pipe the output (selected chunk) to grep.
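As a minimal sketch of that last step: the chunk below is made up for illustration and stands in for whatever the awk script prints, and the assumption is that the line of interest starts with "host".

```shell
# Hypothetical chunk, standing in for the output of the awk script.
chunk='servicestatus {
host_name=serverA
service_description=MEMORY_CHECK
plugin_output=OK
}'

# Keep only the line that starts with "host".
host_line=$(printf '%s\n' "$chunk" | grep '^host')
echo "$host_line"
```

With the real script, the same filter would simply be appended to the call, e.g. ./script.sh status.log MEMORY_CHECK <hostname> | grep '^host'.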

But admittedly, the cranky old guy is right that you should be as specific as possible in what you are trying to achieve, in order to get fast and accurate help. It would save us time and enable us to help more people.

Last edited by mirni; 05-12-2012 at 04:37 AM..
# 9  
Old 05-12-2012
hey guys, thanks for your help. i thought i gave as much information as would be needed in my previous posts. but i apologize if that wasn't enough. i didn't want to overindulge you with too much explanation and cause probable confusion.

but here is what i'm trying to do.

the chunk that i posted is one of many chunks that exist in a giant file covering a number of servers.

so, i have about 2500 servers being monitored, and EACH of those servers has chunks in the status.log file. chunks is plural because, for each server, there can be more than one chunk. as a matter of fact, there are usually at least 4 chunks for each server. each of those chunks represents a service_description (e.g., CPU_CHECK, MEMORY_CHECK, DISK_CHECK etc). so you can imagine how huge that file is.

now, my task is to grab information on ANY host out of the 2500 from the status.log file. the problem is, when a file is that big, with information on 2500 servers stored in it, processing the file becomes sort of a nightmare, and the response is very slow.

in the script below, i attempted to do precisely what i need, i hope you guys can offer suggestions:

Code:
#!/bin/bash

cat Servers.txt | while read server

do

OUTPUT=$(

awk 'BEGIN { FS="\n"; RS="}\n"; ORS="}\n\n"; }

{
        for(X in D) delete D[X];

        for(N=2; N<=NF; N++)
        {
                split($N, A, "=");
                D[A[1]]=A[2];
        }
}

D["service_description"]==pat && D["host_name"]==h ' h="$server" pat="$2" "$1" | egrep "^plugin_output"

)

echo "$server  ==  $OUTPUT"

done


so i call this script like this:
Code:
./script status.log MEMORY_CHECK

so in essence, what i'm saying through this script is that, for EACH server in the list provided in Servers.txt, I want you to pull out the chunk for "service_description" titled MEMORY_CHECK. and from the output, i want you to only grab out information about plugin_output.

i ran this just now on a list of about 200 server and while it is working, it is taking forever to complete.

Last edited by SkySmart; 05-12-2012 at 10:56 AM..
# 10  
Old 05-12-2012
You are processing the whole logfile once for each server, which is what slows you down.
Try this:

Code:
#!/bin/bash

awk 'BEGIN {
  while((getline < "server.list")>0)
     S[$0]

  FS="\n"; RS="}\n"; ORS="}\n";
}

/service_description=MEMORY_CHECK/ {

  for(X in D) delete D[X];

  for(N=2; N<=NF; N++)
  {
       split($N, A, "=");
       D[A[1]]=A[2];
  }

  if (D["host_name"] in S)
       printf("%20s -- %50s\n", D["host_name"], D["plugin_output"])

}' "$1"

Stash all the host names in a text file, one host per line, save it as "server.list", then call the script like this:
Code:
./script.sh status.log
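To try the single-pass idea without touching the real log, here is a self-contained sketch. The sample chunks, the "servicestatus {" header line and the temporary file names are all made up for illustration, and the list file is passed in through a variable called list, which is not part of the script above.

```shell
workdir=$(mktemp -d)

# Made-up sample data: the chunk layout is an assumption for illustration.
cat > "$workdir/status.log" <<'EOF'
servicestatus {
host_name=serverA
service_description=MEMORY_CHECK
plugin_output=OK: fine
}
servicestatus {
host_name=serverB
service_description=CPU_CHECK
plugin_output=OK: idle
}
servicestatus {
host_name=serverC
service_description=MEMORY_CHECK
plugin_output=WARNING: low
}
EOF
printf 'serverA\nserverB\n' > "$workdir/server.list"

# One pass over the log: load the host list once, then test each chunk.
result=$(awk -v list="$workdir/server.list" 'BEGIN {
  while ((getline line < list) > 0)
     S[line] = 1
  FS="\n"; RS="}\n"
}
/service_description=MEMORY_CHECK/ {
  for (X in D) delete D[X]
  for (N = 2; N <= NF; N++) {
       split($N, A, "=")
       D[A[1]] = A[2]
  }
  if (D["host_name"] in S)
       printf("%s -- %s\n", D["host_name"], D["plugin_output"])
}' "$workdir/status.log")

echo "$result"
rm -rf "$workdir"
```

Only serverA is printed: serverB is in the list but has no MEMORY_CHECK chunk in this sample, and serverC has one but is not in the list. The log is read once, however many hosts the list holds.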


Last edited by mirni; 05-12-2012 at 05:13 PM.. Reason: "text file", not "test file" :)
# 11  
Old 05-12-2012
Wow. this actually seems to be super fast. however, it doesn't go through the entire file. it only spat out arbitrary servers. out of 200 servers, it only spat out 8 servers, and there should be way more than that. and also it provides limited output for the "plugin_output". it didn't show the entire output of the plugin_output.
# 12  
Old 05-12-2012
Your feedback is (once again) not very constructive. Please:

1. give an example of one hostname that it doesn't catch, post the chunk that you are missing. What do you expect to get?

2. Furthermore, make sure your server list contains one server per line, with no whitespace and no extra characters (e.g. carriage returns, if you created the file on a Windows system)

3. Try to understand what it does. This way you can learn, and adjust to your particular needs:
- the script processes only the chunks with
Code:
service_description=MEMORY_CHECK

- and it only prints the host name and the part of plugin_output line that is after the = sign.
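For point 2, one way to rule out stray characters is to clean the list before using it. This is only a sketch: the temporary file names are invented, and the sample list is deliberately written with Windows line endings and a trailing space.

```shell
workdir=$(mktemp -d)

# Simulate a list saved with Windows line endings and a trailing space.
printf 'serverA\r\nserverB \r\n' > "$workdir/server.list"

# Delete carriage returns, then strip leading and trailing whitespace.
tr -d '\r' < "$workdir/server.list" |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' > "$workdir/server.clean"

cleaned=$(cat "$workdir/server.clean")
echo "$cleaned"
rm -rf "$workdir"
```

A host name with an invisible carriage return at the end never matches the host_name value taken from the log, so that host is silently skipped.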
# 13  
Old 05-12-2012
There are different servers in the file. and yes, it is one server per line with no extra spaces.

Let's assume that in the file containing the list of servers, there are servers with different naming schemes. some servers are named apples1, apples2 etc.
others are named oranges1, oranges2 etc. and some are named serverA, serverB, as shown below.

each one of these servers has a service_description on it for MEMORY_CHECK. however, when i run the script, it doesn't grab all of them.
it grabs random servers, examples of which are shown below. for the random servers that it did grab, it only shows part of the "plugin output" instead of everything.

when i run the script, i get this output:

Code:
serverA --                                          OK: Used
serverB --                                          OK: Used
serverC --                                          OK: Used
serverD --                                          OK: Used
serverE --                                          OK: Used
serverF --                                          OK: Used
serverG --                                          OK: Used
serverH --                                          OK: Used

the "OK: Used", should really show the complete output which is:

Code:
OK: Used = [ 394.941 MB ], System = [ 11.8867 GB ], Used = [ 3.24464% ], Available = [ 11.501 GB ], Cached = [ 11.248 GB ].

so in other words, when i run the script, i should get something like this for each server:

Code:
serverA --                                          OK: Used = [ 394.941 MB ], System = [ 11.8867 GB ], Used = [ 3.24464% ], Available = [ 11.501 GB ], Cached = [ 11.248 GB ].

# 14  
Old 05-12-2012
OK, well, that is progress.
Notice these two lines?
Code:
       
split($N, A, "=");
D[A[1]]=A[2];

The first line splits the line (stored in $N) on '=' and stores the pieces in array A.
The second line stores only the second piece, A[2], so anything after a second '=' is lost.
Since nobody mentioned before that the lines could contain more than one equals sign, it was assumed that the lines were of the form
Code:
label=value

where the value would not contain an '=' (a very reasonable assumption).
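To see the effect in isolation, this small sketch splits a shortened copy of the plugin_output line from post #13 on '=' and prints the number of pieces along with A[2]:

```shell
line='plugin_output=OK: Used = [ 394.941 MB ], System = [ 11.8867 GB ]'

# split() cuts at every "=", so A[2] holds only the text up to the second "=".
second=$(printf '%s\n' "$line" |
    awk '{ n = split($0, A, "="); print n " pieces, A[2]=[" A[2] "]" }')
echo "$second"
```

The line falls into four pieces, and A[2] is just "OK: Used " with a trailing space, which matches the truncated output reported above.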

So you need to append the remaining pieces, for example:
Code:
awk 'BEGIN {
  while((getline < "server.list")>0)
     S[$0]

  FS="\n"; RS="}\n"
}

/service_description=MEMORY_CHECK/ {

  for(X in D) delete D[X];

  for(N=2; N<=NF; N++)
  {
       split($N, A, "=");
       D[A[1]] = A[2]
       i = 3;
       while (i in A) 
          D[A[1]] = D[A[1]] "=" A[i++];
  }

  if (D["host_name"] in S) 
       printf("%20s -- %50s\n", D["host_name"], D["plugin_output"])

}' "$1"
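An alternative sketch, not what the script above does: instead of splitting and gluing the pieces back together, cut each line once at the first '=' with index() and substr(). The variable names p, key and val are made up for this example.

```shell
line='plugin_output=OK: Used = [ 394.941 MB ], System = [ 11.8867 GB ]'

value=$(printf '%s\n' "$line" | awk '{
    p = index($0, "=")                 # position of the first "="
    if (p > 0) {
        key = substr($0, 1, p - 1)     # text before the first "="
        val = substr($0, p + 1)        # everything after it, later "=" included
        print key " -> " val
    }
}')
echo "$value"
```

Inside the chunk loop the same two substr() calls would be applied to $N instead of $0, and D[key] = val would replace the split and the while loop.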

Quote:
it grabs random servers
Now, did you read the code and try to work out how it does this? The order is not random at all. The script processes one chunk at a time and checks whether that chunk's host name is in the server list.
If you insist on processing them in a fixed, per-server order, you would have to do what you did before and scan the log once for each server.
So I suggest you take the order given by the log file and sort the results into the order you want.
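A rough sketch of that reordering, assuming the results have been saved to a file with one "host -- output" line per server. The file names and sample lines here are invented for the example.

```shell
workdir=$(mktemp -d)
printf 'serverB\nserverA\n' > "$workdir/server.list"

# Hypothetical results, in the order the log file happened to give them.
printf 'serverA -- OK: fine\nserverB -- WARNING: low\n' > "$workdir/results.txt"

# Alphabetical order: a plain sort is enough.
sorted=$(sort "$workdir/results.txt")

# Server-list order: load the results first, then walk the list.
in_list_order=$(awk '
    NR == FNR { out[$1] = $0; next }   # first file: remember each result by host
    ($1 in out) { print out[$1] }      # second file: print in list order
' "$workdir/results.txt" "$workdir/server.list")

echo "$in_list_order"
rm -rf "$workdir"
```

Either way the big log is still scanned only once; the reordering works on the much smaller result file.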