Possible performance improvement (Bash and flat file)
Hello,
I am pretty new to shell scripts and recently wrote one that seems to do what it should, but I am exploring ways to improve its performance and would appreciate some help.
Here is what it does: it's meant to monitor a bunch of systems, reading in IPs one at a time from a flat file. For each IP, it fetches a set of web pages, parses them to extract certain numbers, compares them against defined thresholds, and alerts if a metric falls outside the threshold range. The catch is that certain metrics require the last 5 observed values, so I store those in a flat file, and every time a new value is retrieved from the web page, it is compared against the threshold together with the stored values.
Basically, I am doing everything sequentially in 2 loops: one to read in the IP, and the next to do the web page download, threshold check, etc. Every time a new IP is added or a new metric needs to be monitored, the time taken to loop back to a machine increases. Is there a way to improve this? Intuitively, I feel that because all historical values are stored in a single flat file, something like multiprocessing would not work, since one process would have that file locked. Any ideas?
As the level of complexity increases, it begins to make more sense to use a database to manage the changing state of the environment. Maybe look into something simple to start with, like Berkeley DB.
Quote:
The catch is for certain metrics, it requires the last 5 values that it observed so I store those in a flat file and every time a new value is retrieved from the web page, that along with the stored values are used to compare against the threshold. Basically, I am doing everything sequentially so 2 loops, one to read in the IP and the next to do the web page download, threshold check, etc. Every time a new IP is added or a new metric needs to be monitored, the time taken to loop back to a machine increases. I wanted to see if there was a way to improve this?
It would help to see the actual code.
Quote:
Intuitively, I feel, because all historical values are stored in a single flat file, something like multi processing would not work since, a process would have that file locked. Any ideas?
Most systems don't do that kind of locking unless you explicitly ask for it. But having two processes simultaneously read from the same file handle wouldn't be a great idea; they might each get half a line or somesuch. If you're just reading flat files line by line, you could try a 'reader' script that reads everything and parcels the lines out to the other processes individually. That'd have some extra overhead for the extra process and its pipes, but would let more than one reader operate at once.
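To make the 'reader' idea concrete, here is a rough sketch, not a drop-in solution. It assumes bash, two workers, and named pipes; the file names, the worker function, and the sample IPs are all made up for illustration, and the worker only echoes where the real download and threshold check would go.

```bash
#!/bin/bash
# Hypothetical setup: a scratch directory, a sample IP list, and one FIFO per worker.
workdir=$(mktemp -d)
printf '10.0.0.1\n10.0.0.2\n10.0.0.3\n10.0.0.4\n' > "$workdir/ips.txt"
mkfifo "$workdir/w0" "$workdir/w1"

# Stand-in worker: the real one would fetch pages and check thresholds.
worker() {
    local ip
    while IFS= read -r ip; do
        echo "checked $ip"
    done
}

# Start the workers first; each reads whole lines from its own FIFO.
worker < "$workdir/w0" > "$workdir/out0" &
worker < "$workdir/w1" > "$workdir/out1" &

# The reader is the only process touching the IP file.
exec 3> "$workdir/w0" 4> "$workdir/w1"
n=0
while IFS= read -r ip; do
    # Parcel lines out round-robin, one complete line at a time.
    if (( n % 2 == 0 )); then echo "$ip" >&3; else echo "$ip" >&4; fi
    n=$((n + 1))
done < "$workdir/ips.txt"

# Closing the write ends sends EOF to the workers; then wait for them.
exec 3>&- 4>&-
wait

out0=$(cat "$workdir/out0")
out1=$(cat "$workdir/out1")
rm -rf "$workdir"
printf '%s\n%s\n' "$out0" "$out1"
```

Because only the reader touches the input file, no worker can ever see half a line. Round-robin is just the simplest way to divide the work; it does nothing to balance slow hosts against fast ones.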
I'll need to see your actual code to help you here, I think, or at least some of it. What needs to be optimized depends not just on what you're doing, but how you're doing it. If you're new to shell scripting, there are some trivial design mistakes that could be causing slowdowns; excessive use of pipes and/or backticks is particularly bad. If you've got pipe chains on almost every line, there's probably much room for improvement. In my early scripting days I wrote a linewrapper in BASH that fed everything through about 9 sub-processes, and it ended up processing at 10 kilobytes per second!
Last edited by Corona688; 05-07-2010 at 03:02 PM..
Reason: fix inexplicable doublepost
Thanks a lot everyone. I do seem to have a very large number of backticks. I would appreciate help in eliminating them, and any other way of improving performance.
can just be
Also, I'm not entirely sure what this line is doing:
...but if you're guarding against blank lines:
Or better yet, do this. It will skip blank lines without another layer of nested if at all:
Constructs that run cut every time a field is needed are extremely slow, since cut can end up being executed an uncountable number of times.
Instead, since you're using a shell that supports arrays, just split it into an array once then use the array. This should split fine on spaces:
You can also split on other characters by changing the IFS variable but be aware that this affects read too.
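A minimal sketch of splitting once into an array, assuming bash and a made-up record layout of "ip metric value" (the real fields in your file may well differ):

```bash
#!/bin/bash
# Hypothetical record; the real layout of the flat file is not known here.
line="10.0.0.1 cpu 87"

# Slow pattern, one external process per field:
#   ip=$(echo "$line" | cut -d' ' -f1)

# Split once on whitespace into an array, then index into it.
read -r -a fields <<< "$line"
ip=${fields[0]}
metric=${fields[1]}
value=${fields[2]}

# To split on another character, set IFS for this one read command only,
# so that later read calls keep their default behaviour.
csv="10.0.0.2,mem,42"
IFS=, read -r -a csv_fields <<< "$csv"

echo "$ip $metric $value ${csv_fields[1]}"
```

Putting the IFS assignment on the same line as read limits the change to that single command, which sidesteps the side effect on other read calls.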
You're running grep many, many times per loop. This is slow. Instead of
try
This reads the file only once and doesn't execute four extra processes. Note that the =~ regular expression operator only works in bash.
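As an illustration of the single-pass idea, here is a sketch that assumes bash and an invented page format with "name: number" lines. The labels cpu_load, mem_used and disk_used are placeholders, not the metrics from your script.

```bash
#!/bin/bash
# Hypothetical downloaded page, written to a temp file for the demo.
page=$(mktemp)
printf 'cpu_load: 87\nmem_used: 42\ndisk_used: 63\n' > "$page"

# Keeping the patterns in variables avoids quoting surprises inside [[ ]].
re_cpu='^cpu_load: ([0-9]+)'
re_mem='^mem_used: ([0-9]+)'
re_disk='^disk_used: ([0-9]+)'

cpu="" mem="" disk=""
# One pass over the file, no grep processes at all.
while IFS= read -r line; do
    if [[ $line =~ $re_cpu ]]; then
        cpu=${BASH_REMATCH[1]}
    elif [[ $line =~ $re_mem ]]; then
        mem=${BASH_REMATCH[1]}
    elif [[ $line =~ $re_disk ]]; then
        disk=${BASH_REMATCH[1]}
    fi
done < "$page"
rm -f "$page"

echo "cpu=$cpu mem=$mem disk=$disk"
```

BASH_REMATCH[1] holds whatever the first parenthesised group matched, so the extraction step needs no cut or sed either.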
Whenever you have VAR=`something | grep something | grep something | grep something`, that's an enormous performance waster. The same result is likely possible with shell built-ins, though exactly how depends on which bits you want to extract.
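For example, parameter expansion and case can often stand in for a grep/cut chain. This is only a sketch on an invented input line; which expansions fit depends entirely on what the real text looks like.

```bash
#!/bin/bash
# Hypothetical line of text; the real pages will look different.
line="Uptime: 12 days, load average: 0.42"

# Instead of something like: load=`echo "$line" | grep load | cut ...`
# strip everything up to and including the label with a built-in expansion.
load=${line##*load average: }

# Keep only the text before the first colon.
label=${line%%:*}

# A pattern test without grep: case does glob matching in the shell itself.
case $line in
    *"load average"*) has_load=yes ;;
    *)                has_load=no ;;
esac

echo "label=$label load=$load has_load=$has_load"
```

None of these lines starts a new process, which is where the time goes in the backtick version.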
...and so forth and so forth. Your script is enormous. You might want to break it into functions so you can tell what's happening where. Functions are easy:
They act like processes in that they return numbers, not strings, read from stdin, and write to stdout/stderr. But they can set global variables (as long as they're not behind a pipe).
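A small sketch of those points, assuming bash with default settings. The function name check_threshold and the variable last_status are invented for the example.

```bash
#!/bin/bash
# Hypothetical helper: returns 0 if the value is within the limit, 1 otherwise,
# and records a status string in a global variable.
last_status=""
check_threshold() {
    local value=$1 limit=$2
    if (( value > limit )); then
        last_status="ALERT"
        return 1
    fi
    last_status="ok"
    return 0
}

# The return value is a number, so it is tested like a command's exit status.
if check_threshold 87 80; then
    result="ok"
else
    result="alert"
fi
echo "result=$result status=$last_status"

# Behind a pipe the loop runs in a subshell, so the assignment made
# inside the function is lost once the pipeline ends.
echo 10 | while read -r v; do check_threshold "$v" 80; done
echo "after pipe: $last_status"   # still the value set before the pipe
```

If the loop needs to change globals, feeding it with a redirection (done < file) instead of a pipe keeps it in the current shell.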
You can also keep the historical data manageable by tailing the file. Log all values into a single file, such as history.log
At the beginning of the log file processing, execute:
This will give you a smaller file from which to get your historical data. The size of your history.log file will not matter; your processing file will always contain the last 10 entries.
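Something along these lines, as a sketch. The temp files stand in for history.log and the processing file, and the 25 numbered entries are dummy data.

```bash
#!/bin/bash
# Stand-ins for history.log and the smaller processing file.
history_log=$(mktemp)
recent=$(mktemp)

# Dummy history: 25 entries, one per line.
printf '%s\n' {1..25} > "$history_log"

# Keep only the last 10 entries for this run's processing.
tail -n 10 "$history_log" > "$recent"

lines=$(wc -l < "$recent")
first_kept=$(head -n 1 "$recent")
echo "kept $((lines)) lines, starting at entry $first_kept"

rm -f "$history_log" "$recent"
```

The log itself keeps growing, so it may still be worth rotating it now and then, but the per-run processing cost stays constant.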