The Most Incomplete Guide to Performance Tuning


 

Overview:
  • Introduction
  • What Does Success Mean?
  • What Does Performance Mean?
  • Every Picture is Worth a Thousand Words
  • Work Like a Physicist
  • Work Like You Walk - One Step at a Time
  • Learn to Know Your System
  • Choose Your Weapons!
  • Tools of the Trade 1 - vmstat
  • A Little Theory Along the Way - Kernel Workings
  • Tools of the Trade 2 - iostat
  • A Little Practice in Between - disc magic
  • Even More Disc Magic - RAIDing the Storage!
  • There's Gold in Them Thar Networks!
  • What You Always Wanted to Know About Networks But Were Too Afraid to Ask
  • Tools of the Trade vol. 3: netstat
  • Conclusion
  • Acknowledgements


Introduction

Performance monitoring and tuning is often the most neglected part of a system administrator’s work. I will try to give you some tips on how to carry out the role, and how to find out if you are succeeding, or not. I will try to keep the discussion as general as possible, and avoid system-specific details where possible.

Let’s start with some theory. This is easier for me, because it means I can write all sorts of general knowledge, and don't have to put in any real work at all! Eventually I will fill in the gaps with practical examples from my experience. In case you didn't know, experience is something you get just after you needed it, by which time, of course, you’ve resolved the problem!

One more word of caution before we get going: I have tried hard to suppress my sense of humour (which I am not in the least bit known for anyway), which may make this text hard to digest in places. Should you manage to stay awake throughout - well done! Should you nod off instead, it is completely my fault and I would not hesitate to say so. So you, who are entering here, abandon all hope and let's start! I jest, of course. We'll be just fine!

What Does Success Mean?

Many performance tuning projects go like this:

Quote:
Customer: “We need the XYZ application to be faster”

SysAdmin: “OK, let's see…”, works for two hours, “I got the queue to be 20 per cent shorter on average”

Customer: “Great. But we still want it to be a little faster, could you…”

SysAdmin: “OK.”, works for two more days, “I reduced the overall response time by 10 per cent”

Customer: “Very Good! Now, if you could make the system a wee bit faster…”

SysAdmin: Sighs and spends the next two weeks, with very long shifts, eking out a few more microseconds here and there. ”So, I finally managed to gain a 5 per cent boost in network output on the system by applying a mixture of RFCs and obscure voodoo rituals!”

Customer: “Fantastic! Could you now make the system a little bit faster, please?”

SysAdmin: Starts to sob uncontrollably and subsequently hangs himself using a power cord!
The problem is that fast is a relative term. Therefore it is absolutely imperative that you agree with your customer exactly what fast means. Fast is not "I don't believe you could squeeze any more out of it even if I threaten to fire you". Fast is something measurable - kilobytes, seconds, transactions, packets, queue length - anything which can be measured and thus quantified. Agree with your customer about this goal before you even attempt to optimise the system. Such an agreement is best laid down in writing and is called a Service Level Agreement (SLA). If your customer is internal, a mail exchange should be sufficient. Basically it means that you won't stop your efforts before measurement X is reached and, in turn, the customer agrees not to pester you any more once that goal is indeed reached.

A possible SLA looks like this:

Quote:
The ABC-program is an interactive application. Average response times are now at 2.4 seconds and have to be reduced to below 1.5 seconds on average. Single responses taking longer than 2.5 seconds must not occur.
This can be measured, and it will tell you - and your customer - when you have reached the agreed target. It will also tell you when the project is over and you can go back to surfing the web! Think of an SLA like an American Express card: “don't leave home without it”!
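How would you verify such an SLA in practice? If the application logs its response times, a one-line filter does the job. A minimal sketch, assuming a hypothetical logfile /var/log/abc_response.log whose last field is the response time in seconds:

Code:
# average and worst response time from a (hypothetical) application log
# whose last field is the response time in seconds
awk '{ sum += $NF; if ($NF > max) max = $NF }
     END { if (NR) printf "avg: %.2fs   max: %.2fs\n", sum/NR, max }' /var/log/abc_response.log

Compare the two numbers against the agreed 1.5s average and 2.5s ceiling, and you know at a glance whether you are done.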

By contrast, here's a typical example of work that is not covered by an SLA - a graveyard of uncounted, wasted man-hours:

Quote:
The ABC-program is a bit slow, but we can't afford a new system right now, therefore make it as fast as possible without replacing the machine or adding new resources.
The correct answer to such an order is: "If the system is not important enough for you to spend any money upgrading it, why should it be important enough for me to put any serious work into it?"

Of course, if you are in possession of a magic wand you could wave it and probably be the hero of the company’s IT, at least for a short time. If you don't have a magic wand just be the best you can!

What Does Performance Mean?

Another all too common misconception is the meaning of "performance", especially its confusion with speed. Performance is not just about being fast. It’s about being fast enough for a defined purpose under an agreed set of circumstances.

A simple comparison of the difference between performance and speed can be made with this analogy: we have a Ferrari, a large truck, and a Land Rover. Which is fastest? Most people would say the Ferrari, because it can travel at over 300 kph. But suppose you're driving deep in the country, on narrow, windy, bumpy roads? The Ferrari's speed would be reduced to near zero. So the Land Rover would be the fastest, as it can handle this terrain with relative ease, at near the 100 kph limit. Right? But suppose, then, that we have a 10-tonne truck which can travel at barely 60 kph along these roads. If each of these vehicles is carrying cargo, it seems clear that the truck can carry many times the cargo of the Ferrari and the Land Rover combined. So again: which is the "fastest"? It depends on the purpose (the amount of cargo to transport) and the environment (the roads to be travelled). This is the difference between "performance" and "speed". The truck may be the slowest vehicle, but if delivering a lot of cargo is part of the goal it might still be the one finishing the task fastest.

There is a distinct difference between fast and fast enough. Most of us work for demanding customers, under economic constraints. We have to accommodate not only their wishes, which is usually easy - throw more hardware at the task - but also their wallet, which is usually empty. Every system is a trade-off between what a customer wants and what he is willing to pay for. This is another reason why SLAs are so important: you can attach a price tag to the work the customer is ordering, so they know exactly what they're getting.

Every Picture is Worth a Thousand Words

Most performance tuning tasks are stunningly simple once you have found out what the bottleneck is. Finding the bottleneck, though, can be a complicated task. You will need all kinds of strange tools (more on that later) and even if you have all the tools, you will still have to understand how they work, and interpret their output.

One of the most important tools in my toolbox is a program to produce graphs from table data. Mine is called xgraph, but it isn't really important which program you use - any will do. The point is that by converting your data into a graph you can easily see any deviations from the mean, as a peak (bulge) or a trough (pit). Some of these peaks and troughs may be easy to explain. For example, a transaction-oriented database may have more transactions during the day, when people are at work, than at night, when most people are asleep. So the peaks from roughly 9:00 AM to around 5:00 PM can be explained.
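How the data gets into table form matters just as little. A sketch of the collection side, under the assumption that we want to plot vmstat samples over time: prefix each line with a timestamp and write everything to a file that xgraph, gnuplot or a spreadsheet can digest:

Code:
# collect timestamped vmstat samples, one every 10 seconds, for later plotting
vmstat 10 | while read line; do
    echo "$(date +%s) $line"
done > /tmp/vmstat_$(date +%Y%m%d).dat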

But maybe you will see another bulge on Wednesdays from 11 AM to 12 PM which you can't immediately explain. Investigating, you might find that someone has put a crontab entry at exactly that time to do a system backup. Gotcha! I actually had this on a big database system, which started to page in and out uncontrollably. The system backup was moved to a low-transaction time and the problem was solved.

Work Like a Physicist

The people in physics are quite a methodical lot. They have worked the same way for the last two thousand years! First, they have some phenomenon they want to know more about. They create all kinds of theories about what could cause the phenomenon, and then design experiments to prove or disprove each theory. Sometimes they end up disproving all of them, and are left with no working theories (in this case they usually demand more money to build bigger toys!). But more often they end up disproving all of the theories except one. This is as close to the "truth" as they can ever hope to get: a theory for which no conceivable experiment can show something the theory fails to explain.

When I start to work on a system, I do some standard tests (running vmstat, iostat, etc.), then I create a theory (or perhaps several theories) from this initial data. Then I test each of those theories, one after the other, by carrying out experiments designed to prove or disprove them.

In most cases things are quite simple and this process is short. That is, you have one theory and every test shows that it is correct. But sometimes there are nasty, complicated, interdependent problems that can't be solved in any other way than the one described above. Oh, and by the way, if you run out of theories, you take the physicists as a role model too and demand more money - maybe your system is already maxed out and the only thing that helps is to have a bigger system.

Work Like You Walk - One Step at a Time

I still remember the good old days! On one of these good old days, I had the opportunity to hear Seymour Cray (the builder of supercomputers, of Cray Research fame) talk about how he developed the Cray II from the Cray I. He said:

Quote:
... and in the Cray II, I left everything as it was in the Cray I except for the memory interface. Because you develop computers the same way you walk: one step after the other. If you try to make several steps at once you just hop up and down and don't get anywhere.
I'm not sure if I remember exactly the words Seymour used, but his message has stuck with me ever since: don't try to do too much at once. If you try to tune a system, change one parameter, then monitor again and see what impact that had, or whether it had any impact at all. Even if you have to resort to sets of (carefully crafted) parameter changes do one set, then monitor before moving onto the next set.

Otherwise you run into the problem that you don't really know what you are measuring, or why. For example, suppose you change the kernel tuning on a system while, at the same time, your colleague has dynamically added several GB of memory to that system. To make matters “worse” the guy from storage is in the process of moving the relevant disks to another, faster subsystem. At the end, your system’s response time improved by 10%.

Great! But how? If you need to gain another 5%, where would you start? If you had known that adding 1GB of memory had improved the response time by 3% and that adding 3 GB more was responsible for most of the rest, while the disk change brought absolutely nothing, and the kernel tuning brought around 0.5%, you could start by adding another 3GB, and then check if that still has a positive impact. Maybe it didn’t, but it’s a promising place to start. As it is, you only know that something you, or your colleagues, did caused the effect, and you have learned little about your problem or your system.

Learn to Know Your System

Speaking of knowing your system: do you know your system? Really? Okay, right now it is performing at an acceptable speed. But what would happen if, let's say, you distributed the same number of physical processors over a larger number of logical processors? Would it make your system perform better? Worse? Would it have any effect at all?

You don't want to try this change on your production system just to satisfy your curiosity, but you don't really know what would happen either, right? This is why it's a good idea to meticulously file away all the data from previous performance tuning attempts. File not only what finally worked, but also your raw data, the charts you created from it, and your failed attempts (what did you try, for what reason, and why did it fail?). The only risk you run is saving a few megabytes of useless data. The DBA guys do much the same every second they're in the office - or at least their databases do, by saving "redo logs" of database transactions. These are most probably never used again, but they keep them for the same reason you should keep your data: it is most probably useless, but if you ever need it, it can be a life saver!

You should also monitor a few key values on your system even when it performs satisfactorily. This will tell you, if nothing else, two things: firstly, what the "baseline" of the system is - you will see patterns in its usage and how fast the demand for certain resources increases; and secondly, once the system really does become a problem, you already have the data you need for long-term analysis, and you will not only be able to optimise the system more easily, but also predict shortages proactively: "We had an increase in memory consumption of 30% per year for the last 3 years; if we now upgrade to X GB RAM this will predictably last until March next year. With no upgrade at all we will run into shortages around August this year."
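Such baseline collection is easily automated. A minimal sketch, assuming you pick your own paths: a crontab entry that files away one day's worth of vmstat samples every midnight (note that % signs must be escaped inside crontabs):

Code:
# take one vmstat sample every 5 minutes for 24 hours (300s x 288 samples),
# filed away under the current date for later long-term analysis
0 0 * * * /usr/bin/vmstat 300 288 > /var/perf/vmstat.$(date +\%Y\%m\%d) 2>&1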

I can't make any guarantees about your private life, but management will love you, the users will adore you, and you will probably never get a promotion, because there would be no replacement for such an accomplished sysadmin - the Peter Principle at work!

Choose Your Weapons!

I think I have bored you long enough with my stories from the days of yore. So let me share something you could actually use, to make you think you could learn something here. Let's talk about the tools you will use for performance monitoring.

You all come from different systems and every system has its own specialities, but there are still a few constants. You all have the same problems: the machine is memory-bound, processor-bound, I/O-bound, meaning the respective resource is the bottleneck for its operation, and you want to know how much memory every process uses, how much processor time every process uses, how much disk-I/O / network traffic every process is consuming, and so on. Unix - every Unix, including Linux - offers standard tools to answer precisely these questions.

I suggest you learn to use these tools instead of some fancy agent that management has purchased for a lot of money. Such agents tend to measure not the things which are really interesting, but the ones that look good on a chart and that management is able to understand. I remember a data centre using "BMC Patrol". Management loved it. The thing had a chart showing memory consumption. What they didn't understand, however, is that memory consumption on an AIX system is usually constantly near 100%, because the system dedicates most available memory to the file cache as long as it isn't needed elsewhere. It is anybody's guess how much meaning you can discern from a nearly flat line at, say, 97% with little changes of about 0.1%. Yet they were actually basing their "business expectations and forecast" on this line, much to the amusement of the admin staff!

(A necessary comment, in case you are a lawyer from BMC! The problem is not with the tool, itself, but rather with the people who are not trained to interpret its output. Data without the necessary interpretation is useless and a fool with a tool is still a fool.)

You can, of course, waste your time using these management tools. You could also waste your time arguing against their use. They won't pay you more even if you can prove that the new software was a waste of money, so don't say a word and do the wise thing: silently ignore these nuisances. The tools we use will work on any Unix system, will not aggregate data to some meaningless values and will display what they have to display from the very source. The tools we will use are:
  • vmstat
  • iostat
  • ps
  • netstat
We will also use some text filters (awk, sed, grep and sort, among others), also available on every system. To keep this article short (and because I need something to write about once my muse lashes out again!), I assume that you are familiar with these tools and with regular expressions common in Unix text filters.

Tools of the Trade 1 - vmstat

While I futilely try to set my brain in motion, let us talk about vmstat, which is the tool to use when you need a quick overview of what your system is doing. Let's see a typical output (in this case taken from an AIX system; the output format may change slightly on other systems, but the core parts are always the same):

Code:
   kthr            memory                         page                       faults           cpu       time  
----------- --------------------- ------------------------------------ ------------------ ----------- --------
  r   b   p        avm        fre    fi    fo    pi    po    fr     sr    in     sy    cs us sy id wa hr mi se
  0   0   4    3603088      78543     0     0     0     0     0      0  2039 162218  9485 12  7 59 22 21:36:39
  2   0   3    3603070      78561     0     0     0     0     0      0  2268  26937  8614 13  5 61 21 21:36:41
  1   0   3    3603065      78566     0     0     0     0     0      0  2730  32741  9945 12  5 60 22 21:36:43
  1   0   3    3603149      78482     0     0     0     0     0      0  2099  25234  8246 14  4 60 22 21:36:45
  1   0   3    3603838      77776     0     3     0     0     0      0  2209  31075  8496 14  6 57 22 21:36:47
  2   0   3    3603677      77916     0    16     0     0     0      0  2184  26788  8543 16  5 58 22 21:36:49
  0   0   4    3603056      78526     0     0     0     0     0      0  2249  27634  8504 15  4 61 20 21:36:51
  1   0   4    3603068      78514     0     0     0     0     0      0  1847  22577  7417 31  4 43 22 21:36:53
  1   0   4    3603055      78527     0     0     0     0     0      0  1892  23173  7613 14  4 59 23 21:36:55
  1   0   4    3603063      78519     0     0     0     0     0      0  2167  25862  8387  8  4 64 23 21:36:57

The first thing you want to look at is the columns labeled pi and po in the page section. The headers mean "pages in" and "pages out" - to and from the swap space, that is. What does that mean? As more and more programs start and demand memory from the kernel, it gives out what it has. But at some point there is nothing left to give, and the kernel falls back on a clever tactic: it uses disk space to temporarily store memory pages which aren't currently in use, which frees up memory to give to another program. This process is called "paging" and the file space used is the "paging space". When the program whose memory was paged out needs to run again, its memory is copied back from disk to memory ("paged in"), and on it goes.

It is clear that this will slow down the computer considerably, because disk I/O is slower than memory I/O by several orders of magnitude.

So is there a point in having paging space at all, you might ask? Yes, there is. It is, or at least should be, the contingency for the very rare occasion where the installed memory isn't enough for a very limited period of time. Think of it like a seat belt: you should wear it but under normal circumstances never need to rely on it.

Therefore, the first thing the output of vmstat tells us is whether there is any paging activity going on. We don't care about occasional 1s or 2s in the pi/po columns. But if even small numbers appear on two or three consecutive lines, you should start to worry. It may seem negligible, but you should investigate to make sure there is in fact no problem.
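Staring at the pi/po columns by eye gets old quickly, so this check is best delegated to a filter. A sketch, assuming the column layout shown above, where pi and po are fields 8 and 9 (adjust the field numbers to your system's vmstat format):

Code:
# warn when pi or po stay non-zero over consecutive samples
vmstat 2 | awk '$8 ~ /^[0-9]+$/ && $9 ~ /^[0-9]+$/ {
    if ($8 > 0 || $9 > 0) streak++; else streak = 0
    if (streak >= 2) print "paging for " streak " consecutive samples: " $0
}'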

The next columns of interest are in the kthr (“kernel threads”) section and are the columns labeled r and b, for "running" and "blocked". These two columns denote the status of the processes the system is working on at the very moment.

The first thing we should be concerned with is the number in the b (blocked queue) column. If there is any value greater than zero, the system has a problem, and for you this means it's time for action! Unix is a time-sharing system: several processes use the system at the same time. Well, not exactly "the same time", but it seems like that. In fact every process is given the system's processors for a short period of time (some milliseconds), then it is halted and the next process runs. After all the processes have taken a turn, the cycle starts over, so it looks like all processes are running simultaneously. It is clear that some processes can't run when it is their turn: maybe they need to load data, maybe they are waiting for some input, etc. These processes are skipped. If there is an entry in the blocked column, it means a process would be ready to run but can't - in most cases because some pages of its program code have been paged out, and the process is waiting for them to be paged back in. Obviously it can't run with half its program code missing from memory.

That means a non-zero blocked queue is an immediate call for help from the system's kernel to the administrator. You should take it absolutely seriously and start investigating immediately. In many cases you will find paging activity along with blocked processes, and in most of these cases the system needs more memory. But never jump too fast to conclusions! When assessing performance you should be quick to form theories but slow to promote them to declared facts. Always look at the picture from several different angles, and if all of them show the same result, you have probably found a fact.

Another interesting part of vmstat's output is the r (run queue) column. It shows the number of processes that are running or ready to run right now. There is nothing wrong with having processes running, but watch the mean of this number closely (a little helper for watching that mean follows below). If many processes are on the run queue, say 20 to 30 on average, the system's speed might benefit from more processors, even if they are "smaller" ones. In most virtualised environments it is possible to divide physical processors into a number of logical processors. If you see high numbers in the run queue, and you need to make the system faster, one way to achieve this is to create more logical CPUs out of the physical CPUs you have.
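A sketch of such a helper, again assuming the layout above with the run queue in the first column:

Code:
# average run queue length over 100 two-second samples
vmstat 2 100 | awk '$1 ~ /^[0-9]+$/ { sum += $1; n++ }
    END { if (n) printf "average run queue: %.1f over %d samples\n", sum/n, n }'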

The last section I want to direct your eyes to is labeled cpu. Four columns reside there, labeled us, sy, id and wa, which stand for "user", "system", "idle" and "wait" respectively. "user" is, by and large, your programs. They are executed in user space (hence the name) and running them is ultimately the goal of the system. "system" is the kernel part of the system's workings. Every running program uses the system's services (for instance system calls, usage of driver facilities, etc.) to do its work. Apart from this, the kernel has a lot to do to keep the system running, including job control, memory management, and so on. All these responsibilities cause CPU usage, and this is shown here. The "wait" column is programs waiting for I/O, usually for the disk to come up with some data. If there is a high number in "wait" and a non-zero number in the blocked queue, then in 95 out of 100 cases the pi and po columns will show heavy paging activity. That is to say, a program otherwise ready to run was paged out and now has to wait while it gets paged in. The last column, "idle", is the remainder of CPU time not shown in the other three columns. Large idle times usually mean the system is not under load. Large idle times combined with high wait times typically indicate that the system is I/O-bound (but not necessarily so).

Let's look again at the vmstat output above and try to put to work what we have learned. The blocked queue is empty, which is good. There is also no paging activity, as the pi and po columns are constantly at 0. Still, we see that the wa column is higher than us and sy. At the same time id is relatively high and the run queue shows very low numbers.

So the picture is of a system with enough memory and a light load, where the I/O part is nevertheless the limiting factor - otherwise us would be higher than wa. The situation is still OK. What happened here is that the snapshot was taken during a backup phase: the system was doing practically nothing except shovelling data from the disks to the network stack, and this is where the I/O comes from.

This is why it is of the utmost importance to correlate a vmstat snapshot with the activity going on at that time. Without this information the picture will lose a lot of its meaning.

A Little Theory Along the Way - Kernel Workings

After this rather lengthy discussion about vmstat we will take a little break, and I can tell you a story which is guaranteed to cheer you up, before you happily nod off. It’s about memory management. You might have noticed that in discussing the various columns of the vmstat output I ignored a section called "memory". That's right, and what I have to tell you has everything to do with this.

First, for the sake of completeness, what do these columns mean? The first, labeled avm, for "available memory", indicates the total amount of memory the system is able to dedicate to programs. The second one, fre, for "free", is the memory the system makes absolutely no use of at the moment. These numbers are not in bytes but in memory pages, the size of which can vary between implementations of Unix. In AIX it is 4KB (mostly) or 64KB (rarely), and in many other Unix and Linux variants it is 4KB too. You will have to check your system's documentation to determine its page size. What does "available" mean? This is the whole memory, plus the paging space, minus the memory the kernel thinks it needs for itself, minus the memory which is "pinned" (absolutely dedicated to a certain program, without the option of reclaiming it), minus memory which, for one reason or another, can't be given to programs either. Yes, the last one was vague, and no, I don't feel obligated to explain it. This is an incomplete guide, remember!

You may still wonder whether available memory versus free memory is interesting at all. Well, it is, somewhat, but not as much as you might think. The point is that a Unix kernel tries to make use of the available memory as much as possible. That means every byte not already given to programs will be given to buffers to speed up I/O. The disk read/write buffer (i.e. the cache) is by far the biggest of these buffers. Let's say you have a system with 4GB of RAM. Running programs use 2 of these 4GB. How much is free? For obvious reasons it will be less than 2GB, because the kernel itself, the various drivers, etc. need some space to run too. Let's say, just for the sake of this example, they need 500MB in total. How much memory is free? Did I hear "1.5GB" somewhere? You would be wrong! Most probably, and depending on the exact Unix derivative running and its kernel tuning parameters, there will be next to nothing free. On AIX, configured with its most common tuning, exactly 3% of RAM - roughly 120MB here - will be free.

Now suppose another program on this system starts and needs 500MB. Wouldn't the system crash because there is not enough free RAM? No, not at all, because there is RAM available, just not free. The kernel, presented with this requirement, will dynamically shrink all its buffers (mainly the disk cache) and give the memory to the program that requests it. Once this program stops and releases its memory (yes, that's the theory - the gory details of real-world memory sinks will only be disclosed to proven adult readers, to protect the innocent!), the kernel will grab it instantly and, instead of marking it as "free" (which is a synonym for "unused"), it will grow its buffers back to their original size. The fact that 500MB was used and released again will hardly show up in the avm and fre columns, save for some short-lived spikes.

So, from now on, you know how to answer the many "my Unix system has only 100MB free memory, please help" threads! The correct answer is: "Show your vmstat output. If there's no paging activity it's a matter of 'not enough Unix understanding', not one of 'not enough memory'!"

Tools of the Trade 2 - iostat

We have talked a lot about memory, so before you beat me with a memory stick, let's get to the second largest cause of performance bottlenecks: I/O. I/O, as we use the term in our discussion, comes in three forms: disk I/O, network I/O and special I/O. Most systems we deal with are servers and only have disks and network. Sometimes, though, special systems have other sorts of I/O, such as serial lines, user input and so on. In most cases (but this is only a rule of thumb - don't rely on it being the case on any given system) only network and disks are critical.

We have already seen one cause of I/O holding back the system: paging. Disks are several orders of magnitude slower than memory and, ideally, we would like to have all the data we need already loaded into memory, so that we never have to wait while it's read from disk.

This is where iostat enters the picture. Notice that it isn't always installed by default. In AIX it is, but in many Linux distributions you have to install the sysstat package to get it. Do so and your life will never be the same!

iostat works in a similar way to vmstat, in that you give it a sampling interval (in seconds) and, optionally, the number of samples it should take. If you do not provide this last parameter it will run forever. To get a first impression I always start with an interval of 1 second, because my inner clock is set to seconds as a default interval. When I know better what I am looking for, I change the sample rate to a sensible interval, which could be anything from 1, for close-meshed observations over a short period of time, to 1200 (that's 20 minutes), for long-term analysis. You can even run several instances of iostat with different intervals at the same time.
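In concrete terms, the two extremes might look like this (the exact option letters vary between platforms - AIX, for one, has its own set - so consult your man page):

Code:
# close-meshed: one-second samples, sixty of them, watched live
iostat -x 1 60

# long-term: 20-minute samples, running in the background and filed away
nohup iostat -x 1200 > /var/perf/iostat.$(date +%Y%m%d) &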

So what does iostat tell us? It tells us the extent to which our disks are taxed by I/O.

Disks (physical, classical hard disks) have a limited bandwidth, which is around 30MB/s today. That is all a state-of-the-art system of rotating magnetic platters with a moving read/write head can muster! Everything faster is achieved either by another technology, such as SSDs, or by caching, or by multiplexing several of these devices, as in RAID sets and the like. Modern SAN systems usually have RAIDs in the background and heavily-caching disk controllers up front, so they use a mixture of the second and third methods. SSDs are incredibly fast by comparison, but also currently very expensive, and therefore not widely used for storing vast amounts of data. What's more, they are usually complicated to operate. If you have a database server with terabytes of data in the database, expect these to be stored on good old-fashioned hard disks, probably enclosed in a SAN system.

Why am I telling you this? Because in a time of multi-gigabit-per-second bandwidths these 30MB/s disks run at quite a modest rate. This makes knowing about it, and taking appropriate measures to improve on this shortcoming, all the more important. Disks, because of their mechanical properties, will not increase their bandwidth much, and definitely not at the "triple the speed of last year" rate we are used to seeing with processors and network connections. This in turn means that knowing how to speed up disk I/O is a valuable asset.

But I do not want to jump ahead. Let us first finish our discussion about measuring disk I/O and then talk about how to improve it.

The following is taken from a Solaris system. As with the vmstat output, formats vary across platforms but some parts are constant:

Code:
$ iostat -xtc 5 2
                          extended disk statistics       tty         cpu
     disk r/s  w/s Kr/s Kw/s wait actv svc_t  %w  %b  tin tout us sy wt id
     sd0   2.6 3.0 20.7 22.7 0.1  0.2  59.2   6   19   0   84  3  85 11 0
     sd1   4.2 1.0 33.5  8.0 0.0  0.2  47.2   2   23
     sd2   0.0 0.0  0.0  0.0 0.0  0.0   0.0   0    0
     sd3  10.2 1.6 51.4 12.8 0.1  0.3  31.2   3   31

The first fields you want to examine are Kr/s and Kw/s, which show kilobytes read and written per second respectively, then r/s and w/s, the read and write operations per second. While the first pair measures the amount of data transferred, the latter measures the number of actions that generate that traffic. Why do we have to differentiate?

Because of how a disk works. You give it an order to read some data, it doesn't matter how much - it could be just a single byte. Now the head moves over to the cylinder in which this byte can be read. The machinery has to wait until the right sector rotates under the head before the byte can be read and transferred to the system. If you want to read another byte - which might be stored somewhere else - the head again has to be repositioned, and the process is repeated.

And this whole process, foremost the mechanical part of moving the head around and waiting until the right part of the magnetic platter moves under it, takes time. It is possible to "drown" a disk in such requests without actually transferring more than a small fraction of the possible bandwidth.

Can this really happen, or is this just a theoretical possibility I came up with to vindicate my writings? Well, consider this scenario. Most systems are built like this: one (or two, mirrored) physical system disks with the OS, from which the system boots, and some SAN LUN(s) (Logical Unit Number) to hold application data. If there are many users connecting at the same time and they all work at a command line then a lot of small utilities are being loaded constantly. If RAM is now scarce, and the system doesn't have much to spare for disk caching, these command line utilities will get loaded to and unloaded from memory time and again, and the system’s disks will fast approach saturation.

Also interesting to note in the iostat output above: as you can see, the values are not evenly distributed across the available disks. Some disks, such as sd2, are not used at all, while others are taxed to some extent. It is easy to find the "hotspots" - the most heavily taxed disks - by running and watching iostat for a while when the system is under load. Usually I filter out the unused disks with a few sed commands and then analyse the remaining ones - awk does the job just as handily, as sketched below.
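A sketch of such a filter for the Solaris-style output above: keep the header lines, drop every device line whose traffic counters are all zero:

Code:
# keep headers; show only disks with non-zero r/s, w/s, Kr/s or Kw/s
iostat -x 5 | awk '
    !/^ *[a-z]+[0-9]+ /     { print; next }   # header lines pass through
    $2 + $3 + $4 + $5 > 0   { print }         # active disks only
'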

Another valuable field is the wait field. This counts the average number of transactions waiting to be serviced - the so-called "queue length". This also shows how much a disk is taxed and if it is operating at its limit (in terms of number of transactions), or not. As long as this value is a nearly-constant zero the disk is coping with what the system throws at it. Once non-zero numbers start to show up regularly, the disk is slowing the system down with I/O-waits.

Lastly, look at svc_t and %b. svc_t is the average service time: the time a request to the disk spends in the queue until it is processed. %b is the percentage of time a disk is busy (i.e. has requests to work on) and tells us how hard a disk is working to process its requests. As a general rule, if the "percentage busy" (%b) exceeds 5% and the "average service time" (svc_t) exceeds 30ms, action has to be taken. A possible action could be tuning the application. Sometimes developers handle I/O as an afterthought, or with no thought at all! In the case of a database system, it might help to reorganise some queries, replace full table scans with indexed searches, and the like. Then there is striping or, more generally, using more spindles to distribute the I/O load over more disks - more on that in the next chapter. Finally, it might help to modify the system's buffers to avoid some repeated disk I/O. This might mean using more disk cache, but is not limited to that.
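The rule of thumb translates directly into another little filter. A sketch, again assuming the column layout above (svc_t in field 8, %b in field 10):

Code:
# flag disks breaking the rule of thumb: %b above 5 and svc_t above 30ms
iostat -x 5 | awk '$2 ~ /^[0-9.]+$/ && $10 > 5 && $8 > 30 {
    print $1 " needs attention: svc_t=" $8 "ms, busy=" $10 "%"
}'

Run against the sample above, this would flag sd0 (svc_t 59.2ms, 19% busy), which matches what the naked eye says.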

Not all data is the same. There is data, and there is data about data, known as metadata. Your system stores and retrieves not only data, but also information about what it has stored: i-nodes, directory structures, superblocks, etc. A colleague of mine actually reduced the backup time of a very big (some 500TB) GPFS filesystem significantly by moving the filesystem metadata to a solid state disk. While it would have been unfeasible to move any significant amount of the data itself to SSD, it was easy to put a 150GB SSD to work holding the filesystem metadata. By simply speeding up the retrieval of i-node information he cut the backup time by two thirds!

Here is a table with the meaning of the columns:

disk    name of the disk
r/s     reads per second
w/s     writes per second
Kr/s    kilobytes read per second
Kw/s    kilobytes written per second
wait    average number of transactions waiting for service (queue length)
actv    average number of transactions actively being serviced (removed from the queue but not yet completed)
svc_t   average service time
%w      percent of time there are transactions waiting for service (queue non-empty)
%b      percent of time the disk is busy (transactions in progress)

A Little Practice in Between - disc magic

I have told you already a bit of the theory of magnetic disks - how they work and at what speed they work - so let’s have a little practice instead of just a theoretic discussion. What can we do to speed up I/O?

Well, have you ever wondered how such small ants can build such big nests? It isn't too hard to imagine: many ants working together, in parallel. The same is true for disks. To speed up the process of storing and retrieving data simply have more disks working together! How is that done?

Disks store data not in a linear way but in chunks, known as blocks. With only a single disk at our disposal we retrieve one block after the other to form a seemingly linear stream of data. If we use several disks we could use one disk after the other, but this would not speed things up. Instead we spread the blocks over as many disks as possible, to make sure every consecutive block is found on a different disk. This way we can approximate using the disks in parallel:

Code:
        Disk1   Disk2   Disk3   Disk4   Disk5
Block1    1       2       3       4       5
Block2    6       7       8       9      10
Block3   11      12      ...
Block4
Block5
Block6
...

We call this a "stripe set", and it is clear from looking at it that, while speeding things up, it also makes the whole setup less reliable, because a single failing disk will cause the data of the whole system to be lost. Therefore this basic layout, while instructive, is rarely used in its pure form. In many cases an additional disk is set aside to hold a checksum. This checksum is so cleverly created that any one missing block can be recomputed from it, as long as the other four data blocks covered by the respective checksum are still readable:

Code:
        Disk1   Disk2   Disk3   Disk4   Disk5   Disk6
Block1    1       2       3       4       5     CS1-5
Block2    6       7       8       9      10     CS6-10
Block3   11      12      ...                    ...
Block4
Block5
Block6
...

Now any one of the six disks could fail and the data would still be available. The downside is that the net capacity of the six disks is now only that of five. This is what is usually, if not quite correctly, called a "RAID set", or "RAID5". It is "striping with parity", so the additional risk of having several disks involved is alleviated. Alas, the overhead of computing the checksum also means that our performance gain is not as big as with the stripe set without parity. You may have heard that "RAID5 is not well suited for databases". This is a half-truth. Striping with parity (RAID5) is indeed slower than striping without parity ("RAID0"), but it is still faster than a single disk. With appropriate caching controllers and hardware-based checksum computation, done in specialised RAID controllers, this is in fact the way to go with large data volumes. This is usually what is inside a SAN device, like IBM's DS-4/6/8k series, HP's EVA, NetApp's FAS and similar systems.
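In case you wonder what that "clever" checksum is: in practice it is usually a plain XOR across the corresponding data blocks, which is cheap to compute and lets any single lost block be recomputed from the survivors. A toy demonstration in shell arithmetic (one byte per "disk" for brevity; real controllers do this per block, in hardware):

Code:
# the parity "disk" holds the XOR of the five data "disks"
d1=37 d2=142 d3=89 d4=201 d5=16
parity=$(( d1 ^ d2 ^ d3 ^ d4 ^ d5 ))

# disk 2 fails: XOR of the survivors and the parity recreates its content
echo $(( d1 ^ d3 ^ d4 ^ d5 ^ parity ))    # prints 142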

Another way to speed up disk I/O is caching. We have already learned that the operating system uses every ounce of memory it can spare to this noble end, but even the most generous of OSs can be helped: disks themselves have small amounts (a few MB) of onboard cache. Whenever a series of blocks is requested, the cache is first checked for those blocks. If the blocks are requested often they may already be cached, and no disk read is required. In a striped system, blocks can be read from the various disks and shovelled into the cache very quickly, ready for when the disk is next asked for that data.

But even this caching can be improved and therefore modern controllers often have 1GB RAM or even more local cache on board to buffer these requests - and write requests in particular. This way modern SAN devices can achieve unprecedented I/O-rates.

Even More Disc Magic - RAIDing the Storage!

You may have noticed that when it came to disks we concerned ourselves only with the speed of I/O operations. While this is all well and good, I have a shocking question for you: how much does the risk of data loss increase as the amount of stored data increases?

Let's see! Modern disks have an MTBF (mean time between failures - the time after which the average disk can be expected to fail) of 10 years. (Well, not the cheap disks you get at the local electronics store. I mean data centre quality disks!) That means the average disk is expected to work for around 3500 days. Now suppose you have a data centre with 1000TB of online capacity, and let's suppose that's in 500GB-sized disks, so we have 2000 disks. How often does a disk fail on average? Roughly one every two days (2000 disks divided by 3500 days of life expectancy per disk)!

Have you ever seen a data centre that’s losing data every 2 days? Have you ever seen a data centre using emergency procedures to restore from backups every two days? Well, me neither. This means there must be a method (or several methods) of maintaining redundancy so that the loss of one disk doesn't mean losing the data stored on it.

And, in fact, there are such methods. One of them we have already seen: striping with parity, or RAID5. But RAID5 comes with a performance hit, even if alleviated by RAID-capable controllers and hardware caching. And there is another point: if, in a 6-disk RAID5 set, a second disk breaks, all the information on all the disks is lost. This is why further methods of securing data have been sought and found, for example disk mirroring, or RAID1.

The concept of RAID1 is shockingly simple: two disks (sometimes even three) storing exactly the same information. If one of these disks breaks, the data is still safe as long as one disk keeps functioning. We can even combine this method with the others: striping a number of mirrored sets gives us RAID10, while mirroring two stripe sets gives the closely related RAID01. Further, there is RAID15, a mirrored set of RAID5 arrays. There are a whole lot of other combinations, too.

There are also other methods trading security against performance against disk capacity; every increase in security lessens the total capacity of the RAID set. If you are interested in further RAID variants, like RAID2 (bit-level striping with Hamming-based ECC), RAID3 (byte-level striping with ECC on a separate disk), RAID10E or RAID00, you may want to visit the Wikipedia article about RAID. That article is quite exhaustive and much more detailed than the rather simplified picture given here, where I have offered only a short, easy-to-digest overview.

The point I want to drive home with this discussion of various storage technologies is that storage has three goals which usually work in opposition to each other: speed, capacity and security. Every implementation is a trade-off between these goals. Security, in general, is achieved by writing the same data several times: the more often you write it, the less likely you are to lose it, but the less net capacity is left from the raw capacity of the disks. The better you distribute the read/write operations over various disks, the better the performance, but this offers less in terms of data security, as the failure of a few disks (in the extreme case of pure striping, even one) means the failure of the whole system. Whenever you design a system, make it as fast as necessary and as secure as feasible, while preserving as much capacity as possible.

There's Gold in Them Thar Networks!

You have probably deduced from the title, which also adorns the venerable RFC 1290, that the theme of our next chapter is networks! These beasties give us troubles galore even without performance problems: routers, firewalls, switches, VLANs, etc. Networks are a very complex, interdependent machinery, and so is monitoring and assessing them. We will not do that here. We will assume that the networks we deal with all work; stay tuned for the Most Incomplete Guide to Troubleshooting for more on non-working networks. But even a working network might have performance issues, and we will start tackling them by discussing a little theory before plunging into detail. I will limit myself to the discussion of TCP/IP networks, firstly because they make up the overwhelming majority of the networks you will have to deal with, and secondly because these are the only network stacks implemented comparably enough across platforms for a general discussion to make sense at all.

I will spare you the "classical TCP/IP Ethernet" lecture, too. Any Ethernet today is switched (have you seen a BNC network cable lately? I haven't for the last 10 years.), so talking about congestion, CSMA/CD and similar magic rituals of the Old Days of Higher Network Alchemy will not help us in getting modern network-related problems solved. But, if you are interested and want to develop a deeper understanding of why and how networks came to work the way they do (a most recommended idea) you should seek out W. Richard Stevens' TCP/IP Illustrated (3 volumes). Ours is an incomplete guide, as I am sure you remember by now.

What You Always Wanted to Know About Networks But Were Too Afraid to Ask

Let's get straight to the facts! There is one, and only one, network stack in every machine. A system can have any number of network cards, and it can maintain any number of network connections, but the basic underlying data structure in the kernel - the network stack - is just one and one alone. This means that, unlike with processors or disks, there is nothing you can do to "parallelise" it if it's not working fast enough. If the damn thing is too slow, it's too slow, full stop!

But there is hope. In the overwhelming majority of cases the hardware is fast enough, and the problem is merely one of bad tuning. The speed of the network stack is heavily reliant on buffers, and by adjusting these buffers to a sensible size we can solve most problems.

One of these buffers, for instance, is the "reassembly buffer". While the most often-used layer-4 protocol, TCP, maintains flow control, the underlying layer-3 IP protocol does not. This means that IP datagrams can arrive at a system in no particular order, and the driver has to rearrange them so that TCP is presented with an orderly queue of packets again. This needs some memory where the datagrams can be stored until they are brought back into order. This is the reassembly buffer.

How big this buffer has to be is hard to determine beforehand. It depends, obviously, on the quality of the overall network and on how often packets have to be retransmitted. Left to its own devices the originating host would send the packets in order, of course, and if there were only a single (crossover) cable between sender and receiver, the latter would always receive the packets in the order they were sent. Real networks, however, are more complicated. They have all sorts of parallel routes a packet could take, hardware of different speeds, connections with higher and lower data rates (and hence more or fewer collisions) and so on. Quality of Service (QoS), employed by real-time protocols such as Voice over IP (VoIP), does its part to disrupt the once-established order of IP datagrams and will make them arrive at the receiver in any conceivable order.

We will come back to these buffers when we discuss the actual tuning of a network stack. For now it suffices to say there are several of them which may heavily influence the data throughput of a host.
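Just to give you a taste of where such buffers live: on Linux, for instance, the memory bounds of the IP fragment reassembly buffer are exposed as sysctl parameters (other Unixes keep equivalent knobs elsewhere, e.g. in AIX's no command):

Code:
# show the upper and lower memory bounds for IP fragment reassembly
sysctl net.ipv4.ipfrag_high_thresh net.ipv4.ipfrag_low_thresh

# raise the upper bound (add to /etc/sysctl.conf to make it permanent)
sysctl -w net.ipv4.ipfrag_high_thresh=8388608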

Sometimes, though, the problem lies not even in the host itself. What do you think happens when you enter a line like

Code:
# rlogin -l me my.remotesys.com

If you answered "I log in as user 'me' to host 'my.remotesys.com'", you get a bonus point for stating the obvious! Maybe I can interest you in a career as a politician?

But seriously: the first thing that happens is that our host recognises that 'my.remotesys.com' is not a valid IP address, so it has to resolve it into one. To this end it consults its name resolution configuration. The file holding it varies between systems (/etc/netsvc.conf in AIX, /etc/nsswitch.conf in Solaris and most Linuxes), but its purpose is the same: to tell the resolver whom to ask first for a translation. It then asks that source. There are various sources for name resolution, such as the /etc/hosts file, a DNS server or a NIS server.

I already stressed that networks are complicated, interdependent beasts!

It could well be that the problem with slow responses lies not with the system itself but with the server it uses for name resolution. As long as your host hasn't found out which IP address "my.remotesys.com" has, it can't start any communication with it. If it takes a long time to find the IP address and establish communication, the culprit may well not be your server but the domain name service (DNS).
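The nice thing about this theory is that it is cheap to test: time the name resolution on its own before blaming the host. A sketch, using the example host from above (option syntax for ping varies a little between platforms, and 192.0.2.17 is a hypothetical address):

Code:
# if the lookup alone takes seconds, the problem is name resolution, not your host
time nslookup my.remotesys.com

# compare against a connection to the raw address, which skips resolution entirely
ping -c 1 192.0.2.17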

Tools of the Trade vol. 3: netstat

When it comes to analysing network problems there is one tool you should keep as close to hand as possible - netstat. This is truly the Swiss army knife in your network toolbox. Like a Swiss army knife, it has a whole load of functions, fits in even the smallest pocket and nobody knows even half of what you can do with it! That includes me, which is why I only write incomplete guides!

We will start with a simple overview of the buffers I told you about before, and their statistics: use netstat -m to see them. Because collecting all of the network statistics taxes the system, some statistics are turned off by default, and you may have to turn them on at some point. In AIX, for instance, there is the "extendednetstats" setting in no (short for network options, a program used to manipulate network tuning options). It's wise not to interfere with these settings on a production system unless absolutely necessary!

The output of netstat -m differs from platform to platform, and even between versions on the same platform, so the picture on your system might look a little different. Here is the output from an older, STREAMS-based Unix system:
Code:
$ netstat -m

streams allocation:

                     config  alloc  free  total  max  fail
streams                 292     79   213    233   80     0
queues                 1424    362  1062    516  368     0
mblks                  5067    196  4871   3957  206     0
dblks                  4054    196  3858   3957  206     0
class 0,    4 bytes     652     50   602    489   53     0
class 1,   16 bytes     652      2   650    408    4     0
class 2,   64 bytes     768      6   762   2720   14     0
class 3,  128 bytes     872    105   767    226  107     0
class 4,  256 bytes     548     21   527     36   22     0
class 5,  512 bytes     324     12   312     32   13     0
class 6, 1024 bytes     107      0   107      1    1     0
class 7, 2048 bytes      90      0    90      7    1     0
class 8, 4096 bytes      41      0    41     38    1     0
total configured streams memory: 1166.73KB
streams memory in use: 44.78KB
maximum streams memory used: 58.57KB

The most important column is fail. Ideally it should always be zero. If it isn't, the respective resource is overtaxed, and its buffer size (the memory blocks assigned to that buffer) should be increased. How this is done also depends on the system, and on most platforms it needs a reboot, so investigate carefully and plan for some downtime.

The next interesting statistic comes from netstat -a, which displays information about all connections (-a for all). Did you know that roughly 90% of the traffic in your network is TCP traffic? Because this is so, we will ignore the other protocols for the moment and limit the output to TCP connections with the -t option:
Code:
$ netstat -ta

Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
ip         0      0 *.*           *.*
tcp        0   2124 tpci.login    merlin.1034     ESTABL.
tcp        0      0 tpci.1034     prudie.login    ESTABL.
tcp    11212      0 tpci.1035     treijs.1036     ESTABL.
tcp        0      0 tpci.1021     reboc.1024      TIME_WAIT
tcp        0      0 *.1028        *.*             LISTEN
tcp        0      0 *.*           *.*             CLOSED
tcp        0      0 *.6000        *.*             LISTEN
tcp        0      0 *.listen      *.*             LISTEN
tcp        0      0 *.1024        *.*             LISTEN
tcp        0      0 *.sunrpc      *.*             LISTEN
tcp        0      0 *.smtp        *.*             LISTEN
tcp        0      0 *.time        *.*             LISTEN
tcp        0      0 *.echo        *.*             LISTEN
tcp        0      0 *.finger      *.*             LISTEN
tcp        0      0 *.exec        *.*             LISTEN
tcp        0      0 *.telnet      *.*             LISTEN
tcp        0      0 *.ftp         *.*             LISTEN
tcp        0      0 *.*           *.*             CLOSED

Every line shows one active (or possibly active) connection. The first column shows the protocol, the next two columns the current content of the send- and receive-buffers.

Now look at the last column. LISTEN means there is no connection established, but some daemon in your system is waiting for incoming requests. When you connect, say via telnet, to a system, you can do so because behind the telnet port (port 23) the telnetd daemon waits in this LISTEN state. Once your client connects, this daemon picks it up and establishes a new connection, which will carry the status ESTABLISHED. These are active connections. Once a connection is about to be hung up, its status changes to TIME_WAIT and, after a certain time, to CLOSED, before the entry is removed from the list.

If your system is dropping connections it might be that many short-lived connections are overtaxing the available ports. In such a case it might help to reduce the time until a hung-up connection is moved from TIME_WAIT to CLOSED (see the sketch below). Note that a bigger network connection with more bandwidth wouldn't help at all - as always in performance tuning, bigger isn't always better. I actually had such a problem once, and the first reaction from management was to replace a 100-Mbit interface with an (at the time shamelessly priced) Gigabit card, which didn't help at all, save for boosting the wasted-money counter! A careful study revealed the real problem and left me with a spare Gigabit NIC to play with. Careful work always yields rewards!
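Where that TIME_WAIT interval is tuned is, as always, system-specific. A sketch of where to look on two platforms I know of (verify the parameter names against your own documentation before touching anything):

Code:
# Solaris: the TIME_WAIT interval, in milliseconds
ndd -get /dev/tcp tcp_time_wait_interval
ndd -set /dev/tcp tcp_time_wait_interval 30000

# AIX: the same idea, via the network options command
no -o tcp_timewait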

The remaining two columns show the end points (host and port) of the connections, where possible (or "*" where no endpoint is set). For established connections this shows which hosts are connected, and helps you to identify "hotspots" among the network connections.
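A quick way to make such hotspots visible is to tally established connections per remote endpoint (field numbers can shift a little between platforms):

Code:
# count established connections per foreign address, busiest first
netstat -an | awk '/ESTABLISHED/ { print $5 }' | sort | uniq -c | sort -rn | head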

Especially on systems with several interfaces (which means most bigger production servers) you need to know not only how the network stack behaves overall, but also how much every single interface is used, and whether any bottlenecks are already in place or building up. For this you need the final word in interface statistics: netstat -i. Here is a sample output from a Linux system:

Code:
$ netstat -i

Kernel Interface table

Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flags
lo   2000   0   231      0      0      0   231      0      0      0 BLRU
eth0 1500   0  1230      2      9     12  1421      3      2      1 BRU

You see a system with one physical network interface (eth0) and the loopback driver (lo). Notice the non-zero values in RX-ERR (receive errors) and TX-ERR (transmit errors). On a busy interface a few errors over time are normal, so these values are rarely exactly zero. Should they grow too high, however, and should the interface showing them give you trouble, this hints at serious connection problems.

Possible reasons include all sorts of physical connection problems, for example intermittently failing cables or connectors, or misconfigured switch ports. For instance, for a long time the auto-negotiation of network interface hardware in IBM systems had problems: if you connected a 10/100 card to a switch port in auto-negotiation mode, and the NIC was set to auto-negotiation too, you had a good chance of ending up with the switch port at 100-MBit half-duplex and the NIC at 100-MBit full-duplex. The effective "throughput" of such a connection is somewhere around 1 KB/s! This wouldn't show up if you issued a ping, only if you transferred real data.
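How you check the negotiated speed and duplex mode depends, again, on the system. Two hedged examples, assuming the usual tool and interface names:

Code:
# Linux: show negotiated speed, duplex and auto-negotiation status
ethtool eth0

# AIX: entstat reports the running media speed of an adapter
entstat -d ent0 | grep -i "media speed"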

The last column, labeled Flags, shows the same flag information as ifconfig. The following table might come in handy:

B  Broadcast address has been set
L  Loopback driver
M  Promiscuous mode
N  Trailers are avoided
O  ARP turned off
P  Point-to-point connection
R  Interface is running
U  Interface is up
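You can cross-check these flags against ifconfig, which spells most of them out:

Code:
# the U, B and R flags from netstat -i appear as UP, BROADCAST
# and RUNNING in the flags line of ifconfig's output
ifconfig eth0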
Let me stress for the last time that this is a most incomplete guide! There are many more informative and interesting options in netstat and, as always, reading the man pages thoroughly is a rewarding activity - not least because the exact meaning of some option letters varies between systems. Here are some other useful options:

netstat -p   port statistics
netstat -r   routing table information
netstat -s   protocol statistics
netstat -o   information about timer states, expiration times and backoff times
netstat -x   information about sockets
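To pick just one of them: netstat -s is a quick way to check for TCP retransmissions, which often point to network trouble long before users start complaining. The exact wording of the counters differs between systems, so a case-insensitive search is a reasonable sketch:

Code:
netstat -s | grep -i retrans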
Conclusion

We have reached the end of my collection of boring old stories, so I will do the sensible thing and end this article as well. I hope I was able to tell you something you hadn't already heard many times before, and I do hope I have entertained you in the process of informing you.

So, what can I say? Well, maybe a last thought! Always remember that, as a SysAdmin, you do not operate in a vacuum. You are part of a complex environment which includes network admins, DBAs, storage admins and so on. Most of what they do affects what you do. Have a lousy SAN layout? Your I/O performance will suffer. Have a lousy network setup? Your faster-than-light machine may look like a slug to the users. There is much to be gained if you provide these people with the best information you can glean from your system, because the better the service you offer to them, the better the service you can expect back from them! The network guy will love you if you not only tell him a hostname but also give him a port, some connection data, interface statistics and your theory about the possible reasons for a network problem. The storage admin will adore you if you turn out to be a partner in getting the best storage layout possible, instead of being only a demanding customer.

Unix is all about small specialised entities working together in an orchestrated effort to get something done. The key point is that while the utility itself might be small and specialised, its interface is usually very powerful and generalised. What works for Unix utilities also works for people working together: improve your "interface" by providing better and more meaningful data, and you will see that others will be better able to pool their efforts with yours towards a common goal.


Acknowledgements

First and foremost, I'd like to thank Neo for his ongoing dedication in keeping this board alive. I have learned a thing or two about Unix from this board. In fact much of what I know today I learned here. Speaking of learning, many thanks to shockneck, a most trusted friend in real life, for bearing with me and my constant questions about the inner workings of AIX.

The most special thanks go to Scott, who not only does a fantastic job in making this board better and better, but on top of that acted as my editor. He had to wade through my scribblings and proof-read everything I have written here. The reason you are able to decipher all of this is his effort to translate what I wrote into real English.

Finally I'd like to thank all the contributors of the forum for providing me with enlightening insights on one hand, and a constant stream of interesting questions on the other. Most of the threads which directly (and, more often, indirectly) influenced this article are not listed, simply because I am a lot less fastidious when writing incomplete guides than when monitoring my own systems. After a while it was often no longer possible to trace back the thread that started me thinking about something more thoroughly. I am sorry.

bakunin


_______________
Threads I have shamelessly copied from:

CPU Utilization
fr and sr (from vmstat output) values are very high
https://www.unix.com/emergency-unix-l...m-working.html

Related links, Sources:
Those interested in the performance aspects of IP networks might want to read: Network Performance Tuning
The iostat figure is from Solaris iostat

# 2  
Old 09-15-2013
Just a small comment: beware that, depending on the operating system used, there are (rare) cases where monitoring commands display slightly different metrics under the same label, which might lead to incorrect interpretation.

In particular, what you wrote about the pi/po columns does not apply to Solaris vmstat, where the paging activity also accounts for disk reads and writes due to normal virtual-memory I/O, such as accesses to regular mmapped files.

On Solaris, what AIX reports as pi/po is shown under the api/apo (anonymous pages in/out) columns when the '-p' option is passed to vmstat.
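For example, to watch these columns one might run the following (a sketch; see the Solaris vmstat man page for the full column list):

Code:
# print anonymous/executable/file-system paging columns every 5 seconds;
# sustained non-zero apo values indicate real memory pressure
vmstat -p 5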