AIX memory usage by processes

# 22  
Originally Posted by dim
Thanks Bacunin (a phrase I never expected to say :P),
It is "bakunin", like the Russian anarchist, but anyway, you are welcome. ;-)

Originally Posted by dim
I have looked too many threads on this forum, and other forums as well.. but i just don't get it..
OK. I suggest reading the little introduction i once wrote, but i will try to give a short explanation of how memory is handled in a UNIX machine. Bear with me when i reiterate things you already know and/or touch on things not directly connected with your situation. The matter is complicated and before answering your questions i'll try to lay the groundwork so that you can understand the answer. For that i have to start at a more fundamental level.

A system has a certain amount of memory installed. We call this "real memory", because that is in fact what it is: real, tangible memory chips. In an AIX system you can find out how much memory is installed (for LPARs this is equivalent to "assigned") with the command

lsattr -El mem0

Now, memory was a scarce resource until quite recently and when UNIX was developed in the late sixties a system was even expected not to have as much memory as it would need to run all the programs. This is why a contingency plan was built into the kernel: a part of the disk storage was set aside and the kernel would - if in a pinch - move memory pages which aren't used at that moment to this disk part and delete them from the real memory, this way freeing some. This disk part was called "swap space" or "paging space" and the activity of moving memory pages to it or back to memory from it is called "paging" or "swapping".

Two things are interesting here: first, disk is slower than memory by several orders of magnitude. This is why swapping is to be avoided. Today, as memory is easily available, it is tolerable only as a very rare exception but not as part of the normal operations of a system. Second: what slows the system down is not memory pages being in the swap but only the fact that they are put there or fetched from it. When we assess the performance of a system we don't care how much swap is used but only how many pages are being transferred to/from it.
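As a rough sketch of "watch the transfers, not the amount of swap in use": the function below sums the page-in and page-out columns of interval output from vmstat. It assumes the usual AIX layout in which "pi" and "po" are the 6th and 7th columns of each data line; the function name is made up for this example, so check the column header on your own system first.

```shell
# paging_activity: read "vmstat <interval> <count>" style output on stdin
# and print the summed page-ins (pi) and page-outs (po) to/from paging space.
# Assumption: pi is column 6 and po is column 7, as in the AIX header
#   r b avm fre re pi po fr sr cy in sy cs us sy id wa
paging_activity() {
    awk '
        # Data lines start with a number; header and separator lines do not.
        $1 ~ /^[0-9]+$/ && NF >= 7 { pi += $6; po += $7 }
        END { printf "pi=%d po=%d\n", pi, po }
    '
}
```

On a live system this would be fed with something like `vmstat 5 12 | paging_activity`. Values that stay at zero mean nothing is being paged, no matter how full the paging space looks.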

When we talk about memory we usually talk about "virtual memory". This is the sum of the real memory plus the swap area because to processes the kernel presents this as one contiguous mass. A process itself doesn't notice when it is moved around in this virtual memory, be it that it is transferred from one memory location in real memory to another ("garbage collection", think of it like disk defragmentation for memory) or that it is transferred to someplace in the swap memory ("swapped out") or back from it respectively ("swapped in").

To end my explanation about swap, there are two possible policies for allocating swap: "early swap allocation" and "late swap allocation". AIX, up to v5L (or 5.1? i can't remember exactly) used early swap allocation per default. This means that for every starting process the swap space it would need if it is ever swapped out is immediately allocated in the swap. This makes sure that, regardless of what catastrophe happens, a started process will always have enough space to finish, no matter what. The system runs more stable but on the downside a lot of swap allocation and deallocation is done without the swap ever being used, which costs not very much but some time. The other policy, used by AIX since 5.1 or 5.2, is late swap allocation where the swap is only allocated at the moment when it is needed. It might happen that not enough swap is available at that precise moment, but on the other hand the more rarely swap is needed at all the less unnecessary allocation/deallocation is done. It is possible to configure AIX to use either of these policies but that is rarely done and from the look of your outputs you use late swap allocation anyway.

Now, how does the UNIX kernel handle memory? First, it "knows" that real memory is faster, so it will use only this as long as possible. To application processes there might be no difference between real and virtual memory but for the kernel there is. Second, it also knows that a lot of operations (resp. their speed) profit from buffering in memory: namely disk operations and network operations can gain a lot in throughput by being cached with memory. This is the reason why every UNIX kernel dedicates memory which is not used by running processes right now to buffering these operations. There are many of these buffers but the disk buffers (aka "file cache") are by far the biggest and the most important ones.

When a process needs memory that the kernel has already given to the file cache (because it was not needed before), the kernel shrinks this buffer dynamically and makes the necessary amount of memory available to the process. Once this process exits the memory is given back and the buffers are regrown. (Well - this is the theory. Once you are jaded and cynical enough from your mounting experience i will tell you about real-world memory sinks, shoddy programming technique and more. But this is kindergarten and kids need fairy tales with an uplifting happy end. ;-))

Up to now when we talked about processes we have only considered the standard process: a process starts, requests and is given some memory, does its work and gives the memory back on exit. The world is more complicated, though, and UNIX kernels address that: many processes work simultaneously in a system and some of them even work together. To do this effectively they might need a common memory where one process can put data and one (or several) others take it or even change it. This is called "shared memory" and is part (together with "semaphores" and "queues") of the inter-process communication. It makes sense that a command that shows you these IPC facilities is called inter-process communication services or ipcs for short. What is interesting for you are the allocated shared memory segments because they tend to stay if a process that requested them terminates abnormally for some reason. Since the kernel doesn't know if the memory segment is still needed or not (just because the requesting process exited doesn't mean another process might not need it any more) it won't reclaim them automatically. I have found some systems heavily swapping just because a database process was killed quite unceremoniously by a novice systems admin with kill -9. Experienced admins use (if they have to!) kill -15 because this way a process will be given the opportunity to clean up/give back/free whatever it has opened/requested/locked. kill -9 is only the very last resort for extremely hung/badly programmed processes, but that only as an aside.
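The difference between the two signals can be tried out safely with a small sketch. Everything here is hypothetical: a marker file stands in for a resource such as a shared memory segment, and the function name is made up. The worker installs a handler for signal 15 (TERM) that removes the marker; signal 9 (KILL) cannot be caught, so the marker is left behind.

```shell
# run_worker_and_signal <marker-file> <signal-number>
# Starts a background worker that "owns" the marker file, sends it the
# given signal and reports whether the worker cleaned up after itself.
run_worker_and_signal() {
    marker=$1    # stand-in for a resource the process has to give back
    signal=$2    # 15 (TERM) or 9 (KILL)
    (
        # Cleanup handler: runs on SIGTERM. SIGKILL can not be trapped.
        trap 'rm -f "$marker"; exit 0' TERM
        : > "$marker"
        while :; do sleep 1; done
    ) &
    worker=$!
    sleep 1                                # give the worker time to start
    kill "-$signal" "$worker"
    wait "$worker" 2>/dev/null || true     # returns once the worker is gone
    if [ -e "$marker" ]; then
        echo "left behind"
    else
        echo "cleaned up"
    fi
}
```

Calling it with 15 should report "cleaned up", calling it with 9 should report "left behind". A real database process has far more to release than one file, which is why the order "first -15, then -9 only if nothing else helps" matters.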

Notice that shared memory cannot be swapped out and has to remain in real memory. This means that left over shared memory segments can effectively diminish your real memory and by that force other processes to swap even when in fact enough memory would be free. Especially crashed database programs are prone to this problem because they make heavy use of the shared memory feature. The "SGA" in Oracle is mostly shared memory, for instance.

In AIX you can control with several tuning parameters how the kernel exactly manages aforementioned buffers. I will tell you about it but a warning up front: this is really advanced stuff and to tune a system you need a working understanding of how the kernel works. I can't give you that here, only a glimpse. There is an IBM course, "kernel internals" which i suggest you attend if you are interested.

The main command to view and to set virtual memory tuning options is vmo and you see all tuning parameters and their current settings with

vmo -a

Many of these settings are defaults. Whatever is not default is written in the files /etc/tunables/lastboot and /etc/tunables/nextboot. "lastboot" shows the values the system has started with, "nextboot" are the values the system will use for the next boot. At boot time nextboot is copied over lastboot. You can set every option with

vmo -o <option>=<value>

and you can make that change permanent (that is: it is entered into the "nextboot" file; if you know what you are doing you can use a text editor to edit this file instead) by

vmo -p -o <option>=<value>

(By the way: the other tuning utilities ioo (I/O parameters), no (network) and schedo (scheduler) work analogously and accept the same command-line options.)

I won't explain every parameter here but only the two most important ones: minperm and maxperm. But this will have to wait until evening, since i've got some work to do. Sorry, but this post is getting longer than i expected. Stay tuned for part 2 later today.

Originally Posted by dim
My next shot is to reboot the server which confuses me and check its reaction...
As far as i can see there is no necessity for this, so don't. If it doesn't matter that the server is down for a few minutes you might do it anyways (it will not hurt either) but what you have posted till now doesn't justify it.

I hope this helps.


Last edited by bakunin; 10-29-2018 at 09:05 AM..
# 23  
Thanks Bakunin for your help!
I will study and i will let you know...
Concerning your nickname.. i know who he is.. But i support the German bearded man (Karl)!
# 24  
As promised, here is the second part of the answer:

In the following, unless explicitly stated otherwise, the unit we measure in is the memory page, which is 4K (4096 bytes) in the overwhelming majority of cases. There is the option of using "large pages" (16MB) but i have yet to see these used in reality.

I said above that memory is used for programs and what isn't used for programs is used for buffers. IBM calls that computational memory (the part which is given to running processes) and file memory (the part used for file caching). It is obvious that with the constant loading and unloading of processes the memory is given back and forth as the current amount of used memory changes. Also notice that there is "pinned memory". That is memory which is allocated in a way that it can't ever be swapped out. I gave you already one example - shared memory - but there are other types of pinned memory too.

So, at a given moment, save for this pinned memory, we have a memory that is to some part computational memory and (nearly) all the rest is given to file memory. The not pinned memory is constantly scanned for possible candidates to either swap out or to add to either file or computational memory. There is a special process for this, the "lrud" (least recently used daemon). Pages which could be used for file or computational memory (basically the memory that could be swapped) are called "lruable" (notice that all these words are IBM speak, you will find these terms in the output of various programs).

Now suppose we have a system that is running at capacity limit (all other cases are rather trivial): the kernel now has to decide at any given point, what to do: either shrink the file memory or swap some programs out. At first, as ever more memory is demanded, it makes sense to shrink the file cache. At a certain point read/write operations would suffer so much from a further reduction of this cache that it might be overall better to swap something out and back in later. The procedure is called "page stealing" because memory pages which are in use are "stolen" and put to another use. What the kernel does under which circumstance is controlled by the tuning parameters minperm and maxperm. Both these values are percentages.

When the percentage of memory holding file pages falls below minperm, needed memory will come from both file and computational memory. Processes might get swapped out (stealing from computational memory) to avoid reducing the file memory any more. Again, which specific pages are stolen will be determined by the lrud.

When the percentage of memory holding file pages is between minperm and maxperm only file pages get stolen. Processes that are starting get their memory by the kernel diminishing the file memory and giving the so freed memory to them.

This way you can control the decisions a kernel will take under certain circumstances.
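The two rules above can be written down as a small decision function. This is only a sketch of the description in this post, not the kernel's actual algorithm, and the function name is invented. The third branch (numperm above maxperm) is a case the text above does not cover, so it is only reported as such.

```shell
# steal_source <numperm%> <minperm%> <maxperm%>
# Prints which kind of pages would be stolen according to the two rules
# described in the text. All three arguments are percentages.
steal_source() {
    awk -v n="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        # "+ 0" forces a numeric comparison of the arguments
        if (n + 0 < lo + 0)
            print "file and computational pages"
        else if (n + 0 <= hi + 0)
            print "file pages only"
        else
            print "above maxperm (case not described above)"
    }'
}
```

With values such as numperm 5.6, minperm 3.0 and maxperm 90.0, `steal_source 5.6 3.0 90.0` prints "file pages only".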

You haven't posted any vmstat-output yet, so i take one from the IBM site:

# vmstat -v
              1048576 memory pages
               936784 lruable pages
               683159 free pages
                    1 memory pools
               267588 pinned pages
                 90.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                  5.6 numperm percentage
                52533 file pages
                  0.0 compressed percentage
                    0 compressed pages
                  5.6 numclient percentage
                 90.0 maxclient percentage
                52533 client pages
                    0 remote pageouts scheduled
                    0 pending disk I/Os blocked with no pbuf
                    0 paging space I/Os blocked with no psbuf
                 2228 filesystem I/Os blocked with no fsbuf
                   31 client filesystem I/Os blocked with no fsbuf
                    0 external pager filesystem I/Os blocked with no fsbuf
                 29.8 percentage of memory used for computational pages

First things first:
              1048576 memory pages
               936784 lruable pages

We have ~1Mio memory pages, which is 4096*1Mio=4G of real memory. Of these about 936k pages are eligible to be scanned by the lru-daemon.
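The page arithmetic is easy to get wrong by a factor of 1024, so here is a minimal helper, assuming the standard 4096-byte page (the function name is made up). It rounds down to whole megabytes.

```shell
# pages_to_mib <number-of-4K-pages>
# Converts a page count to megabytes (1 MiB = 1048576 bytes), rounded down.
pages_to_mib() {
    echo $(( $1 * 4096 / 1048576 ))
}
```

`pages_to_mib 1048576` gives 4096 (the 4G of real memory), `pages_to_mib 936784` gives 3659 for the lruable pages.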

                 3.0 minperm percentage
                 90.0 maxperm percentage
                  5.6 numperm percentage
                52533 file pages

maxperm is at 90%, minperm is at 3%. (This is a setting typical for databases, which do not need the file cache for their operations and in fact bypass it.) Right now 52533 pages (~ 200MB) are used for the file cache, which is ~5.6% ("numperm") of the memory. Also 29.8% of the memory is used by programs ("percentage of memory used for computational pages"). I suppose this machine was started only recently because otherwise the file cache would be bigger. Right now the system doesn't know what to put into cache that makes sense.
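To pick single values out of such output in a script, a sketch like the following can help. It assumes the "number, then label" line format shown in the sample output above; the function name is hypothetical.

```shell
# vmstat_v_value <label>
# Reads "vmstat -v" style lines ("<number> <label text>") on stdin and
# prints the number belonging to the first line whose label matches exactly.
vmstat_v_value() {
    awk -v label="$1" '
        {
            value = $1
            $1 = ""                 # drop the number, keep the label text
            sub(/^[ \t]+/, "")      # trim the blanks left in front
            if ($0 == label && !found) { print value; found = 1 }
        }
    '
}
```

On a live system the call would look like `vmstat -v | vmstat_v_value "numperm percentage"`, which for the sample above would print 5.6.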

Another value that is worth checking is seen when issuing vmstat -vs: "Revolutions of the clock hand". This value shows the number of times the lrud has scanned the whole memory completely for potentially stealable/swappable page candidates. The value itself increases over time and is not interesting as such but interesting is the growth rate of the value. The faster it grows the faster the memory is scanned - to put it in dramatic terms: the more frantic the system is searching for freeable memory. If the system searches frantically there must be some pinch it is in, yes? This way you can anticipate the development of a memory shortage even before the system starts really swapping. Of course the numperm and number of computational pages add to this picture but it is always a good idea to take pictures from different sides and look if all adds up from every side.
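Since only the growth rate of this counter is of interest, two readings taken some time apart are enough. The sketch below turns them into revolutions per hour; the function name and the numbers in the example are made up.

```shell
# revolutions_per_hour <first-reading> <second-reading> <seconds-between>
# Growth rate of the "revolutions of the clock hand" counter, scaled to
# one hour and rounded down.
revolutions_per_hour() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}
```

If the counter read 120 and ten minutes (600 seconds) later 126, `revolutions_per_hour 120 126 600` gives 36. What counts as "too fast" depends on the machine, so compare against readings from a time when the system was known to be healthy.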

OK, here i end. There is a lot more to say about each of these things but, again, if you are really interested, you should take a class or two and you might also search our forum for the terms you now should be familiar with. Several people have written valuable info about these things over time and you can learn from others perhaps more than from me. If you still have specific questions you can also ask, we are glad to help - just don't expect every answer to be as long as this one.

You might want to read my performance tuning introduction for more general information about the concepts and tools involved in performance monitoring and tuning.

I hope this helps.


Last edited by bakunin; 10-30-2018 at 02:52 AM..
# 25  
2 weeks later... i am sorry for my late response, but other tasks popped up...
Finally i studied your INCOMPLETE guide (:P) and your response above as well.
To be honest, i didn't really follow your answer, but i learned things from your guide.
I think this is pretty much how i had pictured the architecture of AIX.
I said pretty much...

However, i do not think that this clears up my problem....