A system has a certain amount of memory installed. We call this "real memory", because that is exactly what it is: real, tangible memory chips. In an AIX system you can find out how much memory is installed (for LPARs this is equivalent to "assigned") with the command
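As far as i remember it goes like this (take it as a sketch - the exact attribute names may vary with the AIX level, so check on your system):

```
# show the amount of installed real memory (value is in KB):
lsattr -El sys0 -a realmem

# alternatively, in MB:
prtconf -m
```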
Now, memory was a scarce resource until quite recently, and when UNIX was developed in the late sixties a system was even expected to not have as much memory as it would need to run all its programs. This is why a contingency plan was built into the kernel: a part of the disk storage was set aside and the kernel would - if in a pinch - move memory pages which aren't used at that moment to this disk area and remove them from real memory, this way freeing some of it. This disk area was called "swap space" or "paging space", and the activity of moving memory pages to it, or back to memory from it, is called "paging" or "swapping".
Two things are interesting here: first, disk is slower than memory by several orders of magnitude. This is why swapping is to be avoided. Today, as memory is easily available, it is tolerable only as a very rare exception, not as part of the normal operation of a system. Second: what slows the system down is not memory pages sitting in the swap but only the fact that they are put there or fetched from it. When we assess the performance of a system we don't care how much swap is used but only how many pages are being transferred to/from it.
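This is why you watch the paging columns of vmstat instead of the swap usage. A sketch (column names as on AIX; the interval and count arguments are just examples):

```
# sample system statistics every 2 seconds, 5 times.
# "pi" = pages paged in from paging space, "po" = pages paged out.
# sustained non-zero values in pi/po mean the system really is paging.
vmstat 2 5
```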
When we talk about memory we usually talk about "virtual memory". This is the sum of the real memory plus the swap area because to processes the kernel presents this as one contiguous mass. A process itself doesn't notice when it is moved around in this virtual memory, be it that it is transferred from one memory location in real memory to another ("garbage collection", think of it like disk defragmentation for memory) or that it is transferred to someplace in the swap memory ("swapped out") or back from it respectively ("swapped in").
To end my explanation about swap: there are two possible policies for allocating swap, "early swap allocation" and "late swap allocation". AIX, up to v5L (or 5.1? i can't remember exactly), used early swap allocation by default. This means that for every starting process the swap space it might need - if it is indeed swapped out at some point - is immediately allocated in the swap. This makes sure that, regardless of what catastrophe happens, a started process will always have enough space to finish, no matter what. The system runs more stably, but on the downside a lot of swap allocation and deallocation is done without the swap ever being used, which costs not very much but still some time. The other policy, used by AIX since 5.1 or 5.2, is late swap allocation, where the swap is only allocated the moment it is needed. It might happen that not enough swap is available at this precise moment, but on the other hand, the more rarely swap is needed at all, the less unnecessary allocation/deallocation is done. It is possible to configure AIX to use either of these policies, but that is rarely done, and from the look of your outputs you use late swap allocation anyway.
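For completeness, this is how you look at the paging spaces themselves, and - if i remember the mechanism correctly - how a single process can still request the early policy via the PSALLOC environment variable (the program name below is of course only a placeholder):

```
# list all paging spaces, their sizes and the percentage used:
lsps -a

# run one (hypothetical) program with early swap allocation,
# regardless of the system-wide policy:
PSALLOC=early ./critical_program
```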
Now, how does the UNIX kernel handle memory? First, it "knows" that real memory is faster, so it will use only this as long as possible. To application processes there might be no difference between real and virtual memory, but for the kernel there is. Second, it also knows that a lot of operations (or rather their speed) profit from buffering in memory: notably disk operations and network operations can gain a lot in throughput by being cached in memory. This is the reason why every UNIX kernel dedicates memory which is not used by running processes right now to buffering these operations. There are many such buffers, but the disk buffers (aka the "file cache") are by far the biggest and most important ones.
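You can watch how much of real memory currently goes to the file cache with svmon (the exact columns differ between AIX versions, so treat this as a sketch):

```
# global memory usage overview; values are in 4k pages.
# the "pers" and "clnt" page counts are, roughly, the file cache.
svmon -G
```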
When a UNIX kernel needs memory it has already given to the file cache (because it was not needed before), it shrinks these buffers dynamically and makes the necessary amount of memory available to the process; once this process exits the memory is given back and the buffers regrow. (Well - this is the theory. Once you are jaded and cynical enough from your mounting experience i will tell you about real-world memory sinks, shoddy programming techniques and more. But this is kindergarten and kids need fairy tales with an uplifting happy end. ;-))
Up to now, when we talked about processes, we have only considered the standard case: a process starts, requests and is given some memory, does its work and gives the memory back on exit. The world is more complicated, though, and UNIX kernels address that: many processes work simultaneously in a system and some of them even work together. To do this effectively they might need a common memory area where one process can put data and one (or several) others can take it or even change it. This is called "shared memory" and is part (together with "semaphores" and "queues") of inter-process communication. It makes sense that the command that shows you these IPC facilities is called inter-process communication services, or ipcs for short. What is interesting for you are the allocated shared memory segments, because they tend to stay around if a process that requested them terminates abnormally for some reason. Since the kernel doesn't know whether such a memory segment is still needed or not (just because the requesting process exited doesn't mean no other process still needs it), it won't reclaim it automatically. I have found some systems heavily swapping just because a database process was killed quite unceremoniously by a novice systems admin with kill -9. Experienced admins use (if they have to!) kill -15, because this way a process is given the opportunity to clean up/give back/free whatever it has opened/requested/locked. kill -9 is only the very last resort for extremely hung/badly programmed processes - but that only as an aside.
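This is how you inspect and - carefully! - clean up such leftover segments. The segment id below is of course only an example; take the real one from the ipcs output and make sure nothing uses the segment any more before removing it:

```
# list all shared memory segments with owner, size and attach count:
ipcs -m

# remove a leftover segment by its id:
ipcrm -m 1234567
```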
Notice that shared memory cannot be swapped out and has to remain in real memory. This means that left-over shared memory segments can effectively diminish your real memory and thereby force other processes to swap even when, in fact, enough memory would otherwise be free. Especially crashed database programs are prone to this problem, because they make heavy use of the shared memory feature. The "SGA" in Oracle, for instance, is mostly shared memory.
In AIX you can control with several tuning parameters how exactly the kernel manages the aforementioned buffers. I will tell you about it, but a warning up front: this is really advanced stuff, and to tune a system you need a working understanding of how the kernel works. I can't give you that here, only a glimpse. There is an IBM course, "kernel internals", which i suggest you attend if you are interested.
The main command to view and to set virtual memory tuning options is vmo and you see all tuning parameters and their current settings with
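Which would be (on every AIX version i have seen, at least; minperm% below is just an example parameter):

```
# display all VMM tuning parameters with their current values:
vmo -a

# display a single parameter only:
vmo -o minperm%
```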
Many of these settings are defaults. Whatever is not default is written in the files /etc/tunables/lastboot and /etc/tunables/nextboot. "lastboot" shows the values the system has started with, "nextboot" are the values the system will use for the next boot. At boot time nextboot is copied over lastboot. You can set every option with
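Like this - with minperm% and the value 5 serving only as examples:

```
# change a tunable for the running system (not persistent across boots):
vmo -o minperm%=5
```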
and you can make that change permanent (that is: it is entered into the "nextboot" file; if you know what you are doing, you can use a text editor to edit this file instead) by
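For example (again, parameter and value are only placeholders):

```
# apply the change now AND write it to /etc/tunables/nextboot:
vmo -p -o minperm%=5

# some (restricted) tunables only take effect at the next boot;
# these are set with -r instead of -p:
vmo -r -o minperm%=5
```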
(By the way: the other tuning utilities ioo (I/O parameters), no (network) and schedo (scheduler) work analogously and take the same set of command options.)
I won't explain every parameter here, only the two most important ones: minperm and maxperm. But this will have to wait until the evening, since i've got some work to do. Sorry, but this post is getting longer than i expected. Stay tuned for part 2 later today.
I hope this helps.
Last edited by bakunin; 10-29-2018 at 09:05 AM..