Coredumps and swap - was part of Solaris Mem Consumption


 
# 22  
Old 08-10-2008
Again, I'm sorry to insist, but you are wrong.

You are now confusing the kernel dump operation and the savecore one.

The kernel dump operation is done right after a panic on a raw device. This raw device is either the swap area or a dedicated partition.
You might have deduced it from the documentation, given that slice 2 is used as the example and slice 2 is never used to lay out a filesystem.

The savecore operation is done when the OS reboots and indeed creates files on a filesystem.

The latter operation doesn't happen if the former one didn't, so I still assert that either a swap area or a dedicated partition is required for the kernel crash dump to succeed, not a filesystem.

One of the reasons an option was added to dumpadm to allow not using the swap space was precisely that swap space was often not allocated large enough for the crash dump to succeed.
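
For instance (the slice name below is just an illustration, not taken from this thread), a dedicated dump slice is configured with something like:
Code:
# point the crash dump at a dedicated raw slice instead of swap
# (the slice name is an example only)
example# dumpadm -d /dev/dsk/c0t1d0s1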
# 23  
Old 08-10-2008
Sorry, Neo, but I must concur with jlliagre. Look again at the example:
Code:
example# dumpadm -d /dev/dsk/c0t2d0s2

Dump content: kernel pages
Dump device: /dev/dsk/c0t2d0s2 (dedicated)
Savecore directory: /var/crash/saturn
Savecore enabled: yes

There are two different items being mentioned: a dump device and a savecore directory.

The dump device is where the kernel will write the crash dump when it encounters a panic condition. This is always a raw area on disk. Remember that a panic means that the kernel has malfunctioned. No one would want a malfunctioning kernel to attempt to write a very large file into a file system. The very reason for a panic is to stop I/O to filesystems before they are more badly damaged.

The savecore program runs during a reboot (but after the fscks, so we are sure that the filesystem can be used). Usually the dump device is not dedicated but is a swap area, so we want to copy the crash dump out before swap is needed. savecore reads from the dump area and writes to the file system. We clearly need to save an image of the damaged kernel before we reboot; after a reboot, the broken kernel is gone. Note that everything quoted mentions that savecore operates after a reboot. And the savecore description says "The savecore utility saves a crash dump of the kernel (assuming that one was made)". Note that "saves" is present tense while "was made" is past tense. If a crash dump was not made, savecore can do nothing. We have already rebooted, so a trick like reading from /dev/mem is not going to work... we would only get a copy of the new kernel.
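
As a rough sketch of what that leaves behind after the reboot (the directory comes from the dumpadm output above; the file names are just the usual savecore defaults):
Code:
# savecore has already copied the dump out of the dump device at boot time
example# ls /var/crash/saturn
bounds  unix.0  vmcore.0
# (later releases may write a single compressed vmdump.0 instead)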

Finally, look at the description of the -f option to savecore: "Attempt to save a crash dump from the specified file instead of from the system's current dump device. This option may be useful if the information stored on the dump device has been copied to an on-disk file by means of the dd(1M) command." Those are the only two options for input.
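
As a sketch of that second path (device and file names are examples only):
Code:
# copy the raw dump device aside with dd, then have savecore extract from the copy
example# dd if=/dev/dsk/c0t2d0s2 of=/var/tmp/dump.img bs=1024k
example# savecore -f /var/tmp/dump.img /var/crash/saturn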

EDIT:
Looking over this thread, I want to make a few more points.

reborg said "It will simply not save a core dump if it has no space." And this is true. Lots of folks simply choose not to save crashdumps at all. Nothing wrong with that. Nothing reborg said implied the use of file system space as an alternative for crashdumps.

jlliagre said "You cannot limit the kernel dump size" and this is false. If there is a 12 GB crashdump to be written out but only 5 GB of dump space available, the kernel will write the first 5 GB. I'm surprised that reborg implies that this used to result in a panic-reboot infinite loop. If it did, that was a bug. A panic during a panic is not supposed to attempt a crashdump; the second panic should simply be followed by a reboot. In addition to simply not providing a lot of space, options sometimes exist to limit crashdump size; Solaris has a few mentioned on the dumpadm man page. And about those truncated dumps... the kernel is the first thing dumped in every OS I know. I have often debugged problems by reading those truncated dumps (albeit usually on HP-UX). Simply dumping the message buffer, which contains the last few messages displayed by the kernel including the panic message, is a big help.
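
On Solaris, for instance, even a truncated unix.N/vmcore.N pair is usually enough to pull that out with mdb; this is only a sketch using the usual savecore file names:
Code:
# open the saved kernel image and dump, then read panic info and the message buffer
example# cd /var/crash/saturn
example# mdb unix.0 vmcore.0
> ::status
> ::msgbuf
> $q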
# 24  
Old 08-10-2008
I see where I was confused; one operation writes to raw disk and the other writes to the filesystem, two different operations.

Thanks for pointing that out (so gently), jlliagre and Perderabo.

So, what jlliagre was saying is that since you must have a raw device to dump to, and this can be the same slice as swap, you might as well save a partition and use the same one for both swap and dump, if I understand correctly.

That is why jlliagre pushed back a bit against reborg's summary point:

Quote:
You don't need to allocate any swap space to deal with savecores, and have not since Solaris 8.
... because reborg did not mention the raw device dump space, only the savecore space.

I was incorrect in thinking that there was one space, not two. Good discussion, thanks for clarifying.

So, basically, what jlliagre gently kept trying to remind me of was that since you must have this raw space, make sure it is the right size; if you use swap for it, the swap space should be big enough to hold a dump.
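
In dumpadm terms, that is more or less the stock setup; as a sketch:
Code:
# use the most suitable swap slice as the dump device (dumpadm picks it)
example# dumpadm -d swap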

Is this the same concept for Solaris, HP-UX, Linux, BSD, etc., or does it vary by OS and flavor?
# 25  
Old 08-10-2008
Quote:
Originally Posted by jlliagre
Again, I'm sorry to insist, but you are wrong.

You are now confusing the kernel dump operation and the savecore one.

The kernel dump operation is done right after a panic on a raw device. This raw device is either the swap area or a dedicated partition.
You might have deduced it from the documentation, given that slice 2 is used as the example and slice 2 is never used to lay out a filesystem.

The savecore operation is done when the OS reboots and indeed creates files on a filesystem.

The latter operation doesn't happen if the former one didn't, so I still assert that either a swap area or a dedicated partition is required for the kernel crash dump to succeed, not a filesystem.

One of the reasons an option was added to dumpadm to allow not using the swap space was precisely that swap space was often not allocated large enough for the crash dump to succeed.

Jlliagre,

A few points, because this is a particular topic that really annoys me when people get it wrong.

1. My original statement that you don't need any swap for core files is 100% accurate. The (updated) underlining of "need" should give an indication that I am saying there are alternatives, and I do not feel this was in any way misleading. In the blog I linked to this is also explained. I never said you couldn't or shouldn't use swap for it. I was pointing out that it does not need to be included in the decision on appropriate swap size.

2. Yes, you can use a dedicated partition for the dump device, and there are many reasons why you might choose to do that (or not to). I disagree with the statement that there is no reason not to use swap.

3. My argument had nothing whatsoever to do with saving disk space; it was an observation based on the fallacious claim that 2x memory is required for swap. One of the reasons given was that you need swap for savecores, which is simply untrue.

4. It is easy to make logical arguments based on half truths. The reason dumpadm was introduced to allow this behaviour was indeed based in part on the fact that swap was not large enough to hold coredumps. However, with full disclosure, you would note that this was the case because the swap space was often reduced: it was not needed operationally, since physical memory could be used, and therefore was not allocated.

5. You can, to some extent, control the size of a core by restricting the contents:
Code:
     -c content-type         Modify  the  dump  configuration  so
                             that  the crash dump consists of the
                             specified dump content.  The content
                             should be one of the following:

                             kernel          Kernel memory  pages
                                             only.

                             all             All memory pages.

                             curproc         Kernel memory pages,
                                             and the memory pages
                                             of the process whose
                                             thread was currently
                                             executing on the CPU
                                             on  which  the crash
                                             dump was  initiated.
                                             If  the  thread exe-
                                             cuting on  that  CPU
                                             is  a  kernel thread
                                             not associated  with
                                             any   user  process,
                                             only  kernel   pages
                                             will be dumped.

And unless you specify "all", you will never need as much space as you have memory. From experience I agree with pupp: it will only be a few gigs for the standard "kernel pages" dump, with the possible exception of systems where an unconstrained ZFS ARC cache is used.
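
A minimal sketch of setting that, matching the man page excerpt above:
Code:
# restrict crash dumps to kernel pages only, keeping the dump well under RAM size
example# dumpadm -c kernel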

6. The argument about saving disk really doesn't apply in most instances where dumpadm is actually used. On my systems, one of the main reasons for using dumpadm to specify a partition other than swap is that swap may be configured to use high-speed external disks, a SAN for example, because I want performance to be as good as possible if I didn't put in enough memory, or I have an occasional task which throws me over the physical memory boundary, or, as is more common, I am using ISM and need disk-backed swap. If I am not booting from SAN I definitely don't want this on my local disk(s). Under certain circumstances the external storage would not be available after a crash, so I do want to dump to local disk; I might need the core file to debug or for a support case. Secondly, if I use a dedicated partition and I don't have enough space to save the core file, I can at any time manually invoke savecore and save the file after freeing up space, or after using dumpadm to specify an alternate location; if I use swap, the dump is gone once the server comes up.
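
As a sketch of that last scenario (the alternate directory is just an example path):
Code:
# point the savecore directory somewhere with room, then pull the dump off
# the dedicated partition by hand
example# dumpadm -s /export/crash/`uname -n`
example# savecore -v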

At the end of the day I was talking about using disk-backed swap, not about kernel dumps, and I had really hoped to avoid having to go into detail on that topic.

I still maintain that swap should be dimensioned based on the operational requirements of the system and applications; those operational requirements do not need to, but may, include space for core dumps.
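
For what it's worth, a quick way to see what the system actually uses before picking a size (plain Solaris commands, output omitted):
Code:
# summary of allocated/reserved/available swap
example# swap -s
# per-device listing with free blocks
example# swap -l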
# 26  
Old 08-10-2008
Quote:
Originally Posted by Perderabo
I'm surprised that reborg implies that this used to result in a panic-reboot infinite loop.
Did I? That was not my intent. I meant exactly the opposite: this would never result in a panic-reboot cycle.

Here is what I said:
Quote:
Originally Posted by reborg
Solaris will never go into a panic-reboot cycle as a result of not having savecore space. It will simply not save a core dump if it has no space.
I assume it was the proximity to the previous point that made it read that way?
# 27  
Old 08-10-2008
Again, I do agree with reborg.

This was a thread about sizing swap space, not a sidebar about numerous technical points that are interesting but, logically, a fallacy, because the basic premise reborg stated is correct.

You simply do not need swap for panic dumps, and certainly not for savecores (see, I got the distinction right this time).

For example, I think reborg is working on a site upgrade with 16GB of RAM. If he said, hey Neo, I am going to configure ZERO swap, I would say, OK, you will get no push back from me.

But, knowing reborg over the years here (but not yet over a BBQ with lots of beer), he will more than likely configure around 4GB of Linux swap for 16GB of RAM, but I could be wrong.
# 28  
Old 08-10-2008
I still like to have some disk-backed swap as a contingency, to keep a server up if a process goes mad. 4G was about what I was planning for tmpfs, but I don't really expect that any of it will ever get used for paging.
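
Something along these lines, as a sketch (the slice name is only an example):
Code:
# add a disk-backed swap slice as a safety net, then confirm it
example# swap -a /dev/dsk/c0t1d0s1
example# swap -l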