Coredumps and swap - was part of Solaris Mem Consumption


 
# 22  
Old 08-10-2008
Again, I'm sorry to insist, but you are wrong.

You are now confusing the kernel dump operation and the savecore one.

The kernel dump operation is done right after a panic on a raw device. This raw device is either the swap area or a dedicated partition.
You might have deduced it from the documentation, given that slice 2 is used as the example and slice 2 is never used to lay out a filesystem.

The savecore operation is done when the OS reboots and indeed creates files on a filesystem.

The latter operation doesn't happen if the former one didn't, so I still assert that either a swap area or a dedicated partition is required for the kernel crash dump to succeed, not a filesystem.

One of the reasons an option was added to dumpadm to allow not using the swap space was precisely that swap space was often not allocated large enough for the crash dump to succeed.
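
For instance (the slice name below is just an illustration, not taken from this thread), a dedicated dump slice is configured with something like:
Code:
# point the crash dump at a dedicated raw slice instead of swap
# (the slice name is an example only)
example# dumpadm -d /dev/dsk/c0t1d0s1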
# 23  
Old 08-10-2008
Sorry, Neo, but I must concur with jlliagre. Look again at the example:
Code:
example# dumpadm -d /dev/dsk/c0t2d0s2

Dump content: kernel pages
Dump device: /dev/dsk/c0t2d0s2 (dedicated)
Savecore directory: /var/crash/saturn
Savecore enabled: yes

There are two different items being mentioned: a dump device and a savecore directory.

The dump device is where the kernel will write the crash dump when it encounters a panic condition. This is always a raw area on disk. Remember that a panic means that the kernel has malfunctioned. No one would want a malfunctioning kernel to attempt to write a very large file into a file system. The very reason for a panic is to stop I/O to filesystems before they are more badly damaged.

The savecore program runs during a reboot (but after the fscks, so we are sure that the filesystem can be used). Usually the dump device is not dedicated but is a swap area, so we want to copy the crash dump out before swap is needed. savecore reads from the dump area and writes to the file system. We clearly need to save an image of the damaged kernel before we reboot; after a reboot, the broken kernel is gone. Note that everything quoted mentions that savecore operates after a reboot. And the savecore description says "The savecore utility saves a crash dump of the kernel (assuming that one was made)". Note that "saves" is present tense while "was made" is past tense. If a crash dump was not made, savecore can do nothing. We have already rebooted, so a trick like reading from /dev/mem is not going to work... we would only get a copy of the new kernel.
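
As a rough sketch of what that leaves behind after the reboot (the directory comes from the dumpadm output above; the file names are just the usual savecore defaults):
Code:
# savecore has already copied the dump out of the dump device at boot time
example# ls /var/crash/saturn
bounds  unix.0  vmcore.0
# (later releases may write a single compressed vmdump.0 instead)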

Finally, look at the description of the -f option to savecore: "Attempt to save a crash dump from the specified file instead of from the system's current dump device. This option may be useful if the information stored on the dump device has been copied to an on-disk file by means of the dd(1M) command." Those are the only two options for input.
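
As a sketch of that second path (device and file names are examples only):
Code:
# copy the raw dump device aside with dd, then have savecore extract from the copy
example# dd if=/dev/dsk/c0t2d0s2 of=/var/tmp/dump.img bs=1024k
example# savecore -f /var/tmp/dump.img /var/crash/saturn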

EDIT:
Looking over this thread, I want to make a few more points.

reborg said "It will simply not save a core dump if it has no space." And this is true. Lots of folks simply choose not to save crashdumps at all. Nothing wrong with that. Nothing reborg said implied the use of file system space as an alternative for crashdumps.

jlliagre said "You cannot limit the kernel dump size" and this is false. If there is a 12 GB crashdump to be written out but only 5 GB of dump space available, the kernel will write the first 5 GB. I'm surprised that reborg implies that this used to result in a panic-reboot infinite loop. If it did, that was a bug. A panic during a panic is not supposed to attempt a crashdump; the second panic should simply be followed by a reboot. In addition to simply not providing a lot of space, options sometimes exist to limit crashdump size; Solaris has a few mentioned on the dumpadm man page. And about those truncated dumps... the kernel is the first thing dumped in every OS I know. I have often debugged problems by reading those truncated dumps (albeit usually on HP-UX). Simply dumping the message buffer, which contains the last few messages displayed by the kernel including the panic message, is a big help.
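
On Solaris, for instance, even a truncated unix.N/vmcore.N pair is usually enough to pull that out with mdb; this is only a sketch using the usual savecore file names:
Code:
# open the saved kernel image and dump, then read panic info and the message buffer
example# cd /var/crash/saturn
example# mdb unix.0 vmcore.0
> ::status
> ::msgbuf
> $q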
# 24  
Old 08-10-2008
I see where I was confused; one operation writes to raw disk and the other writes to the filesystem, two different operations.

Thanks for pointing that out (so gently), jlliagre and Perderabo.

So, what jlliagre was saying is that since you must have a raw device to dump to, and this can be the same slice as swap, you might as well save a partition and use the same one for both swap and dump, if I understand correctly.

That is why jlliagre pushed back a bit against reborg's summary point:

Quote:
You don't need to allocate any swap space to deal with savecores, and have not since Solaris 8.
... because reborg did not mention the raw device dump space, only the savecore space.

I was incorrect in thinking that there was one space, not two. Good discussion, thanks for clarifying.

So, basically, what jlliagre gently kept trying to remind me of was that since you must have this raw space, make sure it is the right size; if you use swap for it, the swap space should be big enough to hold a dump.
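
In dumpadm terms, that is more or less the stock setup; as a sketch:
Code:
# use the most suitable swap slice as the dump device (dumpadm picks it)
example# dumpadm -d swap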

Is this the same concept for Solaris, HP-UX, Linux, BSD, etc., or does it vary by OS and flavor?
# 25  
Old 08-10-2008
Quote:
Originally Posted by jlliagre
Again, I'm sorry to insist, but you are wrong.

You are now confusing the kernel dump operation and the savecore one.

The kernel dump operation is done right after a panic on a raw device. This raw device is either the swap area or a dedicated partition.
You might have deduced it from the documentation, given that slice 2 is used as the example and slice 2 is never used to lay out a filesystem.

The savecore operation is done when the OS reboots and indeed creates files on a filesystem.

The latter operation doesn't happen if the former one didn't, so I still assert that either a swap area or a dedicated partition is required for the kernel crash dump to succeed, not a filesystem.

One of the reasons an option was added to dumpadm to allow not using the swap space was precisely that swap space was often not allocated large enough for the crash dump to succeed.

Jlliagre,

A few points, because this is a particular topic that really annoys me when people get it wrong.

1. My original statement that you don't need any swap for core files is 100% accurate. The (updated) underlining of "need" should give an indication that I am saying there are alternatives, and I do not feel this was in any way misleading. In the blog I linked to this is also explained. I never said you couldn't or shouldn't use swap for it. I was pointing out that it does not need to be included in the decision on appropriate swap size.

2. Yes, you can use a dedicated partition for the dump device, and there are many reasons why you might choose to do that (or not to). I disagree with the statement that there is no reason not to use swap.

3. My argument had nothing whatsoever to do with saving disk space; it was an observation based on the fallacious claim that 2x memory is required for swap. One of the reasons given was that you need swap for savecores, which is simply untrue.

4. It is easy to make logical arguments based on half truths. The reason dumpadm was introduced to allow this behaviour was indeed based in part on the fact that swap was not large enough to hold coredumps. However, with full disclosure, you would note that this was the case because the swap space was often reduced: it was not needed operationally, since physical memory could be used, and therefore was not allocated.

5. You can, to some extent, control the size of a core by restricting the contents:
Code:
     -c content-type         Modify  the  dump  configuration  so
                             that  the crash dump consists of the
                             specified dump content.  The content
                             should be one of the following:

                             kernel          Kernel memory  pages
                                             only.

                             all             All memory pages.

                             curproc         Kernel memory pages,
                                             and the memory pages
                                             of the process whose
                                             thread was currently
                                             executing on the CPU
                                             on  which  the crash
                                             dump was  initiated.
                                             If  the  thread exe-
                                             cuting on  that  CPU
                                             is  a  kernel thread
                                             not associated  with
                                             any   user  process,
                                             only  kernel   pages
                                             will be dumped.

And unless you specify "all", you will never need as much space as you have memory. From experience I agree with pupp: it will only be a few gigs for the standard "kernel pages" dump, with the possible exception of systems where an unconstrained ZFS ARC cache is used.
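
A minimal sketch of setting that, matching the man page excerpt above:
Code:
# restrict crash dumps to kernel pages only, keeping the dump well under RAM size
example# dumpadm -c kernel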

6. The argument about saving disk really doesn't apply in most instances where dumpadm is actually used. On my systems, one of the main reasons for using dumpadm to specify a partition other than swap is that swap may be configured to use high-speed external disks, a SAN for example, because I want performance to be as good as possible if I didn't put in enough memory, or I have an occasional task which throws me over the physical memory boundary, or, as is more common, I am using ISM and need disk-backed swap. If I am not booting from SAN I definitely don't want this on my local disk(s). Under certain circumstances the external storage would not be available after a crash, so I do want to dump to local disk; I might need the core file to debug or for a support case. Secondly, if I use a dedicated partition and I don't have enough space to save the core file, I can at any time manually invoke savecore and save the file after freeing up space, or after using dumpadm to specify an alternate location; if I use swap, the dump is gone once the server comes up.
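
As a sketch of that last scenario (the alternate directory is just an example path):
Code:
# point the savecore directory somewhere with room, then pull the dump off
# the dedicated partition by hand
example# dumpadm -s /export/crash/`uname -n`
example# savecore -v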

At the end of the day I was talking about using disk-backed swap, not about kernel dumps, and I had really hoped to avoid having to go into detail on that topic.

I still maintain that swap should be dimensioned based on the operational requirements of the system and applications; those operational requirements do not need to, but may, include space for core dumps.
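
For what it's worth, a quick way to see what the system actually uses before picking a size (plain Solaris commands, output omitted):
Code:
# summary of allocated/reserved/available swap
example# swap -s
# per-device listing with free blocks
example# swap -l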
# 26  
Old 08-10-2008
Quote:
Originally Posted by Perderabo
I'm surprised that reborg implies that this used to result in a panic-reboot infinite loop.
Did I? That was not my intent. I meant exactly the opposite: this would never result in a panic-reboot cycle.

Here is what I said:
Quote:
Originally Posted by reborg
Solaris will never go into a panic-reboot cycle as a result of not having savecore space. It will simply not save a core dump if it has no space.
I assume it was the proximity to the previous point that made it read that way?
# 27  
Old 08-10-2008
Again, I do agree with reborg.

This was a thread about sizing swap space, not a sidebar about numerous technical points that are interesting but, logically, a fallacy, because the basic premise reborg stated is correct.

You simply do not need swap for panic dumps, and certainly not for savecores (see, I got the distinction right this time).

For example, I think reborg is working on a site upgrade with 16GB of RAM. If he said, hey Neo, I am going to configure ZERO swap, I would say, OK, you will get no push back from me.

But, knowing reborg over the years here (but not yet over a BBQ with lots of beer), he will more than likely configure around 4GB of Linux swap for 16GB of RAM, but I could be wrong.
# 28  
Old 08-10-2008
I still like to have some disk-backed swap as a contingency, to keep a server up if a process goes mad. 4G was about what I was planning for tmpfs, but I don't really expect that any of it will ever get used for paging.
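
Something along these lines, as a sketch (the slice name is only an example):
Code:
# add a disk-backed swap slice as a safety net, then confirm it
example# swap -a /dev/dsk/c0t1d0s1
example# swap -l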