Profiling Processes while shutdown


# 1  
Profiling Processes while shutdown

I was wondering how I can find the culprit of a slow shutdown on my Debian box. I am looking for a diagnostic tool that can dump each process name and the amount of time the process took to exit after the signal was sent.

For now I am using journalctl to look for information, but I would like to narrow the suspects down.
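
One way to narrow the suspects from the journal is to pair systemd's "Stopping ..." and "Stopped ..." messages and sort by elapsed time. The following is only a sketch: it assumes the lines come from `journalctl -b -1 -o short-monotonic` (previous boot, which needs a persistent journal) and that they look like `[  800.000000] host systemd[1]: Stopping Foo Daemon...`. The function name `stop_durations` and the sample unit names are made up for illustration.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed line shape (short-monotonic output), not verified on every systemd version:
#   [  800.000000] myhost systemd[1]: Stopping Foo Daemon...
#   [ 1400.000000] myhost systemd[1]: Stopped Foo Daemon.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+systemd\[1\]:\s+"
    r"(?P<verb>Stopping|Stopped)\s+(?P<name>.+?)\.*$"
)


def stop_durations(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Return (unit description, seconds to stop), slowest first."""
    started: Dict[str, float] = {}    # description -> time "Stopping" was logged
    durations: Dict[str, float] = {}  # description -> seconds until "Stopped"
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a systemd stop message
        ts = float(m.group("ts"))
        name = m.group("name")
        if m.group("verb") == "Stopping":
            started[name] = ts
        elif name in started:
            durations[name] = ts - started.pop(name)
    return sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
```

To use it, pipe the journal output into a small script that calls `stop_durations(sys.stdin)` and prints the first few entries. Units that log "Stopping" but never "Stopped" are not reported by this sketch, and those may be exactly the ones that hit a stop timeout, so they are worth checking by hand.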
# 2  
We had a problem like this in Solaris 10. There was an issue with using NFS across zones on the same system. It hung in certain circumstances.
The point I am trying to make: it may not be a process but a relationship between processes and their current status.

You are assuming a single process is the problem, which is okay, but you may want to think "larger": multiple processes, or a device and some process group.

What you gave us is a start, we need more:
Code:
1. Is the box standalone: not clustered, no NFS mounts, no samba mounts, etc.?
2. Does the box actually come down?
3. How much longer than usual does it take to come down?
    Ex: yesterday it came down in 30 seconds, today it came down in 10 minutes.
4. Did you install new software in the recent past, and did you get errors on install?

# 3  
More Details

Thanks Jim,

You are right, it could very well be a set of interdependent (or independent) processes creating the problem.

I have not installed any packages of note. I am sure gdb wouldn't have this issue. However, I do have my own code on the box (multiple daemons). The problem appeared recently: the reboot/shutdown command started taking more than 10 minutes, as opposed to the previous 45-second reboot time. The delay is now almost consistent.

Whether mine or external, I simply need to narrow down the problem.
# 4  
I think you have to work "backwards".
Change startup to be more minimal; do not start any of your own daemons.

If the problem goes away, keep adding them back into the mix one by one. If it is resource contention, like waiting on some kind of lock, it may be hard to track down. Any daemons that work cooperatively with others may deserve first attention.

If the problem still exists, you may have to start looking at your configuration by changing to single user boot, then changing startup/shutdown script to boot and shutdown at each runlevel to eliminate process groups and processes as a problem.

I vote for your hand-rolled daemons as a great place to start. Sorry I cannot be more specific.
# 5  
With regard to gdb: it will not have the problem, but if the process it controls does have issues, what then? Why are you shutting down with processes running under gdb? Sounds like a bad plan to me.

Shutdown works by sending signals to processes so they go through an orderly shutdown. If a process cannot respond, or is deadlocked because another process locked a mutex and then got killed off, SIGTERM will not shut the process down. There are so-called robust mutexes that can help.

pthread_mutexattr_getrobust
# 6  
There isn't a lot of detail in the thread, but things to consider might be:-
  • If you have a database, is it possible that there is a major transaction rollback being done?
  • Do any of your shutdown scripts have waits in them?
  • NFS (as already mentioned)
  • Is there some sort of notification you are trying to do and the target server is down? Perhaps a closing down report to ensure all transactions are centrally held etc.
  • Has someone introduced a backup into the wrong place, so it runs at shutdown?
  • Has someone created a shutdown script that actually does a startup by mistake? (i.e. it never checks $1 for start or stop, it just starts)
  • Is there an AV scan being triggered in the shutdown?
  • Do you run an fsck during shutdown?
  • Do you try to sync the clock during shutdown?
  • Is this a High Availability node, or worse an HA node where the other node(s) are all off?
There are lots of other possibles too, I'm sure. What more can you tell us about it?



Kind regards,
Robin