Full Discussion: process checkpointing
Forum: UNIX for Dummies Questions & Answers. Post 302545067 by pratibha on Saturday 6th of August 2011 06:48:20 AM
Actually, I want the steps, or you could say a script, to carry out process checkpointing. I know what process checkpointing is, but I don't know how to implement it. Can you please explain the methodology for process checkpointing? I mean, how is it carried out?
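
One way to approach this, offered as a minimal sketch rather than a definitive method, is application-level checkpointing: the program itself writes the state it needs to a file (on request or at a convenient point) and reloads that file on startup if it exists. The Python example below sums squares as a stand-in for real work. The file name job.ckpt, the choice of SIGUSR1 as the checkpoint signal, and the names save_checkpoint, load_checkpoint, CheckpointRequest and sum_squares are all hypothetical and chosen only for illustration; the stop_at parameter exists only to simulate an interruption.

```python
import json
import os
import signal
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: dump to a temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace swaps the file in one step, so a crash never leaves a half-written checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh initial state if no checkpoint exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"next_item": 0, "total": 0}


class CheckpointRequest:
    """Flag set by the signal handler and polled by the work loop."""

    def __init__(self):
        self.pending = False

    def handler(self, signum, frame):
        # Keep the handler tiny: only record that a checkpoint was requested.
        self.pending = True


def sum_squares(path, limit, request, stop_at=None):
    """Sum i*i for i in range(limit), resuming from the checkpoint at `path`.

    If stop_at is given, the loop saves a checkpoint and returns None when it
    reaches that index (a stand-in for the job being halted).
    """
    state = load_checkpoint(path)
    while state["next_item"] < limit:
        if state["next_item"] == stop_at:
            save_checkpoint(path, state)
            return None
        if request.pending:
            save_checkpoint(path, state)
            request.pending = False
        i = state["next_item"]
        state["total"] += i * i
        state["next_item"] = i + 1
    save_checkpoint(path, state)
    return state["total"]


if __name__ == "__main__":
    request = CheckpointRequest()
    if hasattr(signal, "SIGUSR1"):  # assumption: a Unix-like system
        signal.signal(signal.SIGUSR1, request.handler)
    ckpt_path = os.path.join(tempfile.gettempdir(), "job.ckpt")  # hypothetical location
    print(sum_squares(ckpt_path, 1_000_000, request))
```

Kernel-level or library-level checkpointing, which saves a whole process image transparently, works differently and needs operating system or library support; this sketch only covers the case where the program saves and restores its own state.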
 

CHECKPOINT(5)						   Sun Grid Engine File Formats 					     CHECKPOINT(5)

NAME
       checkpoint - Sun Grid Engine checkpointing environment configuration file format

DESCRIPTION
       Checkpointing is a facility to save the complete status of an executing program or job and to restore and restart from this so-called checkpoint at a later point of time if the original program or job was halted, e.g. through a system crash. Sun Grid Engine provides various levels of checkpointing support (see sge_ckpt(1)).

       The checkpointing environment described here is a means to configure the different types of checkpointing in use for your Sun Grid Engine cluster or parts thereof. For that purpose you can define the operations which have to be executed in initiating a checkpoint generation, a migration of a checkpoint to another host or a restart of a checkpointed application as well as the list of queues which are eligible for a checkpointing method.

       Supporting different operating systems may easily force Sun Grid Engine to introduce operating system dependencies for the configuration of the checkpointing configuration file and updates of the supported operating system versions may lead to frequently changing implementation details. Please refer to the <sge_root>/ckpt directory for more information.

       Please use the -ackpt, -dckpt, -mckpt or -sckpt options to the qconf(1) command to manipulate checkpointing environments from the command-line or use the corresponding qmon(1) dialogue for X-Windows based interactive configuration.

       Note, Sun Grid Engine allows backslashes (\) to be used to escape newline (\newline) characters. The backslash and the newline are replaced with a space (" ") character before any interpretation.

FORMAT
       The format of a checkpoint file is defined as follows:

       ckpt_name
              The name of the checkpointing environment as defined for ckpt_name in sge_types(1). To be used in the qsub(1) -ckpt switch or for the qconf(1) options mentioned above.

       interface
              The type of checkpointing to be used. Currently, the following types are valid:

              hibernator
                     The Hibernator kernel level checkpointing is interfaced.

              cpr    The SGI kernel level checkpointing is used.

              cray-ckpt
                     The Cray kernel level checkpointing is assumed.

              transparent
                     Sun Grid Engine assumes that the jobs submitted with reference to this checkpointing interface use a checkpointing library such as provided by the public domain package Condor.

              userdefined
                     Sun Grid Engine assumes that the jobs submitted with reference to this checkpointing interface perform their private checkpointing method.

              application-level
                     Uses all of the interface commands configured in the checkpointing object like in the case of one of the kernel level checkpointing interfaces (cpr, cray-ckpt, etc.) except for the restart_command (see below), which is not used (even if it is configured) but the job script is invoked in case of a restart instead.

       ckpt_command
              A command-line type command string to be executed by Sun Grid Engine in order to initiate a checkpoint.

       migr_command
              A command-line type command string to be executed by Sun Grid Engine during a migration of a checkpointing job from one host to another.

       restart_command
              A command-line type command string to be executed by Sun Grid Engine when restarting a previously checkpointed application.

       clean_command
              A command-line type command string to be executed by Sun Grid Engine in order to cleanup after a checkpointed application has finished.

       ckpt_dir
              A file system location to which checkpoints of potentially considerable size should be stored.

       ckpt_signal
              A Unix signal to be sent to a job by Sun Grid Engine to initiate a checkpoint generation. The value for this field can either be a symbolic name from the list produced by the -l option of the kill(1) command or an integer number which must be a valid signal on the systems used for checkpointing.

       when   The points of time when checkpoints are expected to be generated. Valid values for this parameter are composed by the letters s, m, x and r and any combinations thereof without any separating character in between. The same letters are allowed for the -c option of the qsub(1) command which will overwrite the definitions in the used checkpointing environment. The meaning of the letters is defined as follows:

              s      A job is checkpointed, aborted and if possible migrated if the corresponding sge_execd(8) is shut down on the job's machine.

              m      Checkpoints are generated periodically at the min_cpu_interval interval defined by the queue (see queue_conf(5)) in which a job executes.

              x      A job is checkpointed, aborted and if possible migrated as soon as the job gets suspended (manually as well as automatically).

              r      A job will be rescheduled (not checkpointed) when the host on which the job currently runs went into unknown state and the time interval reschedule_unknown (see sge_conf(5)) defined in the global/local cluster configuration will be exceeded.

RESTRICTIONS
       Note that the functionality of any checkpointing, migration or restart procedures provided by default with the Sun Grid Engine distribution, as well as the way they are invoked in the ckpt_command, migr_command or restart_command parameters of any default checkpointing environments, should not be changed; otherwise the functionality remains the full responsibility of the administrator configuring the checkpointing environment. Sun Grid Engine will just invoke these procedures and evaluate their exit status. If the procedures do not perform their tasks properly or are not invoked in a proper fashion, the checkpointing mechanism may behave unexpectedly; Sun Grid Engine has no means to detect this.

SEE ALSO
       sge_intro(1), sge_ckpt(1), sge_types(1), qconf(1), qmod(1), qsub(1), sge_execd(8).

COPYRIGHT
       See sge_intro(1) for a full statement of rights and permissions.

SGE 6.2u5                                      $Date$                                      CHECKPOINT(5)
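
The FORMAT section can be illustrated with a small Python sketch that reads a checkpointing environment definition and checks two of its fields. It is not part of Sun Grid Engine. It assumes each entry is a field name followed by whitespace and a value on one line, which the text above does not state explicitly. The function names, the sample values (my_ckpt, the /opt/ckpt and /var/tmp/ckpt paths, usr2), and the case-insensitive comparison of the interface value are likewise assumptions made for illustration. The backslash-newline handling and the s, m, x, r letters follow the descriptions above.

```python
VALID_INTERFACES = {
    "hibernator", "cpr", "cray-ckpt",
    "transparent", "userdefined", "application-level",
}
WHEN_LETTERS = set("smxr")

# Hypothetical sample; the trailing backslash continues ckpt_command on the next line.
SAMPLE = """\
ckpt_name        my_ckpt
interface        userdefined
ckpt_command     /opt/ckpt/bin/checkpoint.sh \\
                 --dir /var/tmp/ckpt
ckpt_dir         /var/tmp/ckpt
ckpt_signal      usr2
when             sx
"""


def parse_checkpoint_env(text):
    """Return a dict mapping field names to values (assumed 'name value' lines)."""
    # The backslash and the newline are replaced with a space before interpretation.
    joined = text.replace("\\\n", " ")
    fields = {}
    for line in joined.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        fields[name] = value
    return fields


def find_problems(fields):
    """Return a list of human-readable problems for the interface and when fields."""
    problems = []
    interface = fields.get("interface", "")
    # Assumption: compare case-insensitively against the types listed in FORMAT.
    if interface.lower() not in VALID_INTERFACES:
        problems.append("unknown interface: %r" % interface)
    unknown = sorted(set(fields.get("when", "")) - WHEN_LETTERS)
    if unknown:
        problems.append("invalid letters in when: %s" % "".join(unknown))
    return problems


if __name__ == "__main__":
    env = parse_checkpoint_env(SAMPLE)
    for key, value in env.items():
        print(key, "=", value)
    print("problems:", find_problems(env))
```

A file like the sample could then be passed to the qconf(1) options mentioned in DESCRIPTION; the exact invocation is not shown here because this page does not spell out their arguments.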
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.