orte-checkpoint(1) [debian man page]

OMPI-CHECKPOINT(1)						     Open MPI							OMPI-CHECKPOINT(1)

NAME
ompi-checkpoint, orte-checkpoint - Checkpoint a running parallel process using the Open MPI Checkpoint/Restart Service (CRS)

NOTE: ompi-checkpoint and orte-checkpoint are exact synonyms. Either name produces identical behavior.

SYNOPSIS
ompi-checkpoint [ options ] <PID_OF_MPIRUN>

Options
orte-checkpoint attempts to notify a running parallel job (identified by the process ID of its mpirun process) that a checkpoint of the job has been requested. A global snapshot handle reference is presented to the user; this handle is passed to ompi-restart to restart the job.

<PID_OF_MPIRUN>
       Process ID of the mpirun process.

-h | --help
       Display help for this command.

-w | --nowait
       Do not wait for the application to finish checkpointing before returning.

-s | --status
       Display status messages regarding the progression of the checkpoint request.

--term
       After checkpointing the running job, terminate it.

-v | --verbose
       Enable verbose output for debugging.

-gmca | --gmca <key> <value>
       Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.

-mca | --mca <key> <value>
       Send arguments to various MCA modules.

DESCRIPTION
orte-checkpoint can be invoked multiple times, provided the invocations do not overlap. The user does not need to specify the checkpointer to use; it is determined entirely by each running process in the job being checkpointed.

SEE ALSO
orte-ps(1), orte-clean(1), ompi-restart(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)

1.4.5							   Feb 10, 2012 					OMPI-CHECKPOINT(1)
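
As a rough illustration of how the options above combine, the following Python sketch assembles an ompi-checkpoint command line from the flags documented on this page. The helper name build_checkpoint_argv and the example PID are hypothetical. Nothing is executed, and the snapshot handle is not parsed, because this page does not specify the output format.

```python
from typing import List, Sequence, Tuple


def build_checkpoint_argv(
    mpirun_pid: int,
    *,
    nowait: bool = False,
    status: bool = False,
    term: bool = False,
    verbose: bool = False,
    mca: Sequence[Tuple[str, str]] = (),
) -> List[str]:
    """Hypothetical helper: build an ompi-checkpoint argument list.

    Only flags listed in the Options section of this page are used.
    """
    if mpirun_pid <= 0:
        raise ValueError("mpirun_pid must be a positive process ID")

    argv = ["ompi-checkpoint"]
    if nowait:
        argv.append("--nowait")   # return before checkpointing finishes
    if status:
        argv.append("--status")   # show progress messages
    if term:
        argv.append("--term")     # terminate the job after the checkpoint
    if verbose:
        argv.append("--verbose")
    for key, value in mca:
        argv += ["--mca", key, value]

    # The synopsis places <PID_OF_MPIRUN> after the options.
    argv.append(str(mpirun_pid))
    return argv


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a real mpirun process.
    print(" ".join(build_checkpoint_argv(12345, status=True, term=True)))
```

The resulting list could be handed to subprocess.run on a system where Open MPI was built with checkpoint/restart support. Because invocations must not overlap, a caller would wait for one request to finish before issuing the next.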


OMPI-RESTART(1) 						     Open MPI							   OMPI-RESTART(1)

NAME
ompi-restart, orte-restart - Restart a previously checkpointed parallel job using the Open PAL Checkpoint/Restart Service (CRS)

NOTE: ompi-restart and orte-restart are exact synonyms. Either name produces identical behavior.

SYNOPSIS
ompi-restart [ options ] <GLOBAL SNAPSHOT HANDLE>

Options
ompi-restart attempts to restart a previously checkpointed parallel job from the global snapshot handle reference returned by ompi-checkpoint.

<GLOBAL SNAPSHOT HANDLE>
       The global snapshot handle reference returned by ompi-checkpoint, used to restart the job. This must be the last argument to this command.

-h | --help
       Display help for this command.

-p | --preload
       Preload the checkpoint files on the remote systems before restarting the application. Disabled by default.

--fork
       Fork off a new process, which is the restarted process. By default, the restarted process will replace ompi-restart.

-s | --seq
       The sequence number of the checkpoint to restart from. By default, the most recent sequence number is used (specified by -1).

-hostfile | --hostfile
       The hostfile from which to restart the application. Useful in unscheduled environments. Same behavior as the --machinefile option.

-machinefile | --machinefile
       The machinefile from which to restart the application. Useful in unscheduled environments. Same behavior as the --hostfile option.

-v | --verbose
       Enable verbose output for debugging.

-gmca | --gmca <key> <value>
       Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.

-mca | --mca <key> <value>
       Send arguments to various MCA modules.

DESCRIPTION
ompi-restart can be invoked multiple times, provided the invocations do not overlap. This allows the user to restart a previously running parallel job.

SEE ALSO
orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)

1.4.5							   Feb 10, 2012 					   OMPI-RESTART(1)
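
The requirement that the global snapshot handle be the last argument can be sketched in Python as follows. The helper name build_restart_argv and the handle string are hypothetical. The sketch also assumes that --seq and --hostfile take their value as the next argument, which this page implies but does not spell out. Nothing is executed.

```python
from typing import List, Optional


def build_restart_argv(
    snapshot_handle: str,
    *,
    preload: bool = False,
    fork: bool = False,
    seq: Optional[int] = None,
    hostfile: Optional[str] = None,
    verbose: bool = False,
) -> List[str]:
    """Hypothetical helper: build an ompi-restart argument list.

    Only flags listed in the Options section of this page are used.
    """
    if not snapshot_handle:
        raise ValueError("a global snapshot handle is required")

    argv = ["ompi-restart"]
    if preload:
        argv.append("--preload")  # stage checkpoint files on remote systems
    if fork:
        argv.append("--fork")     # restart in a forked process
    if seq is not None:
        # Assumption: the sequence number follows the flag as a separate
        # argument. Omitting --seq selects the most recent checkpoint.
        argv += ["--seq", str(seq)]
    if hostfile is not None:
        # Assumption: the hostfile path follows the flag.
        argv += ["--hostfile", hostfile]
    if verbose:
        argv.append("--verbose")

    # The handle is required to be the last argument.
    argv.append(snapshot_handle)
    return argv


if __name__ == "__main__":
    # "example_snapshot_handle" is a placeholder, not a real handle.
    print(" ".join(build_restart_argv("example_snapshot_handle", preload=True)))
```

Appending the handle only after every option has been added keeps it in the final position regardless of which options are enabled.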