ompi_crcp(7) [debian man page]

OMPI_CRCP(7)							     Open MPI							      OMPI_CRCP(7)

NAME
OMPI_CRCP - Open MPI MCA Checkpoint/Restart Coordination Protocol (CRCP) Framework: Overview of Open MPI's CRCP framework and selected modules. Open MPI 1.4.5

DESCRIPTION
The CRCP Framework is used by Open MPI to encapsulate various Checkpoint/Restart Coordination Protocols (e.g., Coordinated, Uncoordinated, Message/Communication Induced, ...).

GENERAL PROCESS REQUIREMENTS
In order for a process to use the Open MPI CRCP components it must adhere to a few programmatic requirements. First, the program must call MPI_INIT early in its execution. Second, the program must call MPI_FINALIZE before termination. A user may initiate a checkpoint of a parallel application with the ompi-checkpoint(1) command and restart it with the ompi-restart(1) command.

AVAILABLE COMPONENTS
Open MPI currently ships with one CRCP component: coord.

The following MCA parameters apply to all components:

crcp_base_verbose
    Set the verbosity level for all components. Default is 0, or silent except on error.

coord CRCP Component
The coord component implements a Coordinated Checkpoint/Restart Coordination Protocol similar to the one implemented in LAM/MPI.

The coord component has the following MCA parameters:

crcp_coord_priority
    The component's priority to use when selecting the most appropriate component for a run.

crcp_coord_verbose
    Set the verbosity level for this component. Default is 0, or silent except on error.

none CRCP Component
The none component simply selects no CRCP component. All of the CRCP function calls return immediately with ORTE_SUCCESS.

This component is the last component to be selected by default. This means that if another component is available, and the none component was not explicitly requested, then Open MPI will attempt to activate all of the available components before falling back to this component.

SEE ALSO
ompi-checkpoint(1), ompi-restart(1), opal-checkpoint(1), opal-restart(1), orte_snapc(7), orte_filem(7), opal_crs(7)

1.4.5 Feb 10, 2012 OMPI_CRCP(7)
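
The MCA parameters described above are normally passed on the mpirun command line. The following is a minimal sketch of that, not part of the original page. It assumes an Open MPI 1.4.x build with checkpoint/restart support, in which "-am ft-enable-cr" enables fault tolerance and "-mca crcp coord" selects the coord component. The program name ./my_app, the process count and the verbosity value 10 are placeholders chosen for illustration. The run helper is a hypothetical dry-run wrapper: it only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints each command, and only executes it
# when RUN=1 is set (for example on a host with Open MPI 1.4.x installed).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Placeholder for an MPI program that calls MPI_INIT and MPI_FINALIZE.
APP=./my_app

# List the CRCP framework parameters known to this installation.
run ompi_info --param crcp all

# Launch with checkpoint/restart enabled (assumed: -am ft-enable-cr),
# select the coord component, and raise the verbosity of the CRCP
# framework and of the coord component above their default of 0.
run mpirun -np 4 -am ft-enable-cr \
    -mca crcp coord \
    -mca crcp_base_verbose 10 \
    -mca crcp_coord_verbose 10 \
    "$APP"
```

Leaving out "-mca crcp coord" lets Open MPI choose a component by priority, with none as the fallback described above.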

OMPI-RESTART(1) 						     Open MPI							   OMPI-RESTART(1)

NAME
ompi-restart, orte-restart - Restart a previously checkpointed parallel job using the Open PAL Checkpoint/Restart Service (CRS)

NOTE: ompi-restart and orte-restart are exact synonyms for each other. Using either name will result in identical behavior.

SYNOPSIS
ompi-restart [ options ] <GLOBAL SNAPSHOT HANDLE>

Options
ompi-restart will attempt to restart a previously checkpointed parallel job from the global snapshot handle reference returned by ompi-checkpoint(1).

<GLOBAL SNAPSHOT HANDLE>
    The global snapshot handle reference returned by ompi-checkpoint(1), used to restart the job. This is required to be the last argument to this command.

-h | --help
    Display help for this command.

-p | --preload
    Preload the checkpoint files on the remote systems before restarting the application. Disabled by default.

--fork
    Fork off a new process, which is the restarted process. By default, the restarted process will replace ompi-restart.

-s | --seq
    The sequence number of the checkpoint to restart from. By default, the most recent sequence number is used (specified by -1).

-hostfile | --hostfile
    The hostfile from which to restart the application. Useful in unscheduled environments. (Same behavior as the --machinefile option.)

-machinefile | --machinefile
    The machinefile from which to restart the application. Useful in unscheduled environments. (Same behavior as the --hostfile option.)

-v | --verbose
    Enable verbose output for debugging.

-gmca | --gmca <key> <value>
    Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.

-mca | --mca <key> <value>
    Send arguments to various MCA modules.

DESCRIPTION
ompi-restart can be invoked multiple times, provided the invocations do not overlap. This allows the user to restart a previously running parallel job.

SEE ALSO
orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)

1.4.5 Feb 10, 2012 OMPI-RESTART(1)
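
The options above combine as in the following sketch, which is not part of the original page. It assumes that ompi-checkpoint(1) takes the PID of the running mpirun process and prints a global snapshot handle. The PID 1234, the handle ompi_global_snapshot_1234.ckpt and the hostfile ./hosts.txt are placeholders for illustration only. The run helper is a hypothetical dry-run wrapper: it only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints each command, and only executes it
# when RUN=1 is set (for example on a host with Open MPI 1.4.x installed).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

MPIRUN_PID=1234                           # placeholder: PID of the running mpirun
SNAPSHOT=ompi_global_snapshot_1234.ckpt   # placeholder: handle printed by ompi-checkpoint
HOSTFILE=./hosts.txt                      # placeholder: hosts to restart on

# Checkpoint the running job (assumed usage: the argument is mpirun's PID).
run ompi-checkpoint "$MPIRUN_PID"

# Restart from the most recent checkpoint sequence, which is the default.
run ompi-restart "$SNAPSHOT"

# Restart from sequence 0 on the hosts in $HOSTFILE with verbose output.
# The global snapshot handle must remain the last argument.
run ompi-restart -v -s 0 --hostfile "$HOSTFILE" "$SNAPSHOT"

# Preload the checkpoint files on the remote systems before restarting.
run ompi-restart --preload "$SNAPSHOT"
```

In a real session only one of the three restart commands would be issued for a given job, since restarts must not overlap.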