MPIRUN(1) LAM COMMANDS MPIRUN(1)
NAME
mpirun - Run MPI programs on LAM nodes.
SYNOPSIS
mpirun [-fhvO] [-c <#> | -np <#>] [-D | -wd <dir>] [-ger | -nger] [-c2c | -lamd] [-nsigs]
[-nw | -w] [-nx] [-pty | -npty] [-s <node>] [-t | -toff | -ton] [-x
VAR1[=VALUE1][,VAR2[=VALUE2],...]] [<where>] <program> [-- <args>]
mpirun [-fhvO] [-D | -wd <dir>] [-ger | -nger] [-lamd | -c2c] [-nsigs] [-nw | -w] [-nx]
[-pty | -npty] [-t | -toff | -ton] [-x VAR1[=VALUE1][,VAR2[=VALUE2],...]] <schema>
OPTIONS
There are two forms of the mpirun command -- one for programs (i.e., SPMD-style applica-
tions), and one for application schemas (see appschema(5)). Both forms of mpirun use the
following options by default: -c2c -nger -w. These may each be overridden by their
counterpart options, described below.
Additionally, mpirun will send the name of the directory where it was invoked on the local
node to each of the remote nodes, and attempt to change to that directory. See the "Cur-
rent Working Directory" section, below.
-c <#> Synonym for -np (see below).
-c2c Use "client to client" (c2c) mode for MPI communication in the user program.
This mode can significantly speed up some applications, as messages will be
passed directly from the source rank to the destination rank; the LAM daemons
will not be used as third-party message passing agents. However, this disables
monitoring and debugging capabilities; see MPI(7). This option is mutually ex-
clusive with -lamd.
-D Use the executable program location as the current working directory for created
processes. The current working directory of the created processes will be set
before the user's program is invoked. This option is mutually exclusive with
-wd.
-f Do not configure standard I/O file descriptors - use defaults.
-h Print useful information on this command.
-ger Enable GER (Guaranteed Envelope Resources) communication protocol and error re-
porting. See MPI(7) for a description of GER. This option is mutually exclu-
sive with -nger.
-lamd Use the LAM "daemon mode" for MPI communication. See -c2c (above) and MPI(7)
for a description of the "daemon mode" communication.
-nger Disable GER (Guaranteed Envelope Resources). This option is mutually exclusive
with -ger.
-nsigs Do not have LAM catch signals.
-np <#> Run this many copies of the program on the given nodes. This option indicates
that the specified file is an executable program and not an application schema.
If no nodes are specified, all LAM nodes are considered for scheduling; LAM will
schedule the programs in a round-robin fashion, "wrapping around" (and schedul-
ing multiple copies on a single node) if necessary.
-npty Disable pseudo-tty support. Unless you are having problems with pseudo-tty
support, you probably do not need this option. Mutually exclusive with -pty.
-nw Do not wait for all processes to complete before exiting mpirun. This option is
mutually exclusive with -w.
-nx Do not automatically export LAM_MPI_*, LAM_IMPI_*, or IMPI_* environment vari-
ables to the remote nodes.
-O Multicomputer is homogeneous. Do no data conversion when passing messages.
-pty Enable pseudo-tty support. Among other things, this enables line-buffered
output (which is probably what you want). This is the default. Mutually exclusive
with -npty.
-s <node> Load the program from this node. This option is not valid on the command line
if an application schema is specified.
-t, -ton Enable execution trace generation for all processes. Trace generation will pro-
ceed with no further action. These options are mutually exclusive with -toff.
-toff Enable execution trace generation for all processes. Trace generation will be-
gin after processes collectively call MPIL_Trace_on(2). This option is mutually
exclusive with -t and -ton.
-v Be verbose; report on important steps as they are done.
-w Wait for all applications to exit before mpirun exits.
-wd <dir> Change to the directory <dir> before the user's program executes. Note that if
the -wd option appears both on the command line and in an application schema,
the schema will take precedence over the command line. This option is mutually
exclusive with -D.
-x Export the specified environment variables to the remote nodes before executing
the program. Existing environment variables can be specified (see the Examples
section, below), or new variable names specified with corresponding values. The
parser for the -x option is not very sophisticated; it does not even understand
quoted values. Users are advised to set variables in the environment, and then
use -x to export (not define) them.
<where> A set of node and/or CPU identifiers indicating where to start <program>. See
bhost(5) for a description of the node and CPU identifiers. mpirun will sched-
ule adjoining ranks in MPI_COMM_WORLD on the same node when CPU identifiers are
used. For example, if LAM was booted with a CPU count of 4 on n0 and a CPU
count of 2 on n1 and <where> is C, ranks 0 through 3 will be placed on n0, and
ranks 4 and 5 will be placed on n1.
<args> Pass these runtime arguments to every new process. These must always be the
last arguments to mpirun. This option is not valid on the command line if an
application schema is specified.
DESCRIPTION
One invocation of mpirun starts an MPI application running under LAM. If the application
is simply SPMD, the application can be specified on the mpirun command line. If the ap-
plication is MIMD, comprising multiple programs, an application schema is required in a
separate file. See appschema(5) for a description of the application schema syntax, but
it essentially contains multiple mpirun command lines, less the command name itself. The
ability to specify different options for different instantiations of a program is another
reason to use an application schema.
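As a sketch, a MIMD application schema might look like the following (the file name,
program names, and node assignments here are hypothetical; see appschema(5) for the
exact line syntax):

```shell
# Create a hypothetical application schema: a master program on node n0
# and a worker program on nodes n1 and n2.  Each non-comment line is an
# mpirun command line, less the command name itself.
cat > my_appschema <<'EOF'
# master runs on the origin node
n0 master
# workers run on the remaining nodes
n1-2 worker
EOF

# Start the whole MIMD application from the schema (requires a booted LAM):
# mpirun -v my_appschema
```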
Application Schema or Executable Program?
To distinguish the two different forms, mpirun looks on the command line for <where> or
the -c option. If neither is specified, then the file named on the command line is as-
sumed to be an application schema. If either one or both are specified, then the file is
assumed to be an executable program. If <where> and -c both are specified, then copies of
the program are started on the specified nodes/CPUs according to an internal LAM schedul-
ing policy. Specifying just one node effectively forces LAM to run all copies of the pro-
gram in one place. If -c is given, but not <where>, then all available CPUs on all LAM
nodes are used. If <where> is given, but not -c, then one copy of the program is run
on each specified node/CPU.
By default, LAM searches for executable programs on the target node where a particular in-
stantiation will run. If the file system is not shared, the target nodes are homogeneous,
and the program is frequently recompiled, it can be convenient to have LAM transfer the
program from a source node (usually the local node) to each target node. The -s option
specifies this behavior and identifies the single source node.
LAM looks for an executable program by searching the directories in the user's PATH envi-
ronment variable as defined on the source node(s). This behavior is consistent with log-
ging into the source node and executing the program from the shell. On remote nodes, the
"." path is the home directory.
LAM looks for an application schema in three directories: the local directory, the value
of the LAMAPPLDIR environment variable, and LAMHOME/boot, where LAMHOME is the LAM
installation directory.
LAM directs UNIX standard input to /dev/null on all remote nodes. On the local node that
invoked mpirun, standard input is inherited from mpirun. This default is the behavior
that the -w option used to select, and prevents conflicting access to the terminal.
LAM directs UNIX standard output and error to the LAM daemon on all remote nodes. LAM
ships all captured output/error to the node that invoked mpirun and prints it on the stan-
dard output/error of mpirun. Local processes inherit the standard output/error of mpirun
and transfer to it directly.
Thus it is possible to redirect standard I/O for LAM applications by using the typical
shell redirection procedure on mpirun.
% mpirun C my_app < my_input > my_output
Note that in this example only the local node (i.e., the node where mpirun was invoked
from) will receive the stream from my_input on stdin. The stdin on all the other nodes
will be tied to /dev/null. However, the stdout from all nodes will be collected into the
my_output file.
The -f option avoids all the setup required to support standard I/O described above. Re-
mote processes are completely directed to /dev/null and local processes inherit file de-
scriptors from lamboot(1).
The -pty option enables pseudo-tty support for process output (it is also enabled by
default). This allows, among other things, line-buffered output from remote nodes
(which is probably what you want). This option can be disabled with the -npty switch.
Current Working Directory
The default behavior of mpirun has changed with respect to the directory that processes
will be started in.
The -wd option to mpirun allows the user to change to an arbitrary directory before their
program is invoked. It can also be used in application schema files to specify working
directories on specific nodes and/or for specific applications.
If the -wd option appears both in a schema file and on the command line, the schema file
directory will override the command line value.
The -D option will change the current working directory to the directory where the exe-
cutable resides. It cannot be used in application schema files. -wd is mutually exclu-
sive with -D.
If neither -wd nor -D are specified, the local node will send the directory name where
mpirun was invoked from to each of the remote nodes. The remote nodes will then try to
change to that directory. If they fail (e.g., if the directory does not exist on that
node), they will start from the user's home directory.
All directory changing occurs before the user's program is invoked; it does not wait until
MPI_INIT is called.
Processes in the MPI application inherit their environment from the LAM daemon upon the
node on which they are running. The environment of a LAM daemon is fixed upon booting of
the LAM with lamboot(1) and is inherited from the user's shell. On the origin node, this
will be the shell from which lamboot(1) was invoked and on remote nodes this will be the
shell started by rsh(1). When running dynamically linked applications which require the
LD_LIBRARY_PATH environment variable to be set, care must be taken to ensure that it is
correctly set when booting the LAM.
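Since the daemons' environment is frozen at boot time, one workable pattern is to set
LD_LIBRARY_PATH in the shell before booting the LAM, rather than before running mpirun
(the library path and hostfile name below are hypothetical):

```shell
# MPI processes inherit the LAM daemon's environment, and that environment
# is fixed when lamboot runs.  Set the library path first, then boot:
export LD_LIBRARY_PATH=/opt/mylibs/lib:${LD_LIBRARY_PATH:-}

# Boot the LAM with this environment in place (requires LAM installed):
# lamboot -v my_hostfile
```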
Exported Environment Variables
All environment variables that are named in the form LAM_MPI_*, LAM_IMPI_*, or IMPI_* will
automatically be exported to new processes on the local and remote nodes. This exporting
may be inhibited with the -nx option.
Additionally, the -x option to mpirun can be used to export specific environment variables
to the new processes. While the syntax of the -x option allows the definition of new
variables, note that the parser for this option is currently not very sophisticated - it
does not even understand quoted values. Users are advised to set variables in the envi-
ronment and use -x to export them; not to define them.
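Following that advice, a safe pattern is to let the shell handle any quoting and then
name the variables with -x (MY_APP_OPTS is a hypothetical variable for illustration):

```shell
# Set the variables in the shell first; quoting is the shell's problem,
# not mpirun's, since the -x parser does not understand quoted values.
export DISPLAY=:0
export MY_APP_OPTS="-verbose -level 2"

# Export the existing variables to all processes (requires a booted LAM):
# mpirun -x DISPLAY,MY_APP_OPTS C my_application
```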
Two switches control trace generation from processes running under LAM and both must be in
the on position for traces to actually be generated. The first switch is controlled by
mpirun and the second switch is initially set by mpirun but can be toggled at runtime with
MPIL_Trace_on(2) and MPIL_Trace_off(2). The -t (-ton is equivalent) and -toff options all
turn on the first switch. Otherwise the first switch is off and calls to MPIL_Trace_on(2)
in the application program are ineffective. The -t option also turns on the second
switch. The -toff option turns off the second switch. See MPIL_Trace_on(2) and lam-
trace(1) for more details.
MPI Data Conversion
LAM's MPI library converts MPI messages from local representation to LAM representation
upon sending them and then back to local representation upon receiving them. In the case
of a LAM consisting of a homogeneous network of machines where the local representation
differs from the LAM representation, this can result in unnecessary conversions. The -O
switch can be used to indicate that the LAM is homogeneous and turn off data conversion.
Direct MPI Communication
For much improved performance but much decreased observability, the -c2c option directs
LAM's MPI library to use the most direct underlying mechanism to communicate with other
processes, rather than use the network message-passing of the LAM daemon. Unreceived mes-
sages will be buffered in the destination process instead of the LAM daemon. MPI process
and message monitoring commands and tools will be much less effective, usually reporting
running processes and empty message queues. Signal delivery with doom(1) is unaffected.
Guaranteed Envelope Resources
By default, LAM will guarantee a minimum amount of message envelope buffering to each MPI
process pair and will impede or report an error to a process that attempts to overflow
this system resource. This robustness and debugging feature is implemented in a machine
specific manner when direct communication (-c2c) is used. For normal LAM communication
via the LAM daemon, a protocol is used. The -nger option disables GER and the measures
taken to support it. The minimum GER is configured by the system administrator when LAM
is installed. See MPI(7) for more details.
EXAMPLES
mpirun N prog1
Load and execute prog1 on all nodes. Search the user's $PATH for the executable file
on each node.
mpirun -c 8 prog1
Run 8 copies of prog1 wherever LAM wants to run them.
mpirun n8-10 -v -nw -s n3 prog1 -q
Load and execute prog1 on nodes 8, 9, and 10. Search for prog1 on node 3 and transfer
it to the three target nodes. Report as each process is created. Give "-q" as a
command line to each new process. Do not wait for the processes to complete before
exiting.
mpirun -v myapp
Parse the application schema, myapp, and start all processes specified in it. Report
as each process is created.
mpirun -npty -wd /work/output -x DISPLAY C my_application
Start one copy of "my_application" on each available CPU. The number of available
CPUs on each node was previously specified when LAM was booted with lamboot(1). As
noted above, mpirun will schedule adjoining ranks in MPI_COMM_WORLD on the same node
where possible. For example, if n0 has a CPU count of 8, and n1 has a CPU count of 4,
mpirun will place MPI_COMM_WORLD ranks 0 through 7 on n0, and 8 through 11 on n1.
This tends to maximize on-node communication for many parallel applications; when used
in conjunction with the multi-protocol network/shared memory RPIs in LAM (see the RE-
LEASE_NOTES and INSTALL files with the LAM distribution), overall communication per-
formance can be quite good. Also disable pseudo-tty support, change directory to
/work/output, and export the DISPLAY variable to the new processes (perhaps my_appli-
cation will invoke an X application such as xv to display output).
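The placement rule described above can be sketched as a small shell function (a
hypothetical helper for illustration only, not part of LAM): ranks fill each node's
CPUs in order before moving to the next node.

```shell
# place_ranks n0:4 n1:2  ->  prints which node each MPI_COMM_WORLD rank
# lands on, given per-node CPU counts in node:count form.
place_ranks() {
  rank=0
  for spec in "$@"; do
    node=${spec%%:*}          # node name before the colon
    cpus=${spec##*:}          # CPU count after the colon
    i=0
    while [ "$i" -lt "$cpus" ]; do
      echo "rank $rank -> $node"
      rank=$((rank + 1)); i=$((i + 1))
    done
  done
}

# With 4 CPUs on n0 and 2 on n1, ranks 0-3 land on n0 and ranks 4-5 on n1:
place_ranks n0:4 n1:2
```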
DIAGNOSTICS
mpirun: Exec format error
A non-ASCII character was detected in the application schema. This is usually a com-
mand line usage error where mpirun is expecting an application schema and an exe-
cutable file was given.
mpirun: syntax error in application schema, line XXX
The application schema cannot be parsed because of a usage or syntax error on the giv-
en line in the file.
<filename>: No such file or directory
This error can occur in two cases. Either the named file cannot be located or it has
been found but the user does not have sufficient permissions to execute the program or
read the application schema.
SEE ALSO
bhost(5), mpimsg(1), mpitask(1), lamexec(1), lamtrace(1), MPIL_Trace_on(2), loadgo(1)
LAM 6.5.8 November, 2002 MPIRUN(1)