LAM(7)					   LAM OVERVIEW 				   LAM(7)

       LAM - introduction to Local Area Multicomputer (LAM)

       LAM is an MPI programming environment and development system for a message-passing
       parallel machine built from heterogeneous UNIX computers on a network.  With LAM, a
       dedicated cluster or an existing network computing infrastructure can act as one parallel
       computer solving one compute-intensive problem.  LAM emphasizes productivity in the
       application development cycle with extensive control and monitoring functionality.  The
       user can easily debug the common errors in parallel programming and is well equipped to
       diagnose more difficult problems.

       LAM  features  a full implementation of the MPI communication standard, with the exception
       that the MPI_CANCEL function will not properly cancel messages that have been sent.

   Overview of Commands and Libraries
       introu(1), introc(2), INTROF(2)

   Starting / Stopping LAM
       recon(1), lamboot(1), lamhalt(1), lamnodes(1), wipe(1), tping(1), lamgrow(1), lamshrink(1)

   Compiling MPI Applications
       mpicc(1), mpiCC(1), mpif77(1)

   Running MPI Applications
       mpirun(1), lamclean(1)

   Running Non-MPI Applications

   Monitoring MPI Processes
       mpitask(1),  mpimsg(1),	lamtrace(1),  fstate(1),  doom(1),   bfctl(1),	 MPIL_Comm_id(2),

   LAM's MPI Implementation

   Reference Documents
       "LAM Frequently Asked Questions"
	      at http://www.lam-mpi.org/faq/

       "MPI Primer / Developing with LAM", Ohio Supercomputer Center

       "MPI: A Message-Passing Interface Standard", Message-Passing Interface Forum, version 1.1
	      at http://www.mpi-forum.org/

       "MPI-2: Extensions to the Message Passing Interface", Message Passing Interface Forum,
	      version 2.0
	      at http://www.mpi-forum.org/

   MPI Quick Tutorials
       "LAM/MPI ND User Guide / Introduction"
	      at http://www.lam-mpi.org/mpi/tutorials/lam/

       "MPI: It's Easy to Get Started"

       "MPI: Everyday Datatypes"

       "MPI: Everyday Collective Communication"

       The user creates a file listing the participating machines in the cluster.

	      % cat lamhosts
	      # a 2-node LAM
	      beowulf1.lam-mpi.org
	      beowulf2.lam-mpi.org

       Each machine will be given a node identifier (nodeid) starting with 0 for the first listed
       machine, 1 for the second, etc.
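
       The numbering rule can be sketched in a few lines of Python.  This is only an
       illustration, not LAM's own parser: the function name assign_nodeids is hypothetical, and
       the sketch assumes a boot schema that holds one hostname per line with "#" comments,
       using the two host names that appear in this page's examples.

```python
def assign_nodeids(boot_schema_text):
    """Map nodeids to hostnames in the order the hosts are listed."""
    hosts = []
    for line in boot_schema_text.splitlines():
        # Drop any trailing comment and the surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        # Assumption: the hostname is the first whitespace-separated token.
        hosts.append(line.split()[0])
    # The first listed machine becomes node 0, the second node 1, and so on.
    return dict(enumerate(hosts))


schema = """\
# a 2-node LAM
beowulf1.lam-mpi.org
beowulf2.lam-mpi.org
"""

for nodeid, host in assign_nodeids(schema).items():
    print(f"n{nodeid} {host}")
```

       Comment and blank lines do not consume a nodeid, so only the order of the host lines
       matters in this sketch.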

       The recon(1) tool verifies that the cluster is bootable.

	      % recon -v lamhosts
	      recon: -- testing n0 (beowulf1.lam-mpi.org)
	      recon: -- testing n1 (beowulf2.lam-mpi.org)

       The lamboot(1) tool actually starts LAM on the cluster.

	      % lamboot -v lamhosts
	      LAM 6.5.8 - Indiana University
	      Executing hboot on n0 (beowulf1.lam-mpi.org)...
	      Executing hboot on n1 (beowulf2.lam-mpi.org)...

       lamboot(1) returns to the UNIX shell prompt.  LAM does not force a canned environment or a
       "LAM shell".  The tping(1) command builds user confidence that the  cluster  and  LAM  are
       running.

	      % tping -c1 N
		1 byte from 2 nodes: 0.009 secs

   Compiling MPI Programs
       mpicc(1), mpiCC(1), and mpif77(1) are wrappers for the C, C++, and F77 compilers,
       respectively.  They link the LAM libraries and set up header and library search
       directories.  Beginning with LAM version 6.3, the MPI library is also automatically linked
       to user applications; the -lmpi command line argument is no longer necessary.

	      % mpicc -o foo foo.c
	      % mpif77 -o foo foo.f

   Executing MPI Programs
       An MPI application is started by one invocation of the mpirun(1) command.  An SPMD
       application, in which every process runs the same program, can be started directly on the
       mpirun(1) command line.

	      % mpirun -v -c 2 trivial
	      2445 trivial running on n0 (o)
	      361 trivial running on n1

       An application with multiple programs must be described in an application schema, a file
       that lists each program and its target node(s).  See appschema(5).

	      % cat appfile
	      # 1 master, 2 slaves
	      n0 master
	      n0-1 slave

	      % mpirun -v appfile
	      3292 master running on n0 (o)
	      3296 slave running on n0 (o)
	      412 slave running on n1
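
       How the appfile above expands into one process per listed node can be sketched as
       follows.  This is an illustration under stated assumptions, not the parser described in
       appschema(5): the function name expand_app_schema is hypothetical, and only the two node
       forms used in the example (nN and nA-B) are handled.

```python
import re


def expand_app_schema(text):
    """Return (nodeid, program) pairs in the order the schema lists them."""
    placements = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        nodes, program = line.split(None, 1)
        # Assumption: a node specification is either "nN" or a range "nA-B".
        match = re.fullmatch(r"n(\d+)(?:-(\d+))?", nodes)
        if match is None:
            raise ValueError(f"unsupported node specification: {nodes!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) is not None else first
        for nodeid in range(first, last + 1):
            placements.append((nodeid, program))
    return placements


appfile = """\
# 1 master, 2 slaves
n0 master
n0-1 slave
"""

for nodeid, program in expand_app_schema(appfile):
    print(f"{program} on n{nodeid}")
```

       The three resulting placements correspond to the three processes that mpirun reports in
       the example output.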

       Applications can choose, at run-time, to use the "daemon" mode  of  communication  or  the
       "client-to-client"  mode.   Each  has advantages and disadvantages, which are discussed in

   Monitoring MPI Applications
       The full MPI synchronization status of all processes and messages can be displayed at  any
       time.   This includes the source and destination ranks, the message tag, the communicator,
       and the function invoked.

	      % mpitask
	      0/0 trivial    Ssend    1/1    123    WORLD    64    INT
	      1/1 trivial    Recv     0/0    456    WORLD    64    INT

       Process rank 0 is blocked sending a synchronous message (MPI_Ssend()) to process rank 1 on
       tag 123 using the MPI_COMM_WORLD communicator.  The message contains 64 integers.  Process
       rank 1 is blocked in MPI_Recv() on the same communicator, waiting for a message from rank 0
       with tag 456.  Because the tags differ, this receive cannot match the pending send.

	      % mpimsg
	      0/0    1/1    123    WORLD    64    INT    n1,#0

       The unreceived message can be examined with mpimsg(1).  The expected tag and  communicator
       are shown, along with a message identifier that can be used to display the message
       contents.

   Terminating Applications
       All user processes and messages can be removed, without restarting LAM.

	      % lamclean -v
	      killing processes, done
	      sweeping messages, done
	      closing files, done
	      sweeping traces, done

       This command is frequently used between MPI runs, especially while developing  and  debug-
       ging MPI programs.

   Terminating LAM
       The lamhalt(1) tool removes all traces of the LAM session on the network.

	      % lamhalt
	      LAM 6.5.8 - Indiana University

       Alternatively, if lamhalt(1) cannot shut down the running LAM properly, the deprecated
       wipe(1) command can be used with the boot schema that originally booted LAM:

	      % wipe -v lamhosts
	      Executing tkill on n0 (beowulf1.lam-mpi.org)...
	      Executing tkill on n1 (beowulf2.lam-mpi.org)...

       LAM runs on each computer as a single UNIX daemon uniquely structured as a nano-kernel and
       hand-threaded virtual processes.  The nano-kernel component provides a simple
       message-passing rendezvous service to local processes.  Some of the in-daemon processes
       form a network communication subsystem, which transfers messages to and from other LAM
       daemons on other machines.  The network subsystem adds features like packetization and
       buffering to the base synchronization.  Other in-daemon processes are servers for remote
       capabilities, such as program execution and parallel file access.  The layering is quite
       distinct: the nano-kernel has no connection with the network subsystem, which has no
       connection with the servers.  Users can configure services in or out as necessary.
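
       Packetization, one of the features attributed to the network subsystem above, can be
       illustrated generically.  The sketch below is not LAM's wire format: the function names,
       the (sequence, total, chunk) tuple layout, and the packet size are all hypothetical, chosen
       only to show a message being split into packets and rebuilt regardless of arrival order.

```python
def packetize(payload, packet_size):
    """Split payload into (sequence, total, chunk) packets of bounded size."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    chunks = [payload[i:i + packet_size]
              for i in range(0, len(payload), packet_size)]
    return [(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets):
    """Rebuild the payload from packets that may arrive in any order."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    if not ordered:
        return b""
    total = ordered[0][1]
    # Every sequence number from 0 to total - 1 must be present exactly once.
    if [packet[0] for packet in ordered] != list(range(total)):
        raise ValueError("missing or duplicate packets")
    return b"".join(chunk for _, _, chunk in ordered)


packets = packetize(b"0123456789", 4)
print(len(packets), "packets")
print(reassemble(reversed(packets)))
```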

       The  unique software engineering of LAM is transparent to users and system administrators,
       who only see a conventional daemon.  System developers can de-cluster the  daemon  into  a
       daemon  containing  only  the  nano-kernel and several full client processes.  This devel-
       oper's mode is still transparent to users but exposes LAM's highly modular  components  to
       simplified individual debugging.  It also reveals LAM's evolution from Trollius, which ran
       natively on scalable multicomputers and joined them to the UNIX network through a  uniform
       programming interface.  Trollius is the ultimate heterogeneous parallel environment.

       The  network layer in LAM is a documented, primitive and abstract layer on which to imple-
       ment a more powerful communication standard like MPI.

       A most important feature of LAM is hands-on control of the multicomputer.  There is very
       little that cannot be seen or changed at runtime.  Programs residing anywhere can be
       executed anywhere, then stopped, resumed, killed, and watched the whole time.  Messages can
       be viewed anywhere on the multicomputer, and buffer constraints can be tuned as experience
       with the application dictates.  Because the synchronization of a process and a message can
       be easily displayed, mismatches resulting in bugs can easily be found.  These and other
       services are available both as a programming library and as utility programs run from any
       shell.

   MPI Implementation
       MPI synchronization boils down to four variables: context, tag, source  rank,  destination
       rank.   These  are  mapped  to  LAM's  abstract synchronization at the network layer.  MPI
       debugging tools interpret the LAM information with the knowledge of  the  LAM/MPI  mapping
       and present detailed information to MPI programmers.
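
       The four-variable view of MPI synchronization can be made concrete with a small sketch.
       It is an illustration only, not LAM's internal representation: the Envelope type and the
       mismatched_fields helper are hypothetical names, the context is written as the string
       "WORLD" for readability, and MPI's wildcard receives are left out for brevity.

```python
from collections import namedtuple

# The four synchronization variables named in the text.
Envelope = namedtuple("Envelope", ["context", "tag", "source", "dest"])


def mismatched_fields(send, recv):
    """List the synchronization variables on which a send and a receive differ."""
    return [name for name in Envelope._fields
            if getattr(send, name) != getattr(recv, name)]


def matches(send, recv):
    """A send and a receive synchronize only when all four variables agree."""
    return not mismatched_fields(send, recv)


# Values taken from the mpitask example: rank 0 sends to rank 1 with tag 123,
# while rank 1 waits for a message from rank 0 with tag 456.
send = Envelope(context="WORLD", tag=123, source=0, dest=1)
recv = Envelope(context="WORLD", tag=456, source=0, dest=1)

print(matches(send, recv))
print(mismatched_fields(send, recv))
```

       Reporting which variable differs is the kind of information a tool like mpitask(1) makes
       visible when two processes are blocked on each other.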

       A significant portion of the MPI specification can be (and is) implemented completely
       within the runtime system and independent of the underlying environment.

       As with all MPI implementations, LAM must synchronize the launch of MPI applications so
       that all processes locate each other before user code is entered.  The mpirun(1) command
       achieves this after finding and loading the program(s) which constitute the application.
       A simple SPMD application can be specified on the mpirun(1) command line, while a more
       complex configuration is described in a separate file, called an application schema.

       MPI programs developed on LAM can be moved without source code changes to any other  plat-
       form that supports MPI.

LAM 6.5.8				  November, 2002				   LAM(7)