LAMSHRINK(1)							   LAM COMMANDS 						      LAMSHRINK(1)

NAME
lamshrink - Shrink a LAM universe.

SYNOPSIS
lamshrink [-dhv] [-w delay] nodeid

OPTIONS
-d          Print detailed debugging information.
-h          Print useful information on this command.
-v          Be verbose.
-w delay    Notify processes on the doomed node and pause for delay seconds before proceeding.
nodeid      Remove the LAM node with this ID.

DESCRIPTION
An existing LAM session, initiated by lamboot(1), can be shrunk to include fewer nodes with lamshrink. One node is removed for each invocation. At a minimum, the node ID is given on the command line. Once lamshrink completes, the node ID is invalid across the remaining nodes (as can be seen by running lamnodes(1)).

Existing application processes on the target node can be warned of impending shutdown with the -w option. A LAM signal (SIGFUSE) will be sent to these processes, and lamshrink will then pause for the given number of seconds before removing the node. By default, SIGFUSE is ignored. A different handler can be installed with ksignal(2).

All application processes on all remaining nodes are always informed of the death of a node. This is also done with a signal (SIGSHRINK), which by default causes a process's runtime route cache to be flushed (to remove any cached information on the dead node). If this signal is re-vectored for the purpose of fault tolerance, the old handler should be called at the beginning of the new handler. The signal does not, by itself, give the process information on which node has been removed. One technique for getting this information is to query the router for information on all relevant nodes using getroute(2). The dead node will cause this routine to return an error.

FAULT TOLERANCE
If enabled with lamboot(1), LAM will watch for nodes that fail. The procedure for removing a node that has failed is the same as lamshrink after the warning step. In particular, the SIGSHRINK signal is delivered.

EXAMPLES
lamshrink -v n1
    Remove LAM on n1. Report about important steps as they are done.

lamshrink n30 -w 10
    Inform all processes on LAM node 30 that the node will be dead in 10 seconds. Wait 10 seconds and remove the node. Operate silently.

SEE ALSO
lamboot(1), lamnodes(1), ksignal(2), getroute(2)

LAM 7.1.4							    July, 2007							      LAMSHRINK(1)
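The warn-then-remove sequence described for lamshrink can be wrapped in a small shell helper. The following is only a sketch, not part of LAM: the function name shrink_node and its 10-second default delay are assumptions made for illustration. Only the -v and -w options and the nodeid argument come from this page. It assumes lamshrink and lamnodes are on the PATH and that a LAM session started by lamboot(1) is running.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM): warn processes on a node,
# remove the node, then list the nodes that remain.
# Usage: shrink_node nodeid [delay-seconds]
shrink_node() {
    node="$1"
    delay="${2:-10}"    # assumed default; pick a value that suits your jobs

    if [ -z "$node" ]; then
        echo "usage: shrink_node nodeid [delay]" >&2
        return 2
    fi

    # -w sends SIGFUSE to processes on the doomed node and pauses for
    # $delay seconds; -v reports the important steps as they are done.
    lamshrink -v -w "$delay" "$node" || return 1

    # After a successful shrink the node ID is invalid on the remaining
    # nodes, so it should no longer appear in the lamnodes listing.
    lamnodes
}
```

For instance, `shrink_node n1 30` would give processes on n1 thirty seconds of warning before the node is removed. Because SIGFUSE is ignored by default, the delay only helps applications that have installed their own handler with ksignal(2).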

LAMNODES(1)							     LAM TOOLS							       LAMNODES(1)

NAME
lamnodes - Resolve LAM node/CPU notation to Unix hostnames.

SYNOPSIS
lamnodes [-chin] [where]

OPTIONS
-c    Suppress printing the CPU count for each node.
-h    Print the command help menu.
-i    Print IP addresses (instead of IP names).
-n    Suppress printing the LAM node number for each node.

DESCRIPTION
The lamnodes command is used to resolve LAM node/CPU nomenclature to Unix hostnames. It can be used to determine the current running configuration of the LAM/MPI run-time environment, and generate a boot schema that can be used to launch LAM in the future.

By default, lamnodes will print out the node number, default IP name, CPU count, and per-node flags for each node in the running LAM. gethostbyaddr(3) is used to obtain default hostnames. If gethostbyaddr(3) fails, the IP number is displayed instead.

This command can be used by setup shell scripts (and the like) to determine information from a currently-running LAM universe. For example, use lamnodes to resolve particular CPUs and/or nodes to specific Unix hostnames. In a batch environment, lamnodes can be used to determine which CPUs share a common node (note that MPI_GET_PROCESSOR_NAME can be used for a similar effect in an MPI program).

lamnodes also shows per-node flags. Currently defined flags are:

origin         The node where lamboot was executed.
this_node      The node where lamnodes is running.
no_schedule    The node will not be used to run MPI and serial processes when N and C are given to mpirun and lamexec.

EXAMPLES
lamnodes N -n
    Display IP names and CPU counts for all nodes. This output can be saved and later used with lamboot(1).

lamnodes C -n -c
    Display the IP name of the nodes containing each CPU, and suppress the LAM node number and CPU count. This output can be saved and later used with lamboot(1).

SEE ALSO
bhost(5), gethostbyaddr(3), lamboot(1)

LAM 7.1.4							    July, 2007							       LAMNODES(1)
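The setup-script use of lamnodes mentioned in the description might look like the sketch below. The function names save_boot_schema and nodes_with_flag are hypothetical, not LAM commands. The first helper relies only on the `lamnodes N -n` invocation shown in the examples. The second makes no assumption about the exact column layout of lamnodes output: it matches a flag name anywhere on a line, so it can also match a hostname that happens to contain the same word.

```shell
#!/bin/sh
# Hypothetical helper: save the running configuration so that it can be
# reused later as a boot schema for lamboot(1).
# Usage: save_boot_schema file
save_boot_schema() {
    out="$1"
    if [ -z "$out" ]; then
        echo "usage: save_boot_schema file" >&2
        return 2
    fi

    # Write to a temporary file first so that a failing lamnodes does
    # not clobber an existing schema.
    tmp="$out.tmp.$$"
    if lamnodes N -n > "$tmp"; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Hypothetical helper: print the lamnodes lines that mention a per-node
# flag such as origin, this_node, or no_schedule.
# Usage: nodes_with_flag flag
nodes_with_flag() {
    # -F treats the flag as a fixed string; -w requires a whole-word match.
    lamnodes | grep -F -w -- "$1"
}
```

A script could call `save_boot_schema my.schema` before shutting LAM down, then `nodes_with_flag no_schedule` to see which nodes are excluded from N and C scheduling.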