04-27-2009
It really doesn't make much sense to me. MPICH is supposed to run across many nodes, so there is a real possibility that a node will fail during execution. It should at least continue processing with the remaining nodes.
Thanks for the answers though. I'll keep looking for a solution.
LEARN ABOUT HPUX
cmhaltnode
cmhaltnode(1m) cmhaltnode(1m)
NAME
cmhaltnode - halt a node in a high availability cluster
SYNOPSIS
cmhaltnode [-f] [-v] [-t] [node_name...]
DESCRIPTION
cmhaltnode causes a node to halt its cluster daemon and remove itself from the existing cluster.
To halt the cluster on a node, a user must either be superuser (UID=0), or have an access policy of FULL_ADMIN allowed in the cluster
configuration file. See access policy in cmquerycl.
When cmhaltnode is run on a node, the cluster daemon is halted and, optionally, all packages that were running on that node are moved to
other nodes if possible.
If node_name is not specified, the cluster daemon running on the local node will be halted and removed from the existing cluster.
If you issue this command while a cluster is still in the process of forming, the command will fail with the message "Unable to connect to
daemon." If this happens, wait for the cluster to form successfully, then issue the command again.
Options
cmhaltnode supports the following options:
-f Force the node to halt even if there are packages or group members running on it. The group members on the node will be
terminated. The halt scripts for all packages running on the node will be run; based on priority or dependency
relationships, this may affect packages on other nodes. In other words, packages on other nodes may either start or halt based on
this package halting. If the package configuration and current cluster membership permit, and if the package halt script
succeeds, the packages will be started on other nodes. Without this option, if packages are running on the given node,
the command will fail. If a package fails to halt, the node halt will also fail.
-v Verbose output will be displayed.
-t Test only. Provide an assessment of the package placement without affecting the current state of the nodes or packages.
This option validates the node's eligibility with respect to the package dependencies as well as the external dependencies
such as EMS resources, package subnets, and storage before predicting any package placement decisions. If there is a
package in maintenance mode running on the nodes being halted, the package will always be halted and will not fail over to
another node; the report will not display an assessment for that package.
node_name...
The name of the node(s) to halt.
RETURN VALUE
cmhaltnode returns one of the following values:
0 Successful completion.
1 Command failed.
EXAMPLES
Halt the cluster daemon on two other nodes:
cmhaltnode node2 node3
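The options and return values above can be combined into a cautious halt sequence. The following is a minimal sketch, not part of Serviceguard: halt_node_checked is a hypothetical helper name, and it assumes that -t may be combined with -f and -v as the synopsis suggests, and that a non-zero status from the -t run means the halt should not be attempted.

```shell
# Sketch only: halt_node_checked is a hypothetical helper, not a Serviceguard command.
# It relies solely on the documented options (-t, -f, -v) and exit codes (0 and 1).
halt_node_checked() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: halt_node_checked node_name" >&2
        return 2
    fi

    # Test-only run: report the predicted package placement without
    # changing the state of any node or package.
    # Assumption: a non-zero status here means we should not go on.
    if ! cmhaltnode -t -f -v "$node"; then
        echo "assessment failed for $node; not halting" >&2
        return 1
    fi

    # Real halt: -f runs the halt scripts of packages on the node,
    # -v prints verbose output.
    if cmhaltnode -f -v "$node"; then
        echo "halted $node"
    else
        echo "cmhaltnode failed for $node" >&2
        return 1
    fi
}
```

Calling halt_node_checked node2 would first print the placement assessment and only then halt node2. Dropping -f from the second call keeps the default behavior, where the command fails if packages are still running on the node.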
AUTHOR
cmhaltnode was developed by HP.
SEE ALSO
cmquerycl(1m), cmhaltcl(1m), cmruncl(1m), cmrunnode(1m), cmviewcl(1m), cmeval(1m).
Requires Optional Serviceguard Software cmhaltnode(1m)