04-04-2020
Your problem is with the network timeout settings, either on the cluster or on the clients.
Manually shutting down one of the cluster nodes may not give you the same result as a true CPU/power/whatever failure, because the cluster software suite will probably see you do that. It would be better to simply pull the RJ45 network cable out of one of the nodes to simulate a network connection failure.
Anyway, the point is that a cluster failover takes time. During this time the virtual IP address is switched from one node to the other. Depending on the cluster suite this can take anywhere from seconds to minutes. The fact that the client will reconnect to the surviving cluster node after you restart it proves that, had it waited long enough, it would have been able to reconnect on its own.
So the solution is to either (1) configure the cluster to fail over faster, or (2) increase the timeout that clients will wait before giving up. Either way, the goal is that a new connection to the virtual IP address can be made before the configured timeout period ends.
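To illustrate option (2), here is a minimal sketch of a client that keeps retrying until an overall deadline expires instead of giving up on the first failed attempt. It is not taken from any particular cluster suite or client. The function name `connect_with_retry`, its parameters, and the default values are all assumptions made for illustration. The idea is only that `total_timeout` should be set longer than the failover time you actually measure on your cluster.

```python
import socket
import time


def connect_with_retry(host, port, total_timeout=120.0, attempt_timeout=5.0,
                       retry_delay=2.0, connect=socket.create_connection,
                       clock=time.monotonic, sleep=time.sleep):
    """Retry a TCP connect until it succeeds or total_timeout seconds pass.

    All names and defaults here are hypothetical. The connect, clock and
    sleep arguments exist only so the loop can be exercised without a
    real network.
    """
    deadline = clock() + total_timeout
    last_error = None
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            # The overall budget is used up: give up and report the last error.
            raise TimeoutError(
                f"no connection to {host}:{port} within {total_timeout}s"
            ) from last_error
        try:
            # Never let one attempt run past the overall deadline.
            return connect((host, port), timeout=min(attempt_timeout, remaining))
        except OSError as exc:
            # Refused, unreachable or timed out: the virtual IP has probably
            # not moved to the surviving node yet, so wait and try again.
            last_error = exc
            sleep(min(retry_delay, max(0.0, deadline - clock())))


# Example call (the address and port are placeholders):
# sock = connect_with_retry("192.0.2.10", 5000, total_timeout=180.0)
```

With a budget of, say, 180 seconds, a failover that takes a minute or two is bridged by the retry loop rather than surfacing as a client error. Whether that is acceptable depends on how long your application can afford to block.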
LEARN ABOUT OPENSOLARIS
scds_fm_sleep
scds_fm_sleep(3HA) Sun Cluster HA and Data Services scds_fm_sleep(3HA)
NAME
scds_fm_sleep - wait for a message on a fault monitor control socket
SYNOPSIS
cc [flags...] -I /usr/cluster/include file -L /usr/cluster/lib -l dsdev
#include <rgm/libdsdev.h>
scha_err_t scds_fm_sleep(scds_handle_t handle, time_t timeout);
DESCRIPTION
The scds_fm_sleep() function waits for a data service application process tree that is running under control of the process monitor facility to
die. If no such death occurs within the specified timeout period, the function returns SCHA_ERR_NOERR.
If a data service application process tree death occurs, scds_fm_sleep() records SCDS_COMPLETE_FAILURE in the failure history and either
restarts the process tree or fails it over according to the algorithm described in the scds_fm_action(3HA) man page. If a failover attempt
is unsuccessful, a restart of the application is attempted.
If an attempted restart fails, the function returns SCHA_ERR_INTERNAL.
Note that if the failure history causes this function to do a failover, and the failover attempt succeeds, scds_fm_sleep() never returns.
PARAMETERS
The following parameters are supported:
handle The handle returned from scds_initialize(3HA).
timeout The timeout period measured in seconds.
RETURN VALUES
The scds_fm_sleep() function returns the following:
0 The function succeeded.
nonzero The function failed.
ERRORS
SCHA_ERR_NOERR Indicates that the process tree has not died.
SCHA_ERR_INTERNAL Indicates that the data service application process tree has died and failed to restart.
Other values Indicate the function failed. See scha_calls(3HA) for the meaning of failure codes.
FILES
/usr/cluster/include/rgm/libdsdev.h
Include file
/usr/cluster/lib/libdsdev.so
Library
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWscdev |
+-----------------------------+-----------------------------+
|Interface Stability |Evolving |
+-----------------------------+-----------------------------+
SEE ALSO
scha_calls(3HA), scds_fm_action(3HA), scds_initialize(3HA), attributes(5)
Sun Cluster 3.2 7 Sep 2007 scds_fm_sleep(3HA)