10 More Discussions You Might Find Interesting
1. UNIX for Beginners Questions & Answers
Hi Experts,
I wanted to extend a veritas file system which is running on veritas cluster and mounted on node2 system.
#hastatus -sum
-- System State Frozen
A node1 running 0
A node2 running 0
-- Group State
-- Group System Probed ... (1 Reply)
Discussion started by: Skmanojkum
1 Replies
2. Solaris
M running solaris 10 u8 my vncserver is running on :0 .. and when i try to connect it through tight vncview i can see just see the screen .. with no terminal .. what could be the issue for it ? and what i need to check for it ? (2 Replies)
Discussion started by: fugitive
2 Replies
3. Solaris
One of our system is running 3 oracle db instances. And as per prstat o/p the system is approximately using 78G of swap memory
# prstat -J -n 2,15
PROJID NPROC SWAP RSS MEMORY TIME CPU PROJECT
4038 557 31G 29G 22% 113:23:43 10% proj1
4036 466 20G 19G... (2 Replies)
Discussion started by: fugitive
2 Replies
4. Solaris
I have a T5240 server with following swap configuration
$ grep tmp /etc/vfstab
swap - /tmp tmpfs - yes -
$ swap -l
swapfile dev swaplo blocks free
/dev/swap 4294967295,4294967295 16 213909472 213909472
... (4 Replies)
Discussion started by: fugitive
4 Replies
5. Solaris
my system has 128G of installed memory. top, vmstat shows the system has just over 10G of free memory on the system. but as per prstat o/p the usage is just 50-55G is there anyway i can find which process/zone is using more memory ?
System has 3 zones and all running application servers.
... (1 Reply)
Discussion started by: fugitive
1 Replies
6. Solaris
Hi All
How can we verify if any of the parameters we have change in Solaris10 after reboot. Like is there any command? Please advice
Thanks (3 Replies)
Discussion started by: imran721
3 Replies
7. Solaris
Hi Community,
Do you know a procedure to modify the swap on Solaris10 on a volume VERITAS?
Please help me I'm currently working on this issue.
Thank you for your availability! (1 Reply)
Discussion started by: Sunb3
1 Replies
8. Shell Programming and Scripting
the following simple scripts work fine on linux but fail on solaris:
#!/bin/bash
eval /usr/bin/time -f \'bt=\"%U + %S\"\' ./JUNK >> ./LOG 2>&1
cp ./LOG ./LOG_joe
LC_joe=`cat ./LOG | wc -l`
LC_joe=`echo $LC_joe-1|bc`
tail -1 ./LOG > ./tmp
head -$LC_joe ./LOG_joe > ./LOG
rm ./LOG_joe
... (1 Reply)
Discussion started by: joepareti
1 Replies
9. Red Hat
Hello;
I have 2 node Redhat Cluster (RHEL4 U4 and Cluster Suite) I'm using mc_data fiber channel switch for fencing
when I want to fence manually using
fence_mcdata -a x.x.x. -l xxx -p xxxx -n 5 -o disable
following messages appears
fencing node "test1"
agent "fence_mcdata" reports:... (0 Replies)
Discussion started by: sakir19
0 Replies
10. Solaris
sorry,my english is poor.
who can install win2003 and solaris10 in one pc ?
my win2000server in hda1
so
frist install win2003 in hda5
second install solaris10 in hda2
but after install over,the win2003 can't logon in. alway let me press<ctrl>+<alt>+<del>.
why? (1 Reply)
Discussion started by: keyi
1 Replies
FENCE_SANLOCK(8) System Manager's Manual FENCE_SANLOCK(8)
NAME
fence_sanlock - fence agent using watchdog and shared storage leases
SYNOPSIS
fence_sanlock [OPTIONS]
DESCRIPTION
fence_sanlock uses the watchdog device to reset nodes, in conjunction with three daemons: fence_sanlockd, sanlock, and wdmd.
The watchdog device, controlled through /dev/watchdog, is available when a watchdog kernel module is loaded. A module should be loaded for
the available hardware. If no hardware watchdog is available, or no module is loaded, the "softdog" module will be loaded, which emulates
a hardware watchdog device.
Shared storage must be configured for sanlock to use from all hosts. This is generally an lvm lv (non-clustered), but could be another
block device, or NFS file. The storage should be 1GB of fully allocated space. After being created, the storage must be initialized with
the command:
# fence_sanlock -o sanlock_init -p /path/to/storage
The fence_sanlock agent uses sanlock leases on shared storage to verify that hosts have been reset, and to notify fenced nodes that are
still running, that they should be reset.
The fence_sanlockd init script starts the wdmd, sanlock and fence_sanlockd daemons before the cluster or fencing systems are started (e.g.
cman, corosync and fenced). The fence_sanlockd daemon is started with the -w option so it waits for the path and host_id options to be
provided when they are available.
Unfencing must be configured for fence_sanlock in cluster.conf. The cman init script does unfencing by running fence_node -U, which in
turn runs fence_sanlock with the "on" action and local path and host_id values taken from cluster.conf. fence_sanlock in turn passes the
path and host_id values to the waiting fence_sanlockd daemon. With these values, fence_sanlockd joins the sanlock lockspace and acquires a
resource lease for the local host. It can take several minutes to complete these unfencing steps.
Once unfencing is complete, the node is a member of the sanlock lockspace named "fence" and the node's fence_sanlockd process holds a
resource lease named "hN", where N is the node's host_id. (To verify this, run the commands "sanlock client status" and "sanlock client
host_status", which show state from the sanlock daemon, or "sanlock direct dump <path>" which shows state from shared storage.)
When fence_sanlock fences a node, it tries to acquire that node's resource lease. sanlock will not grant the lease until the owner (the
node being fenced) has been reset by its watchdog device. The time it takes to acquire the lease is 140 seconds from the victim's last
lockspace renewal timestamp on the shared storage. Once acquired, the victim's lease is released, and fencing completes successfully.
Live nodes being fenced
When a live node is being fenced, fence_sanlock will continually fail to acquire the victim's lease, because the victim continues to renew
its lockspace membership on storage, and the fencing node sees it is alive. This is by design. As long as the victim is alive, it must
continue to renew its lockspace membership on storage. The victim must not allow the remote fence_sanlock to acquire its lease and con-
sider it fenced while it is still alive.
At the same time, a victim knows that when it is being fenced, it should be reset to avoid blocking recovery of the rest of the cluster.
To communicate this, fence_sanlock makes a "request" on storage for the victim's resource lease. On the victim, fence_sanlockd, which
holds the resource lease, is configured to receive SIGUSR1 from sanlock if anyone requests its lease. Upon receiving the signal,
fence_sanlockd knows that it is a fencing victim. In response to this, fence_sanlockd allows its wdmd connection to expire, which in turn
causes the watchdog device to fire, resetting the node.
The watchdog reset will obviously have the effect of stopping the victim's lockspace membership renewals. Once the renewals stop,
fence_sanlock will finally be able to acquire the victim's lease after waiting a fixed time from the final lockspace renewal.
Loss of shared storage
If access to shared storage with sanlock leases is lost for 80 seconds, sanlock is not able to renew the lockspace membership, and enters
recovery. This causes sanlock clients holding leases, such as fence_sanlockd, to be notified that their leases are being lost. In
response, fence_sanlockd must reset the node, much as if it was being fenced.
Daemons killed/crashed/hung
If sanlock, fence_sanlockd daemons are killed abnormally, or crash or hang, their wdmd connections will expire, causing the watchdog device
to fire, resetting the node. fence_sanlock from another node will then run and acquire the victim's resource lease. If the wdmd daemon is
killed abnormally or crashes or hangs, it will not pet the watchdog device, causing it to fire and reset the node.
Time Values
The specific times periods referenced above, e.g. 140, 80, are based on the default sanlock i/o timeout of 10 seconds. If sanlock is con-
figured to use a different i/o timeout, these numbers will be different.
OPTIONS
-o action
The agent action:
on
Enable the local node to be fenced. Used by unfencing.
off
Disable another node.
status
Test if a node is on or off. A node is on if it's lease is held, and off is it's lease is free.
metadata
Print xml description of required parameters.
sanlock_init
Initialize sanlock leases on shared storage.
-p path
The path to shared storage with sanlock leases.
-i host_id
The host_id, from 1-128.
STDIN PARAMETERS
Options can be passed on stdin, with the format key=val. Each key=val pair is separated by a new line.
action=on|off|status
See -o
path=/path/to/shared/storage
See -p
host_id=num
See -i
FILES
Example cluster.conf configuration for fence_sanlock.
(For cman based clusters in which fenced runs agents.)
Also see cluster.conf(5), fenced(8), fence_node(8).
<clusternode name="node01" nodeid="1">
<fence>
<method name="1">
<device name="wd" host_id="1"/>
</method>
</fence>
<unfence>
<device name="wd" host_id="1" action="on"/>
</unfence>
</clusternode>
<clusternode name="node02" nodeid="2">
<fence>
<method name="1">
<device name="wd" host_id="2"/>
</method>
</fence>
<unfence>
<device name="wd" host_id="2" action="on"/>
</unfence>
</clusternode>
<fencedevice name="wd" agent="fence_sanlock" path="/dev/fence/leases"/>
Example dlm.conf configuration for fence_sanlock.
(For non-cman based clusters in which dlm_controld runs agents.)
Also see dlm.conf(5), dlm_controld(8).
device wd /usr/sbin/fence_sanlock path=/dev/fence/leases
connect wd node=1 host_id=1
connect wd node=2 host_id=2
unfence wd
TEST
To test fence_sanlock directly, without clustering:
1. Initialize storage
node1: create 1G lv on shared storage /dev/fence/leases
node1: fence_sanlock -o sanlock_init -p /dev/fence/leases
2. Start services
node1: service fence_sanlockd start
node2: service fence_sanlockd start
3. Enable fencing
node1: fence_sanlock -o on -p /dev/fence/leases -i 1
node2: fence_sanlock -o on -p /dev/fence/leases -i 2
This "unfence" step may take a couple minutes.
4. Verify hosts and leases
node1: sanlock status
s fence:1:/dev/fence/leases:0
r fence:h1:/dev/fence/leases:1048576:1 p 2465
node2: sanlock status
s fence:2:/dev/fence/leases:0
r fence:h2:/dev/fence/leases:2097152:1 p 2366
node1: sanlock host_status
lockspace fence
1 timestamp 717
2 timestamp 678
node2: sanlock host_status
lockspace fence
1 timestamp 738
2 timestamp 678
5. Fence node2
node1: fence_sanlock -o off -p /dev/fence/leases -i 2
This may take a few minutes to return.
When node2 is not dead before fencing, sanlock on node1 will log errors
about failing to acquire the lease while node2 is still alive. This is
expected.
6. Success
node1 fence_sanlock should exit 0 after node2 is reset by its watchdog.
SEE ALSO
fence_sanlockd(8), sanlock(8), wdmd(8)
2013-05-02 FENCE_SANLOCK(8)