Whatever resource is becoming unavailable, the kernel would reschedule the process once the resource returned. Although it is unlikely, this could be a hardware problem (e.g., a faulty network card).
When this problem occurs, can you check that the network interfaces of both the local and remote nodes are still up, and that the network infrastructure between them is working? The OS isn't seeing this as an error as such; otherwise it would report it. It sees it as a suspension of service whilst a resource is unavailable, pending resumption.
should give you a clue as to why the process is sleeping.
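On Linux, one way to get that clue is to read the process state and the kernel wait channel from /proc. The following is a minimal sketch rather than a definitive diagnostic: the function names and the optional proc_root argument are hypothetical, added only so the functions can be pointed at a test directory.

```shell
#!/bin/sh
# Sketch: report the scheduler state and wait channel of a process.
# Assumes a Linux-style /proc filesystem. proc_root is a hypothetical
# parameter (default /proc) used only to make the functions testable.

# proc_state PID [PROC_ROOT]: print the one-letter state (R, S, D, ...).
proc_state() {
    pid=$1
    proc_root=${2:-/proc}
    # The status file contains a line such as "State:  S (sleeping)".
    awk '/^State:/ { print $2; exit }' "$proc_root/$pid/status"
}

# proc_wchan PID [PROC_ROOT]: print the kernel function the process
# is blocked in, as exposed by /proc/<pid>/wchan.
proc_wchan() {
    pid=$1
    proc_root=${2:-/proc}
    cat "$proc_root/$pid/wchan" 2>/dev/null
}
```

For a hypothetical PID 1234, `proc_state 1234` printing `D` would indicate uninterruptible sleep, which usually means the process is waiting on I/O, and `proc_wchan 1234` names the kernel function it is waiting in.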
Hello,
I host a couple of Call of Duty gaming servers. There are some hackers who love to crash them. When they crash them it simply causes a segmentation fault and kills the PID. I was wondering if you could help me write a script to simply restart the program after it has been crashed. The... (9 Replies)
Hi Friends,
I am new to this forum as well as new to shell scripting.
I have a problem here and I need someone to solve this.
Let us consider there are two processes (abc & def). There is a script which kills these two processes (i.e. killtheprocess abc). Here abc is the argument.
There is a... (1 Reply)
I had issues with processes locking up. This script checks for processes and kills them if they are older than a certain time.
It uses some functions you'll need to define or remove, like slog() which I use for logging, and is_running() which checks if this script is already running so you can... (0 Replies)
I have two scripts on Unix; one that starts some processes and the other one for killing a process. At first, I ran the .sh without WILY in it and it worked perfectly; in this way, I could also run my stopper process. However, I need WILY in this so I added it to my script, but this time a message... (1 Reply)
Hi,
How is it possible to restart only your process? I can get the process killed but I am not able to start it.
For example, I first did this: ps -ef|grep _out -- displays all the processes with _out in the name
Then I killed it: kill -15 36044 -- process ID.
Now how can I start the same... (1 Reply)
Hi,
I'm new to Solaris. I have an issue with the ssh service. When I restart the service it exits with an exit status of 0:
$ svcadm restart svc:/network/ssh:default
$ echo $?
0
$
However, the service goes into maintenance mode after restart. I'm able to connect even though the service is in... (3 Replies)
Hi friends,
I have one Unix command which is used to check the network status manually.
Following is the command:
check_Network
This command gives the following status:
Network 1 is ok
Network 2 is ok
network 3 is ok
network 4 is ok
.
.
.
.
Network 10 is... (8 Replies)
Hi Experts,
I am facing a problem in which one process is always stuck in the running state, which causes another similar process to go into the sleep state. This leaves my system hung.
Running cat /proc/<pid>/wchan shows "__init_begin" in the output.
Can you please help me here... (6 Replies)
Discussion started by: naveeng
6 Replies
ocf_heartbeat_ethmonitor
OCF_HEARTBEAT_ETHMON(7) OCF resource agents OCF_HEARTBEAT_ETHMON(7)
NAME
ocf_heartbeat_ethmonitor - Monitors network interfaces
SYNOPSIS
ethmonitor [start | stop | status | monitor | meta-data | validate-all]
DESCRIPTION
Monitor the vitality of a local network interface.
You may set up this RA as a clone resource to monitor the network interfaces on different nodes, with the same interface name. This is not
related to the IP address or the network on which an interface is configured. You may use this RA to move resources away from a node which
has a faulty interface, or to prevent moving resources to such a node. This gives you independent control of the resources, without involving
cluster intercommunication. But it requires your nodes to have more than one network interface.
The resource configuration requires a monitor operation, because the monitor does the main part of the work. In addition to the resource
configuration, you need to configure some location constraints, based on a CIB attribute value. The name of the attribute value is
configured in the 'name' option of this RA.
Example constraint configuration:
    location loc_connected_node my_resource_grp rule $id="rule_loc_connected_node" -INF: ethmonitor eq 0
The ethmonitor works in 3 different modes to test the interface vitality.
1. call ip to see if the link status is up (if link is down -> error)
2. call ip and watch the RX counter (if packets come around in a certain time -> success)
3. call arping to check whether any of the IPs found in the local ARP cache answers an ARP REQUEST (one answer -> success)
4. return error
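The first two modes can be approximated with a short shell sketch. This is not the agent's own code: it reads the Linux sysfs files operstate and statistics/rx_packets instead of calling ip, it leaves out the arping mode, and the function names and the sys_root argument are hypothetical (sys_root exists only so the functions can be pointed at a test directory).

```shell
#!/bin/sh
# Sketch of the link-status and RX-counter checks, assuming Linux sysfs.
# Note: some interfaces (e.g. loopback) report "unknown" in operstate.

# link_is_up IFACE [SYS_ROOT]: succeed if the interface reports "up".
link_is_up() {
    iface=$1
    sys_root=${2:-/sys/class/net}
    [ "$(cat "$sys_root/$iface/operstate" 2>/dev/null)" = "up" ]
}

# rx_increased IFACE TIMEOUT [SYS_ROOT]: succeed if the RX packet
# counter grows within TIMEOUT seconds, polling once per second.
rx_increased() {
    iface=$1
    timeout=$2
    sys_root=${3:-/sys/class/net}
    counter="$sys_root/$iface/statistics/rx_packets"
    start=$(cat "$counter" 2>/dev/null) || return 1
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        sleep 1
        elapsed=$((elapsed + 1))
        now=$(cat "$counter" 2>/dev/null) || return 1
        [ "$now" -gt "$start" ] && return 0
    done
    return 1
}

# check_interface IFACE TIMEOUT [SYS_ROOT]: link down -> error,
# otherwise success only if packets arrive within TIMEOUT seconds.
check_interface() {
    link_is_up "$1" "${3:-/sys/class/net}" || return 1
    rx_increased "$1" "$2" "${3:-/sys/class/net}"
}
```

For example, `check_interface eth0 5` would mirror a pktcnt_timeout of 5 seconds for eth0; on an idle but healthy link it returns failure, which is the case the arping mode is meant to cover.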
SUPPORTED PARAMETERS
interface
The name of the network interface which should be monitored (e.g. eth0). (unique, required, string, no default)
name
The name of the CIB attribute to set. This is the name to be used in the constraints. Defaults to "ethmonitor-'interface_name'".
(unique, optional, string, no default)
multiplier
Multiplier for the value of the CIB attribute specified in parameter name. (optional, integer, default 1)
repeat_count
Specify how often the interface will be monitored, before the status is set to failed. You need to set the timeout of the monitoring
operation to at least repeat_count * repeat_interval. (optional, integer, default 5)
repeat_interval
Specify how long to wait in seconds between the repeat_counts. (optional, integer, default 10)
pktcnt_timeout
Timeout for the RX packet counter. Stop listening for packet counter changes after the given number of seconds. (optional, integer,
default 5)
arping_count
Number of ARP REQUEST packets to send for every IP. Usually one ARP REQUEST (arping) is sent. (optional, integer, default 1)
arping_timeout
Time in seconds to wait for ARP REQUESTs (all packets of arping_count). This is to limit the time for ARP requests, to be able to send
requests to more than one node, without running into the monitor operation timeout. (optional, integer, default 1)
arping_cache_entries
Maximum number of IPs from ARP cache list to check for ARP REQUEST (arping) answers. Newest entries are tried first. (optional,
integer, default 5)
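The repeat_count description above asks for a monitor timeout of at least repeat_count * repeat_interval. As a worked example of that arithmetic, here is a small sketch; the function name min_monitor_timeout is hypothetical and not part of the agent.

```shell
#!/bin/sh
# min_monitor_timeout REPEAT_COUNT REPEAT_INTERVAL
# Prints the smallest monitor timeout (in seconds) that satisfies
# timeout >= repeat_count * repeat_interval.
min_monitor_timeout() {
    echo $(( $1 * $2 ))
}

# With the documented defaults (repeat_count=5, repeat_interval=10):
min_monitor_timeout 5 10   # prints 50
```

With the default values the result is 50 seconds, so a monitor timeout shorter than that would need either a smaller repeat_count or a smaller repeat_interval.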
SUPPORTED ACTIONS
This resource agent supports the following actions (operations):
start
Starts the resource. Suggested minimum timeout: 20s.
stop
Stops the resource. Suggested minimum timeout: 20s.
status
Performs a status check. Suggested minimum timeout: 20s. Suggested interval: 10s.
monitor
Performs a detailed status check. Suggested minimum timeout: 20s. Suggested interval: 10s.
meta-data
Retrieves resource agent metadata (internal use only). Suggested minimum timeout: 5s.
validate-all
Performs a validation of the resource configuration. Suggested minimum timeout: 20s.
EXAMPLE
The following is an example configuration for an ethmonitor resource using the crm(8) shell:
    primitive p_ethmonitor ocf:heartbeat:ethmonitor \
      params \
        interface=string \
      op monitor depth="0" timeout="20s" interval="10s"
SEE ALSO
http://www.linux-ha.org/wiki/ethmonitor_(resource_agent)
AUTHOR
Linux-HA contributors (see the resource agent source for information about individual authors)
resource-agents UNKNOWN 03/09/2014 OCF_HEARTBEAT_ETHMON(7)