Failure rate of a node / Data center
Post 303020079 by chercheur111 on Thursday 12th of July 2018 05:37:42 PM
What operating system are you using?


Linux OS (Ubuntu distribution)


What shell are you using?

Bash.

How do you expect to deduce a failure rate from a single point in time? Are you instead maybe looking for a percentage of network node failures at this point in time?


This is only a simple example. I will generate a history covering several days.


What output are you hoping to produce from the sample input you have provided?

The MTBF (mean-time-between-failures) of each node.


What have you tried on your own to get the output you want?
 

scds_fm_action(3HA)					 Sun Cluster HA and Data Services				       scds_fm_action(3HA)

NAME
     scds_fm_action - take action after probe completion function

SYNOPSIS
     cc [flags...] -I /usr/cluster/include file -L /usr/cluster/lib -l dsdev
     #include <rgm/libdsdev.h>

     scha_err_t scds_fm_action(scds_handle_t handle, int probe_status,
         long elapsed_milliseconds);

DESCRIPTION
     The scds_fm_action() function uses the probe_status of the data service in conjunction with the past history of failures to take one of the following actions:

     o  Restart the application.
     o  Fail over the resource group.
     o  Do nothing.

     Use the value of the input probe_status argument to indicate the severity of the failure. For example, you might consider a failure to connect to an application as a complete failure, but a failure to disconnect as a partial failure. In the latter case you would have to specify a value for probe_status between 0 and SCDS_PROBE_COMPLETE_FAILURE.

     The DSDL defines SCDS_PROBE_COMPLETE_FAILURE as 100. For partial probe success or failure, use a value between 0 and SCDS_PROBE_COMPLETE_FAILURE.

     Successive calls to scds_fm_action() compute a failure history by summing the value of the probe_status input parameter over the time interval defined by the Retry_interval property of the resource. Any failure history older than Retry_interval is purged from memory and is not used towards making the restart or failover decision.

     The scds_fm_action() function uses the following algorithm to choose which action to take:

     Restart
          If the accumulated history of failures reaches SCDS_PROBE_COMPLETE_FAILURE, scds_fm_action() restarts the resource by calling the STOP method of the resource followed by the START method. It ignores any PRENET_START or POSTNET_STOP methods defined for the resource type.

          The status of the resource is set to SCHA_RSSTATUS_DEGRADED by making a scha_resource_setstatus() call, unless the resource is already set.

          If the restart attempt fails because the START or STOP methods of the resource fail, a scha_control() is called with the GIVEOVER option to fail the resource group over to another node or zone. If the scha_control() call succeeds, the resource group is failed over to another cluster node or zone, and the call to scds_fm_action() never returns.

          Upon a successful restart, failure history is purged. Another restart is attempted only if the failure history again accumulates to SCDS_PROBE_COMPLETE_FAILURE.

     Failover
          If the number of restarts attempted by successive calls to scds_fm_action() reaches the Retry_count value defined for the resource, a failover is attempted by making a call to scha_control() with the GIVEOVER option.

          The status of the resource is set to SCHA_RSSTATUS_FAULTED by making a scha_resource_setstatus() call, unless the resource is already set.

          If the scha_control() call fails, the entire failure history maintained by scds_fm_action() is purged.

          If the scha_control() call succeeds, the resource group is failed over to another cluster node or zone, and the call to scds_fm_action() never returns.

     No Action
          If the accumulated history of failures remains below SCDS_PROBE_COMPLETE_FAILURE, no action is taken. In addition, if the probe_status value is 0, which indicates a successful check of the service, no action is taken, irrespective of the failure history.

          The status of the resource is set to SCHA_RSSTATUS_OK by making a scha_resource_setstatus() call, unless the resource is already set.

PARAMETERS
     The following parameters are supported:

     handle
          The handle that is returned from scds_initialize(3HA).

     probe_status
          A number you specify between 0 and SCDS_PROBE_COMPLETE_FAILURE that indicates the status of the data service. A value of 0 implies that the recent data service check was successful. A value of SCDS_PROBE_COMPLETE_FAILURE means complete failure and implies that the service has completely failed. You can also supply a value in between 0 and SCDS_PROBE_COMPLETE_FAILURE that implies a partial failure of the service.

     elapsed_milliseconds
          The time, in milliseconds, to complete the data service check. This value is reserved for future use.

RETURN VALUES
     The scds_fm_action() function returns the following values:

     0         The function succeeded.

     nonzero   The function failed.

ERRORS
     SCHA_ERR_NOERR
          No action was taken, or a restart was successfully attempted.

     SCHA_ERR_FAIL
          A failover attempt was made but it did not succeed.

     SCHA_ERR_NOMEM
          System is out of memory.

FILES
     /usr/cluster/include/rgm/libdsdev.h
          Include file

     /usr/cluster/lib/libdsdev.so
          Library

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

     +-----------------------------+-----------------------------+
     |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWscdev                    |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Evolving                     |
     +-----------------------------+-----------------------------+

SEE ALSO
     scds_fm_sleep(3HA), scds_initialize(3HA), scha_calls(3HA), scha_control(3HA), scha_fm_print_probes(3HA), scha_resource_setstatus(3HA), attributes(5)

Sun Cluster 3.2                     7 Sep 2007                     scds_fm_action(3HA)
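
The restart/failover decision described in the DESCRIPTION section of scds_fm_action(3HA) can be sketched as a small standalone model. This is only an illustration of the documented behaviour, not the DSDL implementation, and it does not call any Sun Cluster function. Every name in it (fm_model_t, fm_model_step, the MODEL_* constants) is hypothetical. It also makes assumptions the man page does not state: timestamps are passed in by the caller in milliseconds, every restart is treated as successful, the restart counter is never aged out, and a failover is treated as the "scha_control() failed" case, so the history is purged and the call returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real constant lives in <rgm/libdsdev.h>. */
#define MODEL_COMPLETE_FAILURE 100  /* man page: SCDS_PROBE_COMPLETE_FAILURE is 100 */
#define MODEL_MAX_HISTORY      128  /* enough: history is purged on reaching 100 */

typedef enum { MODEL_NO_ACTION, MODEL_RESTART, MODEL_FAILOVER } model_action_t;

typedef struct {
    long   retry_interval_ms;  /* stands in for the Retry_interval property */
    int    retry_count;        /* stands in for the Retry_count property */
    int    restarts;           /* restarts attempted so far */
    size_t n;                  /* number of recorded failures */
    long   when_ms[MODEL_MAX_HISTORY];
    int    status[MODEL_MAX_HISTORY];
} fm_model_t;

/* Feed one probe result into the model and report the action it implies. */
model_action_t fm_model_step(fm_model_t *m, long now_ms, int probe_status)
{
    size_t i, kept = 0;
    int sum = 0;

    assert(probe_status >= 0 && probe_status <= MODEL_COMPLETE_FAILURE);

    /* Purge failures older than the retry interval. */
    for (i = 0; i < m->n; i++) {
        if (now_ms - m->when_ms[i] <= m->retry_interval_ms) {
            m->when_ms[kept] = m->when_ms[i];
            m->status[kept] = m->status[i];
            kept++;
        }
    }
    m->n = kept;

    /* A successful probe means no action, irrespective of the history. */
    if (probe_status == 0)
        return MODEL_NO_ACTION;

    /* Record the new failure, then sum what is left in the window. */
    assert(m->n < MODEL_MAX_HISTORY);
    m->when_ms[m->n] = now_ms;
    m->status[m->n] = probe_status;
    m->n++;
    for (i = 0; i < m->n; i++)
        sum += m->status[i];

    if (sum < MODEL_COMPLETE_FAILURE)
        return MODEL_NO_ACTION;

    /* Threshold reached: the history is purged in both remaining cases. */
    m->n = 0;
    if (m->restarts >= m->retry_count) {
        m->restarts = 0;  /* assumption: a failed failover also clears the count */
        return MODEL_FAILOVER;
    }
    m->restarts++;
    return MODEL_RESTART;
}
```

With a zero-initialized fm_model_t, retry_interval_ms set to 10000 and retry_count set to 2, two probes of 50 one second apart give MODEL_NO_ACTION and then MODEL_RESTART. Two further complete failures (probe_status 100) give a second MODEL_RESTART and then MODEL_FAILOVER. A partial failure that is older than the retry interval no longer counts towards the sum.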