Prediction of failures

08-07-2009

Registered User

148, 1

Join Date: Apr 2008

Last Activity: 14 June 2013, 7:12 AM EDT

Location: EMEA region

Posts: 148

Thanks Given: 13

Thanked 1 Time in 1 Post

Interesting... I'll be looking at these tools. Thanks, guys!

Sun Fire

08-07-2009

Registered User

2,693, 19

Join Date: May 2008

Last Activity: 24 August 2014, 5:15 AM EDT

Location: SINGAPORE.. The "FINE" City

Posts: 2,693

Thanks Given: 1

Thanked 19 Times in 19 Posts

Quote:

Originally Posted by jlliagre

It looks like both of you overlooked the second part of my previous reply. The tools you are looking for already exist and are included with Solaris.

Some more links:

Solaris Fault Manager (Solaris 10 What's New) - Sun Microsystems
Getting notified when hardware breaks
SCSI DISK FMA Project Part 1: SCSI Device Drivers as FMA Telemetry Detectors

You still don't get my point. I want prevention rather than reactive action after things happen.

incredible

08-07-2009

Registered User

4,940, 703

Join Date: Dec 2007

Last Activity: 4 October 2020, 5:57 PM EDT

Location: Outside Paris

Posts: 4,940

Thanks Given: 20

Thanked 703 Times in 595 Posts

You are still missing mine. Unless you expect a crystal ball to predict what will happen in the future with currently healthy components, the only reasonable way to prevent their future faults is by monitoring events coming from them. This is what the Solaris Fault Manager (FMA) is designed to do.
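As a rough illustration of the monitoring approach described here, the sketch below polls the fault manager through the `fmadm faulty` command and raises a notification when it reports anything. The helper names `fault_report` and `check_once` are hypothetical, and the sketch assumes that empty output means "nothing diagnosed as faulty"; some older releases may print a header even when there are no faults, so treat that check as an assumption to verify on your system. The command normally needs sufficient privileges to run.

```python
import subprocess


def fault_report(run=subprocess.run):
    """Return whatever `fmadm faulty` prints, with surrounding whitespace removed.

    `run` is injectable so the logic can be exercised without a Solaris host.
    """
    result = run(["fmadm", "faulty"], capture_output=True, text=True, check=False)
    return result.stdout.strip()


def check_once(run=subprocess.run, notify=print):
    """Call `notify` with the report if the fault manager lists anything.

    Assumption: empty output means no resources are currently diagnosed
    as faulty. Returns True when a notification was sent.
    """
    report = fault_report(run)
    if report:
        notify("Fault manager reports faulty resources:\n" + report)
        return True
    return False
```

Calling `check_once()` from a scheduled job, with `notify` replaced by a function that sends mail or pages someone, would be one way to hear about a diagnosed fault soon after it is recorded.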

Alternatively, if your goal is really to react to something that hasn't happened yet, you can proactively replace each disk after a period of use significantly shorter than its MTBF.
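To put numbers on the MTBF idea, here is a small sketch that estimates the chance of a disk failing within a given service period. It assumes a constant failure rate (the exponential model that MTBF figures usually imply), which ignores infant mortality and wear-out, so it is only a first approximation. The function names and the figures in the usage note are illustrative, not vendor data.

```python
import math

HOURS_PER_YEAR = 8760


def failure_probability(service_hours, mtbf_hours):
    """Probability that one disk fails within service_hours.

    Assumes a constant failure rate of 1 / mtbf_hours.
    """
    return 1.0 - math.exp(-service_hours / mtbf_hours)


def expected_failures(n_disks, service_hours, mtbf_hours):
    """Expected number of failed disks in a fleet over the same period."""
    return n_disks * failure_probability(service_hours, mtbf_hours)
```

For example, `failure_probability(3 * HOURS_PER_YEAR, 1_000_000)` gives the estimated chance that a disk with a hypothetical 1,000,000-hour MTBF fails within three years of continuous use.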

If you just care about your data, use something like RAIDZ2 with hot spares. Your system will happily survive two disks failing at the same time and will automatically replace them with the spares.
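The benefit of double parity can be estimated with a simple binomial calculation. The sketch below assumes each disk fails independently with the same probability during some window (for instance, the time until a failed disk has been rebuilt onto a spare) and computes the chance that more disks fail than the parity level can absorb. Independence is an assumption; disks from the same batch often fail together, so real risk can be higher. The function name is hypothetical.

```python
from math import comb


def pool_loss_probability(n_disks, p_fail, parity=2):
    """Probability that more than `parity` of n_disks fail in the same window.

    p_fail is the per-disk failure probability for that window;
    parity=2 corresponds to a double-parity layout such as RAIDZ2.
    """
    survive = sum(
        comb(n_disks, k) * p_fail**k * (1.0 - p_fail) ** (n_disks - k)
        for k in range(parity + 1)
    )
    return 1.0 - survive
```

Comparing `pool_loss_probability(8, p, parity=1)` with `pool_loss_probability(8, p, parity=2)` for a small `p` shows how much an extra parity disk reduces the risk under these assumptions.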

jlliagre

08-07-2009

Yes, I agree: there's no magic way to predict each and every hardware failure.

If the data is that critical, you should invest more in redundancy, HA, and RAS.

Sun Fire

08-09-2009

Thanks for your valuable feedback.

incredible

08-10-2009

One more question:

After I finished the installation, the customer asked me to run a "network stress test".

Any ideas?

Sun Fire
