Hi. I found an issue with the appvg volume group on my server. The server is a single node and not part of HACMP.
The two filesystems below (/websp and /opt/websp) belong to appvg:
/dev/fslv06 5.00 1.92 62% 4491 1% /websp
/dev/fslv07 10.00 4.20 58% 37905 4% /opt/websp
I guess there is something wrong with the hdisks under appvg.
Can someone tell me which hdisk (hdisk1 or hdisk2) under appvg is having a problem?
Also, what should I do to fix this quorum issue?
Please let me know if you need the output of any commands from this server. I have informed the application team that
I need downtime to fix the issue on this server and I'm waiting for their reply. I'm afraid that I may lose the data present under appvg.
The system told you to issue a "varyoffvg" and then a "varyonvg". Have you done that? What was the outcome? Were there any error messages?
Which disk (if a disk at all) may have caused the problem I can't tell from here, because my line of sight to Bangalore is blocked and my crystal ball is in repair.
I suggest you start advanced troubleshooting instead, by applying your reading skills to the OS output. Your data are as safe as they can be, given the circumstances: an inactive VG with only inaccessible filesystems can't get any worse than it already is. Either you can revive it or the data on it is already lost.
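The revival attempt the system suggested would look roughly like this. This is a sketch only, assuming the VG is appvg and the filesystem names from your df output; whether a plain varyon succeeds depends on the state of the disks:

```shell
# Unmount the filesystems first, if they still show as mounted
umount /websp
umount /opt/websp

# Deactivate, then reactivate the volume group
varyoffvg appvg
varyonvg appvg

# If a disk is missing the normal varyon will fail; a forced varyon
# is possible but should only be done once you understand what failed:
# varyonvg -f appvg
```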
You should really describe your problem better / more exactly. The following was in no way obvious from your first posting:
Quote:
Originally Posted by newtoaixos
before i do the varyoffvg i need to unmount the 2 filesystems present under that VG.
Are the FSs mounted and accessible? What is the output of:
Somehow I doubt that the filesystems are still available when the VG has been closed.
What does "errpt" tell you? The "quorum" is the minimum number of disks that have to be present to make a VG valid. Once fewer disks than this quorum are present, the VG is forced offline, which means all the FSs belonging to it are unmounted (which is why I doubt they really are there). Further, there must be some entry in the "errpt" log regarding an hdisk device failing, otherwise the quorum wouldn't have been lost.
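A minimal way to check both things at once, using the VG and disk names from your posting:

```shell
# Error log summary, newest entries first
errpt | head

# Full detail for anything logged against the two disks
errpt -a -N hdisk1
errpt -a -N hdisk2

# Quorum setting and per-disk state of the VG
lsvg appvg       # look for the "QUORUM:" line
lsvg -p appvg    # PV STATE column: active / missing / removed
```

A disk shown as "missing" in `lsvg -p` together with a matching errpt entry identifies the failing disk without any guesswork.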
Quote:
I have informed the application team already about this issue. probably when they give me the downtime I will unmount and then try to varyoff the VG.
NO!
When they decided to commission a system where disks are not redundant they forfeited any right to have an uninterruptible service. Hardware fails from time to time, that is old news. Either you have redundant hardware (regardless of what it is: network cards, disks, processors, power supplies, ...), so that when one part fails the other is still there, or you have non-redundant hardware: then you have to expect the service to be interrupted from time to time. Everything else is "wash me, but don't make me wet in the process": rubbish. No admin in his right mind lets himself get into such a double-bind situation.
Your 2 disks could not have been redundant, because in that case the quorum should have been deactivated: a VG consisting of two mirrored disks is safe even if only one of these disks is present. (If the disks were indeed mirrored: I suggest firing the idiot who configured such horse manure on the spot, for proven incompetence.)
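For reference, this is how a two-disk VG should have been set up so that it survives the loss of one disk. A sketch, assuming appvg again; `mirrorvg` is only needed if the LVs are not already mirrored:

```shell
# Mirror every LV in the VG onto the second disk (skip if already mirrored)
# mirrorvg appvg

# Switch the quorum check off so one surviving copy keeps the VG online
chvg -Qn appvg

# The change takes effect at the next varyonvg; verify with:
lsvg appvg | grep -i quorum
```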
Additional question: what are these disks? LUNs? (provided via VIOS?, NPIV? other?) Physical disks? RAID-sets? Show the output of these commands:
Quote:
lsdev -Cc disk
lsattr -El hdisk1
lsattr -El hdisk2
Background is: is there a chance that the unavailability of the disk(s) is temporary in nature? It might work if you simply retry the varyonvg once the disk is reachable again.
The output confirms that hdisk1 is the cause of your problem. Your volume group is definitely offline, and gone with it are the filesystems it may (once) have contained. If they appear to be still mounted: don't believe it, they are gone.
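A quick way to see this for yourself, again assuming the names from the thread:

```shell
# Lists only the varied-on (active) VGs; appvg should be absent
lsvg -o

# Entries here can be stale once the VG has been forced offline
mount | grep websp

# An actual access will error out (or hang) if the FS is really gone
df -g /websp
```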
What you see here is a description of the disk (hdisk1) in increasing detail:
Quote:
Originally Posted by newtoaixos
And this is the probable cause for hdisk1 failing. I SNIPped to the interesting part:
Quote:
Originally Posted by newtoaixos
Looks like your SCSI disk was failing somehow - this could be everything from a broken cable, a terminator gone bad, to the disk itself becoming broken. First, make sure that the SCSI link is up again. Delete the hdisk1 device and run "cfgmgr" to rediscover it. If it won't come back, the disk is not connected (or broken); if it is in status "Available", the disconnection is gone. You still should investigate, because a symptom gone is not a problem solved. Find the reason for the disconnection; only this will solve your problem.
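The delete-and-rediscover step sketched as commands (assuming hdisk1, as in your output):

```shell
# Remove the failed disk's device definition from the ODM
rmdev -dl hdisk1

# Walk the buses and rediscover attached devices
cfgmgr

# Check whether the disk came back and in which state
lsdev -Cc disk
```

If hdisk1 reappears as "Available", retry the varyonvg; if it does not reappear at all, the problem is on the cabling/hardware side, not in the OS.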
Still, don't be shy to start repair action - this server will do nothing without the data necessary for carrying out its function anyway. If business complains: see above. If they are too greedy to pay for mirrored disks they will have to live with failing ones and the time necessary for repair. If the disks are indeed mirrored whoever forgot to (un)set the quorum is to blame and business will have every right to be angry. This is administration basics and should not happen at all.