FileSystems under HACMP


 
# 1  
Old 11-14-2017

Dear Fellows,

I'm working on an HACMP cluster (version 7.1) with 2 nodes (Node1 active / Node2 passive) and 1 resource group on the active node (Node1), which is currently UNMANAGED on both nodes.
So all data VGs are on Node1.
One of these data VGs, called "VG_Clust", had a full JFS2 filesystem that I had to enlarge, and I did it with plain LVM commands (chfs), not the CSPOC ones (cl_chfs); this worked fine.

This "VG_Clust" has "read/write" permission, is "Enhanced-Capable", and its VG mode is "Concurrent".

In case of a failover to Node2, do I need to run "Verify and Synchronize HACMP", or trigger a failover from Node1?

Does this require HACMP to be down?

Could you confirm whether the steps below are correct?
smitty hacmp -> Custom Cluster Configuration -> Verify and Synchronize Cluster Configuration (Advanced)

Thanks for your kind reply.
# 2  
Old 11-15-2017
Assuming you have paid-for support from IBM, that would probably be the best option: open a PMR, send them a snap, and give them all the details from here.

I don't know if your cluster is in production (or productive, i.e. non-prod but still important) use, but if it is, they will help you avoid risks of downtime, and if something goes wrong, management will have a contractual place to lay blame rather than just yourself.


Sorry to shy away, but I no longer work with AIX clusters, so I can't explore this for you.




Kind regards,
Robin

# 3  
Old 11-17-2017
Any thoughts on AIX clusters would still be much appreciated.
Thanks rbatte1
# 4  
Old 11-22-2017
It seems you do not really understand how an HACMP cluster works, so a few words of clarification. Bear with me if this is already known. Also note that I will leave out a lot of details, as I can't write a complete PowerHA documentation here.

OK, let us start with the central term in HACMP, which is "resource group". What is it?

Look at an application, say, a database: for it to run you first need some file systems where the DB files are stored. Then you need some processes (the DB process(es) running); basically, the application has to be started. Finally you need an IP address under which clients from outside can connect to the database and use its services.

Exactly these three components (file systems, started processes and an IP address) are what a "resource group" consists of.

File systems: one or more volume groups go into a resource group. When a resource group goes active, all these VGs are activated on one cluster node and all the file systems in them are mounted there. In case of a resource group move, the FSs are unmounted and the VGs deactivated on that node, then activated on another node and all the FSs mounted there. HACMP does this by itself for all the VGs defined in a resource group.

Processes: for each resource group there is a so-called "application server", a collection of a start and a stop script. Whenever a resource group is deactivated, the stop script is executed. It should make sure the application is down, so that afterwards the file systems can be unmounted. When an RG is activated, its start script is executed and should start the application. These start/stop scripts are provided by you and are simple shell scripts, so it is easy to integrate all sorts of applications into HACMP.
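To make this concrete, here is a minimal sketch of such a start/stop script; the application name and the commands inside the functions are placeholders (assumptions) that you would replace with your real application's start and stop commands:

```shell
#!/bin/sh
# Minimal HACMP start/stop script skeleton (placeholders only).

APP_NAME="mydb"     # hypothetical application name

start_app() {
    # Real script: start the application here, e.g. su - db2inst1 -c "db2start"
    echo "starting $APP_NAME"
}

stop_app() {
    # Real script: stop the application and make sure nothing still holds
    # the filesystems open, or the unmount during an RG move will fail.
    echo "stopping $APP_NAME"
}

# Dispatch only when called with an argument, as HACMP does.
if [ $# -gt 0 ]; then
    case "$1" in
        start) start_app ;;
        stop)  stop_app ;;
        *)     echo "usage: $0 {start|stop}" >&2; exit 1 ;;
    esac
fi
```

HACMP only checks the exit status, so make sure the script returns 0 on success and non-zero on failure.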

Finally, the IP address: every RG can have one (or several, but typically one) "service address". These service addresses are normal IP addresses which are added to a certain network adapter when the RG starts and removed when it stops. Technically they are IP aliases added to network interfaces.

An RG start now works like this: the VGs are acquired (varyon), the filesystems are mounted, then the start script of the application server is executed. Finally the service IP address is put onto a network interface and the clients can use it to connect to the application.

If the RG is moved, the service IP is taken down, the stop part of the application server stops the application, FSs are unmounted and VGs deactivated, then the start procedure is done on another node. The clients will notice that the service IP is (after a short time) available again. If a node crashes, the same as in an RG move happens, only the stop part is skipped (obviously). HACMP can handle that, but you may need to take care of the application part in your start script, like a cleanup in a DB after a sudden system shutdown, etc.
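Under the hood, the RG start sequence corresponds roughly to the following commands; the VG name, mount point, script path, interface and address below are made-up placeholders, and in a real cluster HACMP's event scripts run these steps for you, so never run them by hand on a live cluster:

```shell
# 1. Acquire the volume group and mount its filesystems
varyonvg VG_Clust
mount /data/db

# 2. Start the application (the start part of the application server)
/usr/local/cluster/start_mydb.sh

# 3. Put the service IP onto a network interface as an alias
ifconfig en0 alias 10.1.1.50 netmask 255.255.255.0
```

An RG move is the same sequence in reverse on one node (when possible) and forward on the other.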

When your RG is in the state "UNMANAGED" it means that its FSs, processes, etc. are there, but no longer controlled by HACMP. Stop it using HACMP so that it becomes "OFFLINE" (meaning: not active on any node), then bring it online again. Now it should be in status "ONLINE" on a certain node. You can move it to another node from there.

A word about CSPOC: you should absolutely, positively use these commands, not the normal commands, to do LVM management. The reason is that for all the components I talked about above to work, all the cluster nodes need to share consistent information about what the parts of the resource groups look like. The cluster commands do the same as the normal commands, but they distribute the changed information to the other nodes too. THIS IS VITAL!
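As an illustration (filesystem name and size are placeholders, and the exact CSPOC path may vary between PowerHA releases), the cluster-aware counterpart of a plain chfs call looks like this:

```shell
# not cluster-aware: only this node learns about the new size
chfs -a size=+2G /data/db

# CSPOC equivalent: same resize, but the updated LVM information
# is propagated to all cluster nodes as well
/usr/es/sbin/cluster/sbin/cl_chfs -a size=+2G /data/db
```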

You can get away with doing plain LVM operations if you do a "learning import" on the passive node afterwards, and possibly a cluster synchronisation too. But why take such risks if there are commands that do exactly this without any risk involved at all?
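For illustration, such a "learning import" on the passive node would look roughly like this (the VG and hdisk names are placeholders for your environment):

```shell
# re-read the VG metadata from the disk itself and update this node's
# ODM to match the changes made on the other node ("learning" import)
importvg -L VG_Clust hdisk4
```

followed, if needed, by a Verify and Synchronize of the cluster configuration.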

I hope this helps.

bakunin

# 5  
Old 11-23-2017
Hello Bakunin,
Thanks for your kind reply.

Actually, in my low-budget customer environment, this HACMP cluster is only configured and used when needed; that's why both nodes are UNMANAGED at the moment.
And which of the risks you mentioned above could happen when using LVM commands instead of the CSPOC ones?

Kind Regards
# 6  
Old 11-23-2017
So, do they both have access to a shared disk? The volume group is the smallest disk entity that you can define to share between them, so you can't usually have one logical volume/filesystem accessed on NodeA with a different one in the same VG accessed on NodeB.

If you force the issue, you can have both servers accessing the shared disk at the same time, but as you can imagine, there will be conflicts because there is no locking between them. Imagine that NodeA reads a directory. NodeB then updates it. NodeA is not aware (because it has cached it) and may make a different change that NodeB is then not aware of. There will very quickly be conflicts over the free block list, file names, timestamps etc. It is possible that replaced files will be seen separately and then have random parts overwritten as time progresses. You will end up with a badly corrupted filesystem that will require 'fixing', but it is pot luck what gets salvaged and what is lost/damaged.

Can you describe what resources you have? Do you have a shared IP address that clients connect to and you can move to the 'Active' node?


If you want an Active-Active style cluster for load balancing, you may be looking at Oracle RAC (with the associated costs), or you might achieve it with more servers: the servers running the application that needs the data would NFS-mount it from an HA cluster set up to serve the disk, with the cluster handling failover of the volume group and the IP address that the application servers connect to. The NFS mount on the application servers will wait if the NFS server (which appears to them as a single server) goes away, and should recover when it is made available again (probably by the other node).
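On the application servers, such an NFS mount would typically use hard mounts with background retry so that it survives a takeover; the server name and paths below are placeholders:

```shell
# AIX: mount the shared data from the HA-served service address
mount -o bg,hard,intr nfssvc:/export/data /data

# bg   = keep retrying the mount in the background if the server is away
# hard = block (rather than fail) I/O while the server is unreachable,
#        so the application resumes when the standby node takes over
```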

Of course, there is then the performance cost of NFS if that is an issue to you.


A better description of your configuration and application needs might get you a more useful response.



Kind regards,
Robin
# 7  
Old 11-26-2017
Quote:
Originally Posted by LoLo92
Actually, in my low-budget customer environment, this HACMP cluster is only configured and used when needed
What do you mean by that? The whole point of a cluster is high availability: if one of the nodes breaks, the application still runs. If you knew in advance when your node was going to break, you wouldn't need a cluster at all (although I don't believe such astute foretelling skills exist).

Quote:
Originally Posted by LoLo92
that's why both nodes are UNMANAGED at the moment.
I don't understand this. "Nodes" are the systems taking part in the cluster. They cannot be "unmanaged"; they can only have their cluster services started ("joined the cluster") or not.

"Unmanaged" is a state only a resource group can be in.

Quote:
Originally Posted by LoLo92
And which of the risks you mentioned above could happen when using LVM commands instead of the CSPOC ones?
I thought I described that in some detail: you have a cluster for the situations where something has (quite drastically) gone wrong. To make it possible that filesystems, volumes, etc. are taken over safely and started on the other node, the nodes share the information about how these FSes, LVs, etc. are built and in which state exactly they are right now. If you make changes to an LV (like increasing its size, etc.) using normal LVM commands, this information will not be propagated to the other nodes, because these commands are not cluster-aware. If you use the respective CSPOC commands, which indeed are cluster-aware, they will do the same as the normal LVM commands but also use the cluster's communication services (RSCT) to propagate the changed information to the other nodes immediately.

Again, you can get away with using "learning imports" on the other nodes to make the information consistent again, but why not just use the cluster commands, which do that automatically?

I hope this helps.

bakunin