On the production system, /boot is mounted from the multipath device (/dev/mapper/...). We use SRDF to copy this data to a DR site, then clone the storage and boot the same hardware at the DR site: Cisco blade servers, the same models as in production.
When we boot up the DR nodes, the system understandably gets a little 'confused' and mounts /boot on a raw device:
/dev/sdy1 976M 175M 751M 19% /boot
Note /dev/sd* instead of the MPATH device /dev/mapper/***
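A quick way to confirm the symptom (findmnt and multipath -ll are the stock commands; nothing here is specific to our setup):

findmnt /boot    # shows which device /boot actually mounted from
multipath -ll    # lists the multipath maps and which /dev/sd* paths belong to them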
So to fix this:
First, I run 'multipath -W', which clears the stale WWIDs out of the /etc/multipath/wwids file:
successfully reset wwids
Now the WWIDs from the production side are gone and only the new ones from the DR site remain, so that file is now a happy file.
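For anyone following along, the whole step is just this (standard multipath-tools, default RHEL wwids path):

multipath -W                # reset the wwids file to only the maps present on this host
cat /etc/multipath/wwids    # verify only the DR-site WWIDs remain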
After that, I need to add a filter to /etc/lvm/lvm.conf to ignore all devices aside from the /dev/mapper devices. (Red Hat support suggested that I add global_filter as well; it seemed to work OK with just 'filter', but adding both didn't hurt either.)
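For reference, the filter I'm describing looks something like this in /etc/lvm/lvm.conf (the regexes are illustrative; loosen them if LVM needs to see local disks too):

devices {
    # accept anything under /dev/mapper, reject everything else
    filter = [ "a|^/dev/mapper/|", "r|.*|" ]
    global_filter = [ "a|^/dev/mapper/|", "r|.*|" ]
}

The 'a' entry accepts the mapper devices and the trailing 'r' rejects the raw /dev/sd* paths; on RHEL 7, global_filter is the one lvmetad honors, which is presumably why support suggested adding it.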
The server comes back up in either rescue or emergency mode (I'll pay more attention next time), and EACH TIME, running grub2-mkconfig fixes it and the server boots just fine.
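(The invocation is the stock one; the output path below is for BIOS boot on RHEL 7, while UEFI systems write to /boot/efi/EFI/redhat/grub.cfg instead:)

grub2-mkconfig -o /boot/grub2/grub.cfg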
I need to figure out what's going on, for my own geeky obsessiveness. The thing is, I saved a backup copy of /boot/grub2/grub.cfg and compared it to the new one generated in emergency mode, and there are zero differences. I used Notepad++ to do a file comparison, even adding a character to verify the plug-in was working right, and I can find no difference at all between the two files.
I thought that grub2-mkconfig just generated a new grub.cfg file, but it almost seems like something else is going on here as well.
It's not that I can't get these servers back online; it's just that I would like to skip the reboot into rescue mode, as we are looking to automate this process as much as possible.
We have recovered these 4 nodes a couple of times, and the process seems consistent. I just can't figure out what change grub2-mkconfig is making to the system to get it to boot!
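One way I plan to narrow it down on the next exercise is to fingerprint /boot before and after running the command (plain coreutils; the temp file names are arbitrary):

find /boot -xdev -type f -exec sha256sum {} + | sort -k2 > /tmp/boot.before
grub2-mkconfig -o /boot/grub2/grub.cfg
find /boot -xdev -type f -exec sha256sum {} + | sort -k2 > /tmp/boot.after
diff /tmp/boot.before /tmp/boot.after    # any line here is a file the command touched

If that diff also comes back empty, the fix would have to be a side effect outside /boot, and I can widen the search from there.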
To put a small disclaimer up front: I have no experience with boot-from-SAN on bare metal.
Not related to the problem, but what would really help you is to have a separately installed operating system in each environment (DC), with the data part replicated in separate volume groups.
That way, the only thing you do is import the volume group and, depending on the topology and on whether services such as VRRP or something higher-layer are used, possibly reassign an IP.
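To sketch what that import looks like once the replicated LUNs are visible at the DR site (the VG name 'datavg', the LV, and the mount point are made up for the example):

vgscan                            # rescan the newly visible devices for volume groups
vgchange -ay datavg               # activate the replicated data VG
mount /dev/datavg/lv_data /data   # mount the replicated filesystem

Plus whatever IP reassignment your topology needs on top of that.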
Personally, I have never endorsed SAN boot on bare metal (in VMs, of course); I have always had a couple of local disks to mirror...
Point being, I would go with service separation and a clean install at each of the sites, while cloning the data part via storage methods.
Hopefully someone else who has used Linux SAN boot will be of more assistance with the actual problem.
This is the method that was decided upon... we have become pretty attached to using recovered virtual machines or volumes from our production side; overall it has reduced our RTO significantly due to the ease of importing synchronized data.
Part of the long-standing issue we have had with maintaining the OS portion at the DR site is changes made in production that don't get replicated to the DR environment. Those changes are supposed to be made in both places, but they often still get missed. While there is always the 'so-and-so forgot to...' and then management whines about it, doing a full replication of even the boot environments simply factors out those human problems. For instance, a new mount point is added or an old mount point is removed... a ton of little annoying things. Then, when we test our DR processes, these 'little things' that we were unaware had changed can cost us large amounts of time. Lately, we have been consolidating zpools on Solaris, and every time we test, we run into issues with them.
But overall this works very well. Everything is 100% except for this /boot MPATH issue, and honestly, the systems will run on a single path just fine. But you know... that's not good enough for a geek such as myself! lol