Solaris 10 - Cluster Problem


 
# 8  
Old 10-26-2008
I was away for a while ... enjoying nice autumn weather here...

Well, you got to the reboot stage. Good.
Strange that the cluster does not come back after the reboot...

I think you no longer receive the first error regarding communication problems, do you? Can you see any error message now? The log you provided looks fairly OK, so what happens after the reboot? Sun Cluster is rather sensitive and talkative when it boots.

If you can't see an exact error message, I will try to give you some basic guidance.
You sound like someone familiar with Solaris, but you said you're a newbie, so maybe you're making some simple mistake, like I did several times when setting up my Sun clusters :)

1/ Are you sure the quorum device is picked up?
Do the cldevice list -v, cldevice refresh and clquorum list commands return anything other than an error?

2/ Chances are your cluster is still in "installmode". scinstall gives a very clear statement once it has completed all the initial configuration and left "installmode".

3/ Are your /etc/hosts, /etc/nsswitch.conf, /etc/resolv.conf and /etc/domainname files identical? Make them identical, dot for dot, on all nodes. I once solved a problem by re-ordering lines, although I wasn't able to reproduce it and it was in my early days of playing with Sun Cluster.

4/ Are you sure that the underlying storage is connected correctly?
Perhaps have a chat about basic cluster concepts (shared storage) with someone experienced?
Verify that both nodes can see the same disks: play with cfgadm -al, luxadm display, cfgadm -al -o show_FCP_dev and format on both nodes, plus probe-scsi-all at the OpenBoot ok prompt, to make sure they see the same storage.

5/ Did you label the disks?

6/ Did you slice the disks and then mount the filesystems identically?
Disks have to be sliced identically, but SVM names (/dev/md/dsk/d??) have to be unique within the cluster!

7/ Last resort: do you work for a Sun Service Partner or have a close relationship with one?
I am thinking of the EIS (Enterprise Installation Standards) DVD, which would greatly help you in setting this up.

8/ (Maybe this should go first): are you using a fairly new Solaris 10 update?
Forget about the early first and second releases and use something fresh and patched. Oh, I mentioned patches... a large topic. Install lots of patches: recommended, security, and finally the Sun Cluster patches (they're not free).
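
For point 3/ above, here is a minimal sketch of how the comparison could be automated. It assumes you have already copied the four files from each node into two local directories (the layout <dir>/etc/hosts and so on, and the function name compare_node_files, are my own invention for illustration, not part of any Sun tool):

```python
import difflib
from pathlib import Path

# Files named in point 3/; paths are relative to each node's root.
FILES = ["etc/hosts", "etc/nsswitch.conf", "etc/resolv.conf", "etc/domainname"]


def compare_node_files(dir_a, dir_b, files=FILES):
    """Return {file: unified-diff lines} for every file that differs.

    dir_a and dir_b are local directories holding copies of the files
    fetched from each node (hypothetical layout: <dir>/etc/hosts, ...).
    A missing file is treated as empty, so it shows up as a difference
    whenever the other node's copy has content.
    """
    differences = {}
    for name in files:
        path_a, path_b = Path(dir_a) / name, Path(dir_b) / name
        lines_a = path_a.read_text().splitlines() if path_a.exists() else []
        lines_b = path_b.read_text().splitlines() if path_b.exists() else []
        # Compare line by line, so even a re-ordered line is reported.
        if lines_a != lines_b:
            differences[name] = list(
                difflib.unified_diff(
                    lines_a, lines_b,
                    fromfile=f"node1:/{name}",
                    tofile=f"node2:/{name}",
                    lineterm="",
                )
            )
    return differences
```

Calling compare_node_files("node1_copy", "node2_copy") returns an empty dict when the files match line for line; otherwise print the diff lines of each entry to see exactly where the nodes disagree.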

When I install a cluster, I install just the cluster itself first, and only after that do I set up the application-related things (quorum, IPMP, IP, HAStoragePlus, various agents, etc.).

I'll give you one of my install-logs below:

Code:
[root@node1:/]# scinstall 

  *** Main Menu ***

    Please select from one of the following (*) options:

      * 1) Create a new cluster or add a cluster node
        2) Configure a cluster to be JumpStarted from this install server
        3) Manage a dual-partition upgrade
        4) Upgrade this cluster node
        5) Print release information for this cluster node

      * ?) Help with menu options
      * q) Quit

    Option:  
    Option:  1


  *** New Cluster and Cluster Node Menu ***

    Please select from any one of the following options:

        1) Create a new cluster
        2) Create just the first node of a new cluster on this machine
        3) Add this machine as a node in an existing cluster

        ?) Help with menu options
        q) Return to the Main Menu

    Option:  1


  *** Create a New Cluster ***


    This option creates and configures a new cluster.

    You must use the Java Enterprise System (JES) installer to install 
    the Sun Cluster framework software on each machine in the new cluster 
    before you select this option.

    If the "remote configuration" option is unselected from the JES 
    installer when you install the Sun Cluster framework on any of the 
    new nodes, then you must configure either the remote shell (see 
    rsh(1)) or the secure shell (see ssh(1)) before you select this 
    option. If rsh or ssh is used, you must enable root access to all of 
    the new member nodes from this node.

    Press Control-d at any time to return to the Main Menu.


    Do you want to continue (yes/no) [yes]?  


  >>> Typical or Custom Mode <<<

    This tool supports two modes of operation, Typical mode and Custom. 
    For most clusters, you can use Typical mode. However, you might need 
    to select the Custom mode option if not all of the Typical defaults 
    can be applied to your cluster.

    For more information about the differences between Typical and Custom 
    modes, select the Help option from the menu.

    Please select from one of the following options:

        1) Typical
        2) Custom

        ?) Help
        q) Return to the Main Menu

    Option [1]:  1


  >>> Cluster Name <<<

    Each cluster has a name assigned to it. The name can be made up of 
    any characters other than whitespace. Each cluster name should be 
    unique within the namespace of your enterprise.

    What is the name of the cluster you want to establish [frontend]?  


  >>> Cluster Nodes <<<

    This Sun Cluster release supports a total of up to 16 nodes.

    Please list the names of the other nodes planned for the initial 
    cluster configuration. List one node name per line. When finished, 
    type Control-D:

    Node name:  node1
    Node name:  node2
    Node name (Control-D to finish):  ^D


    This is the complete list of nodes:

        node1
        node2

    Is it correct (yes/no) [yes]?  


    Attempting to contact "node2" ... done

    Searching for a remote configuration method ... done

    The secure shell (see ssh(1)) will be used for remote execution.

    
Press Enter to continue:  


  >>> Cluster Transport Adapters and Cables <<<

    You must identify the cluster transport adapters which attach this 
    node to the private cluster interconnect.

    Select the first cluster transport adapter for "node1":

        1) bge1
        2) bge2
        3) bge3
        4) Other

    Option:  2

    Will this be a dedicated cluster transport adapter (yes/no) [yes]?  

    Searching for any unexpected network traffic on "bge2" ... done
    Verification completed. No traffic was detected over a 10 second 
    sample period.

    Select the second cluster transport adapter for "node1":

        1) bge1
        2) bge2
        3) bge3
        4) Other

    Option:  3

    Will this be a dedicated cluster transport adapter (yes/no) [yes]?  

    Searching for any unexpected network traffic on "bge3" ... done
    Verification completed. No traffic was detected over a 10 second 
    sample period.



  >>> Quorum Configuration <<<

    Every two-node cluster requires at least one quorum device. By 
    default, scinstall will select and configure a shared SCSI quorum 
    disk device for you.

    This screen allows you to disable the automatic selection and 
    configuration of a quorum device.

    The only time that you must disable this feature is when ANY of the 
    shared storage in your cluster is not qualified for use as a Sun 
    Cluster quorum device. If your storage was purchased with your 
    cluster, it is qualified. Otherwise, check with your storage vendor 
    to determine whether your storage device is supported as Sun Cluster 
    quorum device.

    If you disable automatic quorum device selection now, or if you 
    intend to use a quorum device that is not a shared SCSI disk, you 
    must instead use scsetup(1M) to manually configure quorum once both 
    nodes have joined the cluster for the first time.

    Do you want to disable automatic quorum device selection (yes/no) [no]?  yes



    Is it okay to create the new cluster (yes/no) [yes]?  

    During the cluster creation process, sccheck is run on each of the 
    new cluster nodes. If sccheck detects problems, you can either 
    interrupt the process or check the log files after the cluster has 
    been established.

    Interrupt cluster creation for sccheck errors (yes/no) [no]?  yes


  Cluster Creation

    Log file - /var/cluster/logs/install/scinstall.log.3361

    Testing for "/globaldevices" on "node1" ... done
    Testing for "/globaldevices" on "node2" ... done

    Checking installation status ... done

    The Sun Cluster software is installed on "node1".
    The Sun Cluster software is installed on "node2".

    Starting discovery of the cluster transport configuration.

    The following connections were discovered:

        node1:bge2  switch1  node2:bge2
        node1:bge3  switch2  node2:bge3

    Completed discovery of the cluster transport configuration.

    Started sccheck on "node1".
    Started sccheck on "node2".

    sccheck completed with no errors or warnings for "node1".
    sccheck completed with no errors or warnings for "node2".


    Configuring "node2" ... done
    Rebooting "node2" ...

I don't know what application you are installing, but you're not even halfway there. So don't get frustrated too early; try to get interested in it and treat it as a valuable challenge. It really is. I'll try to help you if I can, and others here will do the same. Please share some feedback with us, as we do this in the hope of learning something ourselves. Looking forward to hearing from you!
# 9  
Old 10-27-2008
One more thing came to my mind overnight: it is important to note that SC uses regular IP for inter-node communication. The addresses and subnets are predefined (they may be changed), and the default is 172.16.0.0/21.
As you can see, the netmask is rather short, so a large address range is used. These subnets behave like any other IP networks on the system: they show up in the system's routing table and may "hide" other routes. Please refer to:
Private Network (Sun Cluster Software Installation Guide for Solaris OS) - Sun Microsystems
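
To see how much address space that covers and whether it collides with anything you already route, here is a small sketch using Python's standard ipaddress module. The 172.16.0.0/21 value is the one quoted above; the site networks in the example are made up for illustration, and conflicting_networks is just a helper name I picked:

```python
import ipaddress

# Private interconnect range as quoted in the post (it may be changed).
PRIVATE_NET = ipaddress.ip_network("172.16.0.0/21")


def conflicting_networks(other_networks, private_net=PRIVATE_NET):
    """Return the networks from other_networks that overlap private_net."""
    return [
        net for net in other_networks
        if ipaddress.ip_network(net).overlaps(private_net)
    ]


if __name__ == "__main__":
    # Netmask and size of the private range.
    print(PRIVATE_NET.netmask, PRIVATE_NET.num_addresses)
    # First and last address covered by the range.
    print(PRIVATE_NET[0], "-", PRIVATE_NET[-1])
    # Hypothetical site networks, for illustration only.
    print(conflicting_networks(["172.16.4.0/24", "172.16.8.0/24", "10.0.0.0/8"]))
```

Any network this reports as conflicting is a candidate for the "hidden route" problem, so either renumber it or change the cluster's private range during installation.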
# 10  
Old 11-17-2008
Hi buddy,

Sorry for replying so late. I have gone through your whole checklist and am trying to figure out what is blocking me. I also ran into a resource problem in my test lab after our last exchange (one of the nodes went to another team).

Anyway, I've got both servers now and will try to configure everything from scratch. :)

Thanks for all your guidance. I will post my results soon.

Cheers