High Availability Active-Passive Cluster: can't create cluster

Hi All,

Due to the data disk filling up, I'm in the process of creating a new Active-Passive cluster, so far with no success.

I've built the two nodes (UTM1 and UTM2); the only configuration I've done is to name them. When I try to create the cluster on UTM1 by setting it to node ID 1, specifying the correct NIC (eth2) and setting a password, it saves the change but then states "There are no master/slaves nodes connected to the system at the moment."

Nothing appears in the H/A log.

If I try to create the cluster on UTM2 instead, it recognises that there is one node in the cluster (UTM2, which is the master). If I then try to add UTM1, UTM1 shuts down, but the H/A log shows the following error:

2016:07:06-10:19:09 l5utm2-new-2 ha_mode[24080]: [ ok ]
2016:07:06-10:19:09 l5utm2-new-2 ha_mode[24080]: master done (started at 10:19:05)
2016:07:06-10:19:21 l5utm2-new-2 ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 22 21.575" name="Monitoring interfaces for link beat: eth0"
2016:07:06-10:21:38 l5utm2-new-2 ha_daemon[23717]: id="38A1" severity="warn" sys="System" sub="ha" seq="M: 23 38.870" name="Got misformed HA message type = 9 len = 64, msg.len = 28"
2016:07:06-10:21:38 l5utm2-new-2 ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 24 38.870" name="Autojoin of 198.19.250.113 granted! Searching for unused node ID..."
2016:07:06-10:21:38 l5utm2-new-2 ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 25 38.870" name="Found unused node id 1!"
2016:07:06-10:21:38 l5utm2-new-2 ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 26 38.870" name="New node 1"
2016:07:06-10:21:50 l5utm2-new-2 ha_daemon[23717]: id="38A1" severity="warn" sys="System" sub="ha" seq="M: 27 50.953" name="Got misformed HA message type = 9 len = 64, msg.len = 28"
2016:07:06-10:21:50 l5utm2-new-2 ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 28 50.953" name="Autojoin of 198.19.250.193 granted! Searching for unused node ID..."
2016:07:06-10:21:50 l5utm2-new-2 ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 29 50.953" name="198.19.250.193 cant join HA system, no free node id available!"
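For anyone reading a log like this, here is a minimal Python sketch that pulls the autojoin events out of the ha_daemon lines. It assumes only the key="value" layout visible in the excerpt above; the function names (parse_ha_line, summarize_autojoin) and the SAMPLE_LOG constant are made up for illustration and are not part of any Sophos tooling.

```python
import re

# Matches the key="value" pairs seen in the ha_daemon lines above.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')
# Matches a dotted-quad IPv4 address inside the message text.
IP_RE = re.compile(r'\b(\d{1,3}(?:\.\d{1,3}){3})\b')

# Three lines copied from the log excerpt in this post (timestamps trimmed).
SAMPLE_LOG = """
ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 24 38.870" name="Autojoin of 198.19.250.113 granted! Searching for unused node ID..."
ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 28 50.953" name="Autojoin of 198.19.250.193 granted! Searching for unused node ID..."
ha_daemon[23717]: id="38A0" severity="info" sys="System" sub="ha" seq="M: 29 50.953" name="198.19.250.193 cant join HA system, no free node id available!"
"""


def parse_ha_line(line):
    """Return the key/value fields of one ha_daemon log line as a dict."""
    return dict(FIELD_RE.findall(line))


def summarize_autojoin(lines):
    """Return (granted, refused): peer IPs granted autojoin and peer IPs
    later refused because no node ID was free."""
    granted, refused = [], []
    for line in lines:
        name = parse_ha_line(line).get("name", "")
        match = IP_RE.search(name)
        if not match:
            continue  # not an autojoin-related message
        if name.startswith("Autojoin of") and "granted" in name:
            granted.append(match.group(1))
        elif "no free node id available" in name:
            refused.append(match.group(1))
    return granted, refused


if __name__ == "__main__":
    granted, refused = summarize_autojoin(SAMPLE_LOG.strip().splitlines())
    print("granted:", granted)
    print("refused:", refused)
```

Run against the excerpt, this lists join requests from two different addresses (198.19.250.113 and 198.19.250.193), with the second one refused for lack of a free node ID.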

Have I missed a step somewhere?

Thanks in advance!



This thread was automatically locked due to age.
  • Hi Anth.

    The process should be (copied from https://community.sophos.com/products/unified-threat-management/f/52/t/30103):

    1) Remove the HA cluster config from the main node (no need for a restart), just to start fresh
    2) Factory reset the slave UTM or reinstall it (make sure both nodes are at the same patch level)
    3) Set the main UTM as the master HA node
    4) Add the secondary node to the HA

    Just make sure that your slave node has a lower uptime than your master node and that no configuration, except for the HA part, is done on the slave node. Basically, reset it to factory, reboot a few times and then try joining it to the HA again.

    Regards - Giovani

  • The HA connection is usually on eth3, but I believe an alternative interface should work now.  You should do NO configuration on the Slave node; just do the Factory Reset as Giovani notes.  In the Slave VM, make the Ethernet connections before you power it up: it needs the same connections as the Master, and its sync NIC must be connected to the sync NIC of the Master VM. This is what Giovani implies with step 4.

    In fact, I don't know why you couldn't expand the disk space for the Slave node, turn it back on, allow it to sync, force a failover to that node and then expand the disk in the first node.  If that's not possible, just follow Giovani's list with a new VM and then do it again after the new node is synced and READY.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA