This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

New RMA unit in HA pair

We've just received an RMA replacement for an SG430 in an HA pair. Everything I've read says to just remove the old unit, connect all interfaces and power on, and HA will then automatically resync both units.

I've done that, but the new unit is not being recognised or syncing and only shows eth0 as physically up. I spoke to Sophos support, who stipulate that the new RMA unit needs HA configuring regardless?

There is a difference in firmware (current unit 9.501, RMA unit 9.610), so I guess we need to update the current unit to match. But do we then still need to configure the new unit for HA, or should it just resync as suggested?

Any advice would be welcome.



  • Hi Gregg - your first post - welcome to the  UTM Community!

    Yes, you will want to have the RMA unit at the same version as the current Master.  I assume you meant that your Master is at 9.510 and the RMA unit is at 9.601.  The least disruptive approach would be to re-image the RMA unit.  The 9.510 ISO is available at UTM Support Downloads.

    If the RMA unit isn't immediately taken over and re-syncing started by the current Master, first check that High Availability is still activated on the current Master.  If so, then try a Factory Reset on the RMA unit.

    You definitely don't want to configure anything in WebAdmin on the RMA unit.

    Please share your results here.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob,

    Thanks for answering. I patched the current master to 9.601 to match the RMA unit and checked that HA was still activated and configured, which it is. I then tried a factory reset on the RMA unit, but there is still no sign of connectivity or syncing. I have changed the cable used on the HA interfaces just in case, but no joy.

    Is it worth turning HA off and on at the master unit?

    This is getting fairly critical now, so any other advice would be appreciated.

    thanks - Gregg

  • Hi Gregg,

    What's the status of HA in WebAdmin? Is the replaced node marked as dead?

    Best regards

    Alex


  • Hi Alex,

     

    I bit the bullet and did as suggested by Sophos support: logged in to the RMA unit and turned on HA (it was off by default), and bingo, now all synced and functional. I thought that this process was supposed to be automated and didn't require any config on the replacement RMA unit, but it obviously did in this case. I also noticed this in the Sophos "UTM Disaster Recovery" document:

    1. On the auxiliary node to be added Eth3 should be already configured for Automatic Configuration. If not then you may want to do a factory reset and then follow the steps below:
      • Access the appliance, configure the HA operation mode for Automatic Configuration

    So I don't know if something has changed in this respect?

    thanks - Gregg

  • Gents,

    I am curious about what happened and, gloriously, I am awaiting a new UTM to replace a dead one in an HA config, so I can try to reproduce what I think happened here in a couple of days.

    As GreggB said, it should be just plug it in and stand back and wait for it to work.   The only gotchas I know of are: 

    1. Different firmware version in the new box.  Must fix first. 
    2. Forgetting to delete the dead node from the running master.  Must remember next time.  
    3. Occasionally, database corruption on the master causes the new node to keep syncing for ever and ever and ever.  This is so rare that I only check for it if the syncing never stops.  Then follow the process for fixing corrupt databases in an HA cluster: stop repctl on both, reset the DB on both, start repctl on both.

    I suspect what happened here was the dead node was not deleted from the running master.  BUT configuring the new box as a master will cause a different conversation that goes like this:

    • New Box: Hello I am a master (with no config or logs) submit to my will.  
    • Old Box:  I too am a master (with lots of config and logs we don't want to lose) submit to my will you scoundrel. 
    • Both:  Okay, let's do the "whose uptime is the longest" check because the biggest is always better and wins.....
    • Old Box:  Here is my enormous uptime. 
    • New Box: Dang nabbit, I am slave.   Please configure me, master. 
    • Old Box: I am quietly just going to forget about the dead node thing and replace it with this small-uptime upstart.
    • New Box: Syncing
    • You:  Phew that worked. 

    UPTIME is very very very important - because the master with the longest uptime wins.   Biggest is not always best.  Especially if you chop uptime to zero on the one you want to keep by rebooting it. 
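
The conversation above can be sketched as a tiny model. This is only an illustration of the "longest uptime wins" rule as Adrien describes it, not Sophos's actual election logic; `Node` and `elect_master` are hypothetical names, and the uptime figures are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    uptime_seconds: int   # time since last boot
    has_config: bool      # True if this box holds the config and logs we care about


def elect_master(a: Node, b: Node) -> Node:
    """Return the node with the longer uptime (ties go to the first node)."""
    return a if a.uptime_seconds >= b.uptime_seconds else b


# Long-running old box versus freshly powered-on replacement.
old_box = Node("old", uptime_seconds=90 * 24 * 3600, has_config=True)
new_box = Node("new", uptime_seconds=300, has_config=False)
print(elect_master(old_box, new_box).name)       # old

# The same old box just after a reboot: its uptime is now the smaller one.
rebooted_old = Node("old", uptime_seconds=60, has_config=True)
print(elect_master(rebooted_old, new_box).name)  # new
```

The second case is the one to avoid: under this rule, rebooting the unit that holds the configuration just before connecting a replacement that was configured as master could hand the election to the empty box.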

    So when I get the new box, I will (do all the other stuff but) 

    1. I will not delete the dead node. 
    2. I will connect the new box without configuring it.  This should fail because dead node is blocking adding it to the cluster.  
    3. Configure it as master. 
    4. Connect it a second time and see if this overrides the dead node block. 

    Will keep you posted.

    All the best, 

    Adrien. 

  • Here's what I give my clients, Adrien.  Note that 3a implies that your malfunctioning UTM should be disconnected.

    1. If needed, do a quick, temporary install so that the new device can download Up2Dates.
    2. Apply the Up2Dates to the same version as the current unit, do a factory reset and shut down.
    3. On the current UTM in use, on the 'Configuration' tab of 'High Availability':
       a. Disable and then enable Hot-Standby
       b. Select eth3 as the Sync NIC
       c. Configure it as Node_1
       d. Enter an encryption key (I've never found a need to remember it)
       e. Select 'Enable automatic configuration of new devices'
       f. I prefer to use 'Preferred Master: None' and 'Backup interface: Internal'
    4. Cable eth3 to eth3 on the new device.
    5. Cable all of the other NICs exactly as they are on the original UTM.
    6. Power up the new device and wait for the good news. 
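
Steps 1 and 2 above come down to making sure both units report the same firmware before cabling. A minimal sketch of that pre-flight comparison follows, assuming you have already read the two version strings off each unit by hand; `parse_version` and `firmware_advice` are hypothetical helpers, and the `major.minor` (optionally `-build`) format is an assumption based on the versions quoted in this thread.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '9.601' or '9.601-5' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in text.strip().replace("-", ".").split("."))


def firmware_advice(master: str, replacement: str) -> str:
    """Suggest the next step from the thread, given both firmware versions."""
    m, r = parse_version(master), parse_version(replacement)
    if m == r:
        return "match: factory-reset the replacement, cable it up and power on"
    if r < m:
        return "replacement is older: apply Up2Dates on it until it matches the master"
    return "replacement is newer: re-image it to the master's version or update the master"


print(firmware_advice("9.510", "9.601"))
```

Comparing parsed tuples rather than raw strings avoids ordering surprises such as "9.70" sorting after "9.601" as text. Note that "9.601" and "9.601-5" are treated as different here, so compare like with like.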

    If you did try the approach you mentioned above, please let us know how it went.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA