Replacing faulty SG210 in HA setup

Hi,

I have a set of SG210 units running UTM 9.510-5 firmware with an active subscription.

Recently one of the SG210 units developed a problem and we RMA'd it. A replacement unit was sent to us, but with a higher firmware version (9.705). I checked the Sophos UTM download page, and it seems UTM 9.510-5 is no longer available for download.

May I know the correct procedure for joining the replacement unit back to the HA cluster?

1. Backup config file from existing working SG210
2. Go to MyUTM, open the license for the old faulty unit and change the serial number to that of the new unit
3. Go to High Availability setting in the existing working SG210 and change the operation mode to Off
4. Upgrade existing SG210 to same firmware as the replacement unit (downtime expected)
5. Connect the HA ports for both units
6. Configure HA setting at existing unit
7. Connect the WAN and LAN port of replacement unit

Are the above steps correct?

Thanks.

Patrick.

  • Hi Patrick.

    You won't have to change the serial number in MyUTM, because your license is normally included in the backup.

    First of all, I would advise you to check whether the replacement unit has the same hardware revision as the one in production. We currently sometimes receive RMA devices for HA clusters where the hardware revision does not match, so check this here (Sophos Firewall, UTM, AP, RED: Find the revision number).

    There is another option to bring your cluster back into production, but it carries the risk of having to upgrade the running unit first:

    1. Delete the faulty device from your HA cluster
    2. Upgrade your running device to the same version as the RMA replacement unit
    3. Connect the replacement unit and your cluster should be back again

    Another option would be:

    1. Ask support to provide the needed firmware. They provided one for me in the past.
    2. I would still recommend updating afterwards.

    Regards,

    Thomas


    Sophos Gold Partner
    4TISO GmbH, Germany
    If a post solves your question click the 'Verify Answer' link.
  • Another option would be:

    1. Import the backup to the new/replacement device and check the configuration on the new device

    2. Power up the new device and move the cables from the old running device to it

    3. Upgrade the old device and rebuild the cluster


    Dirk

    Sophos Solution Partner since 2003
    If a post solves your question, click the 'Verify Answer' link at this post.

  • Hi Patrick and welcome to the UTM Community!

    The solutions proposed by Thom and Dirk are your choices.  Since your current firewall is running an ancient version, I bet Up2Dating it would cause a lockup with an out-of-space message.  I would use Thom's approach with two additions:

    • First, do a quick install on the new device so that you can Up2Date it to 9.707.  Do a Factory Reset and then do Thom's first two steps.
    • Continue once you're comfortable that all is well with the new device in place.
    • Instead of Up2Dating the current device, re-image it with the 9.707 ISO.  After you've re-imaged it, power it down, connect all the Ethernet cables for HA and power it up.

    Cheers - Bob

    PS You can come to the Community to see if an Up2Date can be trusted.  You're more secure running the newest, trusted version.

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Thanks guys, I may go with Dirk's way, as it may have a shorter downtime?

    Questions:

    1. Can the config file from the existing version (9.510-5) be loaded onto the newer version (9.705 or newer)?
    2. Will the subscription be active once I power up the replacement unit and load the config file? Or do I need to download the license file from the MyUTM website and upload it to the replacement unit?

    Should my revised steps be as follows?

    1. Power up the replacement unit, update to latest firmware version.
    2. Load the config file to replacement unit, from existing unit
    3. Load the license file to replacement unit.
    4. Switch LAN and WAN cable to the replacement unit.
    5. Disable HA at the existing unit
    6. Update existing unit firmware to be the same as the replacement unit.
    7. Reset existing unit to factory default (required?)
    8. Connect HA cables between the 2 units and configure HA from replacement unit

    I've checked and both units are on the same hardware revision, so I guess that should be fine.

    Thanks.

    Patrick.

  • Hi Patrick.

    You have to remember that you will lose a lot of information if you only restore the backup to a new factory reset device:

    • Logs
    • Reporting data
    • Mail in spool, if not empty
    • Quarantine mail

    This is why I try to avoid this kind of replacement, and why I advised you to get this version from support. It is also why I would not go with Dirk's suggested way. My preferred way is to get the cluster up and running again. In your case there are three possibilities to bring up the cluster again without losing the data:

    1. Reimage the replacement device to the same version as the running one. Then re-enable HA.
    2. Install updates on the running one up to the version installed on the RMA device (9.705). Re-enable HA.
    3. Update the running device to the latest firmware. Bring the RMA device into production behind the running one and upgrade it to the latest firmware. Factory reset the RMA device. Bring up HA again.

    Of those options, only one gives you a chance to bring the cluster back into production without downtime. This is option 1.

    Option 2 takes the least time and effort for you. But there is downtime while installing updates on the running device, and a small risk of running into trouble during the updates.

    Option 3 is the one I never use, because it is option 2 with additional work. I would upgrade the cluster to the latest version once it is back in production.

    If you accept losing the data listed above, the steps would be:

    1. RMA device: power up and update to latest firmware version
    2. RMA device: restore backup and see if the restore is fine
    3. If the RMA device is OK, move the cables over to it from the running device. You are back in production.
    4. Prepare HA

    If something goes wrong, you can move the cables back.

    Then on the old device:

    1. Factory reset
    2. Temporary setup
    3. Update firmware to latest
    4. Factory reset
    5. Shutdown
    6. Do cabling for HA
    7. Power up
    8. HA will automatically get into production

    Regards,

    Thom


  • 1. Yes, you can load the config file onto devices running newer versions.

    2. The license is included in the config backup.

    3. Your steps are OK; Thom's steps for Option 3 are great.

    4. Yes, using Thom's way (Option 1 or 2) you keep your data.
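
    As a rough illustration of answer 1, the rule "a backup can be loaded onto the same or a newer version" can be sketched in Python. This is only a sketch: the function names are hypothetical, UTM offers no such API, and treating a missing build suffix (as in "9.705") as build 0 is an assumption made here for the comparison.

```python
import re
from typing import Tuple

# Hypothetical helper names for illustration; not part of any Sophos tool.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:-(\d+))?$")


def parse_utm_version(text: str) -> Tuple[int, int, int]:
    """Parse a string such as '9.510-5' or '9.705' into a comparable tuple."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, build = match.groups()
    # Assumption: a version quoted without a build suffix counts as build 0.
    return int(major), int(minor), int(build or 0)


def backup_fits_target(backup_version: str, target_version: str) -> bool:
    """Return True when the backup comes from the same or an older version."""
    return parse_utm_version(backup_version) <= parse_utm_version(target_version)


if __name__ == "__main__":
    # Versions taken from this thread: old unit 9.510-5, replacement 9.705-7.
    print(backup_fits_target("9.510-5", "9.705-7"))
```

    With the versions from this thread, the check passes for 9.510-5 onto 9.705-7 and fails in the opposite direction.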


    Dirk


  • Great posts, Thom and Dirk - very thoughtful.

    Patrick, you can also keep your logs, reporting and quarantine by backing them up and restoring with WinSCP:

    • Logs & Reporting in /var/log
    • Quarantine in /var/chroot-smtp/spool/quarantine
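
    If you prefer the command line to WinSCP, the same copy can be sketched as a small Python script that only builds and prints `scp -r` commands for the two paths above. The host name, the `root` login and the destination directory are placeholders, and the sketch assumes that SSH shell access is enabled on the unit and that the account may read those directories.

```python
import shlex
from typing import List

# Directories named in the post above.
DATA_PATHS = ["/var/log", "/var/chroot-smtp/spool/quarantine"]


def build_pull_commands(host: str, user: str, dest_dir: str) -> List[List[str]]:
    """Build one recursive scp command per directory; nothing is executed."""
    commands = []
    for remote_path in DATA_PATHS:
        commands.append(["scp", "-r", f"{user}@{host}:{remote_path}", dest_dir])
    return commands


if __name__ == "__main__":
    # Placeholder host, user and destination; adjust to your environment.
    for command in build_pull_commands("utm-old.example.net", "root", "./utm-backup"):
        print(shlex.join(command))
```

    Printing the commands first lets you review them before running anything against the production unit; the destination directory has to exist beforehand.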

    Indeed, if the SMTP Proxy is in use, you will want to temporarily remove the domains on the 'Routing' tab to prevent the proxy from receiving further mails.  Then use the Mail Manager to ensure that mails in the spool are delivered.  You will want to do this just before you power down the existing unit and connect and power up the new unit.

    Cheers - Bob

     
  • Hi guys,

    I was thinking of doing the following steps:

    1. Load UTM 9.510-5 onto the replacement unit (I emailed Sophos support to request the ISO)
    2. Backup config from working unit and load to replacement unit
    3. Connect replacement unit and sync for HA

    However, there was no response from Sophos for days, so I used the following steps:

    1. Update replacement unit to 9.705-7
    2. Backup config from existing unit and load into replacement unit
    3. Update existing unit (with some downtime)
    4. Connect replacement unit and sync for HA

    So far OK.

    Thanks guys.

    Patrick Law.