How to avoid a double reboot when doing a Sophos XG Firmware upgrade?

Hi - I have two 550 firewalls in HA. Years ago, I think I uploaded a .sig firmware file and upgraded that way, which ended up rebooting both firewalls at once. Since then I just wait until a popup window says there's an update, then download and install it without doing a manual upload.

We're not ready to move to v19 of the firmware and would like to move from 18.5.2 to 18.5.4, which does show up on the firmware tab in the GUI, as does the v19 firmware. It doesn't seem like it should make any difference to the HA upgrade process whether I click install on the 18.5.4 firmware or on v19. I would expect the firewalls to reboot one at a time.

To me, the documentation isn't really clear on what actions will cause both firewalls to reboot at once. I'm used to a few seconds of outage for these tasks and want to avoid a 25-minute outage.

Can anyone clarify this for me? I don't want to be surprised by a large outage.

Thanks!



  • Hi Moltron5k,

    As stated in the link provided by Bharat, the HA devices upgrade one at a time, so it won’t cause an outage.

    When you upgrade an HA device, the process is as follows:

    1. The primary device (device A) upgrades the secondary device (device B).
    2. Device B runs the new firmware and takes control of the network. It's now the primary device and device A is the secondary.
    3. Device A then upgrades and runs the new firmware. It's still the secondary device, but if you have configured the other device as a preferred primary, then the cluster will fail over.
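
The three steps above can be sketched as a small simulation. This is only an illustration of the role swap, not Sophos code: the `Node` class, the `ha_upgrade` function, and the `preferred_primary` parameter are hypothetical names, and the sketch assumes that "the other device" in step 3 means device A.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    firmware: str
    role: str  # "primary" or "secondary"


def ha_upgrade(a, b, new_fw, preferred_primary=None):
    """Simulate the HA upgrade; `a` starts as primary, `b` as secondary.

    Returns a list of human-readable events. Hypothetical model only.
    """
    events = []
    # Step 1: the primary (A) upgrades the secondary (B).
    b.firmware = new_fw
    events.append(f"{b.name} upgraded to {new_fw}")
    # Step 2: B takes control of the network and the roles swap.
    a.role, b.role = "secondary", "primary"
    events.append(f"{b.name} is now primary")
    # Step 3: A upgrades while B carries the traffic.
    a.firmware = new_fw
    events.append(f"{a.name} upgraded to {new_fw}")
    # Assumption: a preferred primary set to A triggers a failover back to A.
    if preferred_primary == a.name:
        a.role, b.role = "primary", "secondary"
        events.append(f"failover: {a.name} is primary again")
    return events


device_a = Node("device A", "18.5.2", "primary")
device_b = Node("device B", "18.5.2", "secondary")
for event in ha_upgrade(device_a, device_b, "18.5.4"):
    print(event)
```

At every point in this model exactly one node holds the primary role, which is why the process is expected to cost only a brief failover rather than a full outage.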

    Erick Jan
    Community Support Engineer | Sophos Technical Support

  • Thanks for the responses.  I think I need to rephrase the question.  Is there an upgrade scenario where both firewalls reboot at once while in HA mode?  This happened to me while in HA mode and I just can't remember what steps led to that situation.  I'm pretty sure firmware was manually uploaded in that case.  It was a few major versions back so maybe improvements have been made to eliminate this scenario when in HA mode.

  • Is there an upgrade scenario where both firewalls reboot at once while in HA mode? 

    During an HA firmware upgrade, the two Sophos XG devices do not reboot at the same time. The HA firmware upgrade follows the process described in the link shared earlier.

    So it won’t cause an outage. However, you might have faced the issue because you manually uploaded firmware downloaded from the Sophos Licensing Portal. Sophos releases firmware updates through a phased mechanism, described here, and a manual upload may not follow that phased release.

    https://community.sophos.com/sophos-xg-firewall/b/blog/posts/firewall-firmware-release-process-and-timeline

    You mentioned that both firewalls rebooted at once during the manual update, which is not possible without human error. Ideally you would have opened a ticket with Sophos Support on the day of the issue to find the root cause.

    It would be great if you create a ticket with the Sophos team for the HA update; they will boot the firewall with the latest firmware for you.

    Thanks and regards

    "Sophos Partner: Networkkings Pvt Ltd".

  • Actually there is.

    See: https://docs.sophos.com/releasenotes/output/en-us/nsg/sf_190_rn.html

    NC-94863 CM HA zero downtime upgrade isn't supported if the firmware upgrade is scheduled on Sophos Central. When a scheduled firmware upgrade is run from Sophos Central, both HA devices restart at the same time. Upgrade the HA devices from the firewall's web admin console to maintain zero downtime.

  • Thanks for that comment. It adds to the completeness of the thread, in case anyone searches for this.

  • Hello Lucar Toni,

    Sorry, but what is the problem with programming CFM to do the following:

    - first restart the slave node
    - find out if the slave node is fully functional after the restart and with the upgraded version of SFOS
    - restart the master node and migrate all active connections to the original slave node
    - find out if the original master node is fully functional after the restart and with the upgraded version of SFOS
    - if the original master node is preferred as master node, migrate all active connections to this node and declare it as master node again.
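
The sequence proposed above could be sketched as a health-gated rolling upgrade. This is a minimal illustration, not how CFM or SFOS actually works: `rolling_upgrade`, `upgrade_and_restart`, and `is_healthy` are hypothetical names supplied by the caller, and the sketch migrates connections just before restarting the original master node.

```python
from typing import Callable, List


def rolling_upgrade(
    master: str,
    slave: str,
    upgrade_and_restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    prefer_original_master: bool = False,
) -> List[str]:
    """Upgrade two HA nodes one at a time and return the steps taken.

    Hypothetical sketch: both callbacks stand in for real device operations.
    """
    steps = []
    # Restart the slave node first, on the upgraded firmware.
    upgrade_and_restart(slave)
    steps.append(f"restart {slave}")
    # Only continue if the slave came back fully functional.
    if not is_healthy(slave):
        steps.append(f"stop: {slave} unhealthy, {master} keeps the traffic")
        return steps
    # Move active connections to the original slave, then restart the master.
    steps.append(f"migrate connections to {slave}")
    upgrade_and_restart(master)
    steps.append(f"restart {master}")
    if not is_healthy(master):
        steps.append(f"stop: {master} unhealthy, {slave} keeps the traffic")
        return steps
    # Optionally hand the master role back to the original master node.
    if prefer_original_master:
        steps.append(f"migrate connections back to {master}")
    return steps
```

The point of the health checks is that the second node is never restarted unless the first one has come back, so both nodes are never down at the same time.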

    Is this really such a problem to program CFM?!?

    After all, UTM v9 has been able to do this for more than 15 years...

    Regards

    alda

  • Actually, it is not a problem in the code. Instead, the CM triggers the wrong command on the firewall. So that is something that should be fixed soon.

    SFOS can do the same: it triggers the job on the firewall to upgrade (the same as UTM did). On SFOS, CM triggers the wrong job, which causes this problem. Changing this code is subject to certain limitations, which takes time (you cannot change something like that overnight).

  • What do you mean by "which should be fixed soon"?

    A month, a quarter, a year...?

    Regards

    alda

  • As far as I know, this will be resolved soon.
