Sophos Firewall: Limiting downtime when re-imaging devices in active-passive HA

Disclaimer: This information is provided as-is for the benefit of the Community. Please contact Sophos Professional Services if you require assistance with your specific environment.


Hi Community,

This thread outlines how to limit downtime when re-imaging Sophos Firewall devices configured in HA.

Warning: This process will still result in an outage. Please plan accordingly to account for this downtime.

Preparation

First, download the latest firmware installation image from the Sophos Licensing Portal: https://www.sophos.com/mysophos

If the firmware version you are looking for is unavailable via the Sophos Licensing Portal and you prefer to stay on that version, please contact Sophos Technical Support to request the image. This may take at least two business days.

Note: Sophos Support recommends using the latest firmware version. More info: Sophos Firewall Release Notes & News

This article applies to active-passive HA.

The steps below assume the following:

  • Node 1 is the current HA primary Node, and it is also the initial HA primary Node, which carries a purchased license subscription.
  • Node 2 is the current HA auxiliary Node.

To check if a Sophos Firewall in active-passive HA is the initial HA primary Node:

  1. Log on to the Sophos Firewall SSH terminal using the admin account. Once authenticated, you’ll be presented with the Sophos Firewall console menu.
  2. Go to 5. Device Management > 3. Advanced Shell
  3. Run the following commands:
    • nvram get "#li.serial"
      • The serial number of the Sophos Firewall is then displayed
    • nvram get "#li.master"
      • If the output of nvram get "#li.master" is YES, as shown below, the Sophos Firewall is the initial HA primary Node:
        XG210_WP02_SFOS 19.0.1 MR-1-Build365# nvram get "#li.master"
        YES
    • Note: The command nvram get "#li.master" is only used on Sophos Firewalls in active-passive HA, to identify which node is the initial primary node. There’s no concept of an initial primary node in active-active HA.
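
As a rough illustration of the check above, the following shell sketch classifies the captured output of nvram get "#li.master". The function name is_initial_primary is hypothetical, not a Sophos command. The sketch assumes only what is stated above: an output of YES identifies the initial HA primary node. Any other output is treated as unconfirmed.

```shell
# Hypothetical helper (not a Sophos command): succeeds only when the
# captured output of `nvram get "#li.master"` is exactly YES.
is_initial_primary() {
    # $1: raw output of the nvram command.
    # Strip spaces, newlines, and carriage returns before comparing.
    answer=$(printf '%s' "$1" | tr -d '[:space:]')
    [ "$answer" = "YES" ]
}
```

For example, with the command output saved in a variable named out, is_initial_primary "$out" && echo "initial HA primary node" prints the message only for YES. The helper is intended for a workstation that captured the output over SSH; whether the same utilities are available in the Advanced Shell is an assumption that should be verified.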

Scenario 1: reimage HA auxiliary node

  1. If Sophos Firewall is registered to Sophos Central, it is recommended to deregister it. Then go to Sophos Central to make sure Sophos Firewall is removed.
  2. Download the configuration backup from Node 1, and save it to the local computer.
  3. On Node 1 webadmin GUI, disable HA.
    • If Node 2 is connected to Node 1, once HA is disabled on Node 1, Node 2 will reboot with factory default settings, except for the admin password and peer administration IP.
    • Don't disable HA on Node 2 webadmin GUI.
  4. Reimage Node 2 to the same firmware version and firmware build number as Node 1. Here is the guide for reimage.
  5. Initialize Node 2, configure the WAN interface, and connect its cable so that Node 2 can access the Internet. Don't configure any LAN or DMZ interface yet.
  6. Ensure Node 2 has the same firmware version and build number as Node 1. See "Check the pattern and hotfix version" in the Appendix for details.
  7. Make sure the msync service is UNTOUCHED or STOPPED on both Node 1 and Node 2. See "Check msync service status" in the Appendix for details.
  8. Configure active-passive HA as per the product document "Configure active-passive HA using interactive mode".
    • Ensure Node 1 is configured as the primary node and Node 2 as the auxiliary node.
    • Note: An outage will occur.

Scenario 2: reimage HA primary node

  1. If Sophos Firewall is registered to Sophos Central, it is recommended to deregister it. Then go to Sophos Central to make sure Sophos Firewall is removed.
  2. Download the configuration backup from Node 1 and save it to the local computer.
  3. On Node 1 webadmin GUI, perform HA failover to make Node 2 the primary node.
    • Click "Switch to passive device".
    • Note: An outage will occur.
  4. Reimage Node 1 with the same firmware version and firmware build number as Node 2. Here is the guide for reimage.
  5. Initialize Node 1, configure the WAN interface, and connect its cable so that Node 1 can access the Internet. Don't configure any LAN or DMZ interface yet.
  6. Unplug all cables from Node 1, except the one connecting to your laptop.
  7. Restore configuration to Node 1.
  8. Cut traffic over from Node 2 to Node 1.
    • Note: A second outage will occur.
  9. Factory reset Node 2.
  10. Initialize Node 2, configure the WAN interface, and connect its cable so that Node 2 can access the Internet. Don't configure any LAN or DMZ interface yet.
    • Register Node 2 serial number if it has not been done.
  11. Ensure Node 2 has the same firmware version and build number as Node 1. See "Check the pattern and hotfix version" in the Appendix for details.
  12. Make sure the msync service is UNTOUCHED or STOPPED on both Node 1 and Node 2. See "Check msync service status" in the Appendix for details.
  13. Configure active-passive HA as per the product document "Configure active-passive HA using interactive mode".
    • Ensure Node 1 is configured as the primary node and Node 2 as the auxiliary node.
    • Note: A third outage will occur.

Steps 9 and 10 prepare Node 2 for HA. They are not necessary if you can properly reconfigure the IP addresses of all interfaces on Node 2.

Scenario 3: reimage both HA nodes and upgrade them to the latest firmware

  1. If Sophos Firewall is registered to Sophos Central, it is recommended to deregister it. Then go to Sophos Central to make sure Sophos Firewall is removed.
  2. Download the configuration backup from Node 1, and save it to the local computer.
  3. On Node 1 webadmin GUI, perform HA failover to make Node 2 the primary node.
    • Click "Switch to passive device".
    • Note: An outage will occur.
  4. Reimage Node 1 with the latest firmware. Here is the guide for reimage.
  5. Initialize Node 1, configure the WAN interface, and connect its cable so that Node 1 can access the Internet. Don't configure any LAN or DMZ interface yet.
  6. Unplug all cables from Node 1 except the one connecting to your laptop.
  7. Restore configuration to Node 1.
  8. Cut traffic over from Node 2 to Node 1.
    • Note: A second outage will occur.
  9. Reimage Node 2 with the same firmware as Node 1.
  10. Initialize Node 2, configure the WAN interface, and connect its cable so that Node 2 can access the Internet. Don't configure any LAN or DMZ interface yet.
    • Register Node 2 serial number if it has not been done.
  11. Make sure Node 2 has the same firmware version and firmware build number as Node 1. See "Check the pattern and hotfix version" in the Appendix for details.
  12. Make sure the msync service is UNTOUCHED or STOPPED on both Node 1 and Node 2. See "Check msync service status" in the Appendix for details.
  13. Configure active-passive HA as per the product document "Configure active-passive HA using interactive mode".
    • Make sure Node 1 is configured as the primary node and Node 2 as the auxiliary node.
    • Note: A third outage will occur.

Scenario 4: rebuild HA after RMA of auxiliary node

Assume the following:

  • Primary Node 1 is running as HA standalone.
  • Auxiliary Node 2 has been returned under RMA, and the RMA replacement has arrived.

Here are the steps to rebuild HA after RMA of the auxiliary node:

  1. Initialize the RMA replacement, configure the WAN interface, and connect its cable so that it can access the Internet. Don't configure any LAN or DMZ interface yet.
  2. Register it on the Sophos Licensing Portal, https://www.sophos.com/mysophos, and activate a 30-day trial license on it.
  3. Make sure the appliance has the same firmware version and firmware build number as Node 1. See "Check the pattern and hotfix version" in the Appendix for details.
  4. Disable HA on Node 1 if you haven't already.
  5. Make sure the msync service is UNTOUCHED or STOPPED on both the RMA replacement and Node 1. See "Check msync service status" in the Appendix for details.
  6. Configure active-passive HA as per the product document "Configure active-passive HA using interactive mode".

Scenario 5: rebuild HA after RMA of the primary node

Assume the following:

  • Primary Node 1 has been returned under RMA, and the RMA replacement has arrived.
  • Auxiliary Node 2 is running as HA standalone.

Here are the steps to rebuild HA after the RMA of the primary node:

  1. If Sophos Firewall is registered to Sophos Central, it is recommended to deregister it. Then go to Sophos Central to make sure Sophos Firewall is removed.
  2. Download the configuration backup from Node 2 and save it to the local computer.
  3. Reimage the RMA replacement with the same firmware version and build number as Node 2. Here is the guide for reimage.
  4. Initialize the RMA replacement, configure the WAN interface, and connect its cable so that it can access the Internet. Don't configure any LAN or DMZ interface yet.
  5. Restore the configuration to the RMA replacement.
  6. Cut traffic over from Node 2 to the RMA replacement.
    • Note: An outage will occur.
  7. Factory reset Node 2.
  8. Initialize Node 2, configure the WAN interface, and connect its cable so that Node 2 can access the Internet. Don't configure any LAN or DMZ interface yet.
    • Register the Node 2 serial number if it hasn’t been done.
  9. Ensure Node 2 has the same firmware version and build number as the RMA replacement. See "Check the pattern and hotfix version" in the Appendix for details.
  10. Make sure HA is disabled on the RMA replacement.
  11. Ensure the msync service is UNTOUCHED or STOPPED on both the RMA replacement and Node 2. See "Check msync service status" in the Appendix for details.
  12. Configure active-passive HA as per the product document "Configure active-passive HA using interactive mode".
    • Ensure the RMA replacement is configured as the primary node and Node 2 as the auxiliary node.
    • Note: A second outage will occur.
  13. Transfer the license from Node 1 to the RMA replacement. Here is the KBA for license transfer.

Steps 7 and 8 prepare Node 2 for HA. They are not necessary if you can properly reconfigure the IP addresses of all interfaces on Node 2.

Appendix

Check the pattern and hotfix version

  • Run the following Advanced Shell command:
    cish -c "system diag sh ver"
  • Firmware Version and Firmware Build must match on both HA nodes.
  • Here is an example output:
    XGS2100_RL01_SFOS 19.0.1 MR-1-Build365# cish -c "system diag sh ver"

    Serial Number: X21010M00000000
    Device-Id: 34c499dbbff39fa000000000000000000000
    Appliance Model: XGS2100
    Firmware Version: SFOS 19.0.1 MR-1-Build365
    Firmware Build: 365
    Firmware Loader version:
    HW version: RL01
    BIOS Version: Ver-V111 Rev-5.14
    NPU version: rootfs-2022.0526-1509-670-v19.0.Dev.040.Akamaru
    Uboot version: U-Boot 2019.10-10.3.8.0-1 (Jun 13 2021 - 19:39:53 -0400)
    CPLD version: Module version
    AMDA0202-0001 0x05000008
    AQR version: -
    Config DB version: 19.005
    Signature DB version: 19.005
    Report DB version: 19.005
    Web Proxy version: compiled
    SMTP Proxy version: 1.0
    POP/IMAP Proxy version: 1.0.0.3.4
    Logging Daemon version: 0.0.0.17
    AP Firmware: 11.0.019
    ATP: 1.0.0437
    Avira AV: 1.0.420180
    Authentication Clients: 1.0.0020
    Geoip ip2country DB: 2.0.014
    IPS and Application signatures: 18.19.76
    Sophos Connect Clients: 2.2.000
    odt: 1.0.006
    RED Firmware: 3.0.008
    Sophos AntiSpam Interface: 1.0.238
    Sophos AV: 1.0.18191
    SSLVPN Clients: 1.0.009
    Hot Fix version: 5
    Hotfix tag: HF092122.1

    XGS2100_RL01_SFOS 19.0.1 MR-1-Build365#
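
To compare the two nodes without reading the output by eye, the comparison can be sketched as a small shell helper run on an admin workstation. This is only an illustration: the function names get_field and same_firmware and the file names are hypothetical, and the sketch assumes the output of cish -c "system diag sh ver" from each node has been saved to a text file in the format shown above.

```shell
# Hypothetical helpers (not Sophos commands), meant for an admin workstation.

# Print the value of one labelled field from a saved output file.
get_field() {
    # $1: field label, e.g. "Firmware Version" or "Firmware Build"
    # $2: file holding saved `cish -c "system diag sh ver"` output
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" "$2" | head -n 1 | tr -d '\r'
}

# Succeed only if both files report the same firmware version and build.
same_firmware() {
    # $1, $2: saved output files, one per HA node
    v1=$(get_field "Firmware Version" "$1")
    v2=$(get_field "Firmware Version" "$2")
    b1=$(get_field "Firmware Build" "$1")
    b2=$(get_field "Firmware Build" "$2")
    # Empty values mean the field was not found, so treat that as a mismatch.
    [ -n "$v1" ] && [ -n "$b1" ] && [ "$v1" = "$v2" ] && [ "$b1" = "$b2" ]
}
```

With the two outputs saved as node1.txt and node2.txt (assumed names), same_firmware node1.txt node2.txt && echo "firmware matches" reports a match only when both fields agree. Other fields, such as pattern or hotfix versions, are deliberately not compared, because the text above only requires Firmware Version and Firmware Build to match.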

Check msync service status

  • Run the following Advanced Shell command:
    service -S | grep msync
  • For HA setup, the msync service must be STOPPED or UNTOUCHED.
  • Here is the expected output:
    XGS2100_RL01_SFOS 19.0.1 MR-1-Build365# service -S | grep msync
    msync UNTOUCHED
    XGS2100_RL01_SFOS 19.0.1 MR-1-Build365#
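
The same check can be expressed as a small shell helper, shown here as a sketch only. The function name msync_ok is hypothetical, and the sketch assumes the output of service -S | grep msync has been captured as text with the service name in the first column and its state in the second, as in the expected output above.

```shell
# Hypothetical helper (not a Sophos command): succeeds only when the
# captured `service -S | grep msync` output shows STOPPED or UNTOUCHED.
msync_ok() {
    # $1: captured output text
    # Take the state column from the first line whose name is exactly msync.
    state=$(printf '%s\n' "$1" | tr -d '\r' | awk '$1 == "msync" { print $2; exit }')
    case "$state" in
        STOPPED|UNTOUCHED) return 0 ;;
        *) return 1 ;;   # any other state, or no msync line at all
    esac
}
```

For example, with the captured text in a variable named out, msync_ok "$out" || echo "msync is not ready for HA" prints a warning for any state other than the two listed above. Run the check against the output from both nodes before configuring HA.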

Edition History

  • 2022-10-18, removed content of SFOS v17; renamed some scenarios; updated URLs; renamed XG firewall to Sophos Firewall
  • 2022-02-06, updated URLs
  • 2021-10-18, added scenario "rebuild HA after RMA of auxiliary node"
  • 2021-09-07, fixed typo
  • 2021-02-17, major update
  • 2020-02-07, first edition

[edited by: emmosophos at 12:15 AM (GMT -8) on 11 Nov 2023]