Sophos Firewall: ISP Gateway Fluctuation

Disclaimer: This information is provided as-is for the benefit of the Community. Please contact Sophos Professional Services if you require assistance with your specific environment.


Overview:

This Recommended Read describes troubleshooting steps and possible solutions for when the ISP gateway is down or the WAN link is fluctuating.

Scenario:

  1. ISP connection goes down.
  2. ISP connection/WAN interface fluctuation.  
  3. The WAN interface goes DEAD.

 Required information:

  • How many ISP/WAN connections are there?
  • What is the name of the ISP?
  • Are all the WAN ports fluctuating, or only a specific one?
  • Note down the IP of the interface.
  • Note down the gateway IP.
  • Note down the port number.
  • Is the ISP link directly connected to the Sophos WAN interface?

What to do: 

Step: 1 WAN Interfaces

  • Check the configuration of the WAN interface.
  • Check the failover rule of the WAN interface from Interfaces > WAN link manager.
  • To confirm the WAN interface status, review dgd.log in the /log directory via the advanced shell.
  • To reach the advanced shell, see Sophos Firewall: SSH to the firewall using PuTTY.

Sample Logs

Nonworking logs:

DEBUG Jan 07 18:31:35 [27153]: Ping Result for: 8.8.8.8
DEBUG Jan 07 18:31:35 [27153]: Ping: F
DEBUG Jan 07 18:31:35 [27153]: Current Status [GW(Airtel,Port4)] : Dead
DEBUG Jan 07 18:31:35 [27153]: Sleep for 60 Seconds
NOTICE Jan 07 18:31:35 [27153]: Actiontree, Live to Dead
NOTICE Jan 07 18:31:35 [27153]: Actiontree, executing: Live_To_Dead @Airtel

Working logs:

DEBUG Jan 07 18:32:35 [27153]: Initiating Ping: 8.8.8.8
DEBUG Jan 07 18:32:35 [27153]: GW (Airtel, Port4) : Waiting for reply
DEBUG Jan 07 18:32:35 [27153]: Success, Retrying(1) Ping : 8.8.8.8
DEBUG Jan 07 18:32:35 [27153]: GW (Airtel, Port4) : Waiting for reply
DEBUG Jan 07 18:32:35 [27153]: Current Status: Dead
DEBUG Jan 07 18:32:35 [27153]: Ping Result for: 8.8.8.8
DEBUG Jan 07 18:32:35 [27153]: Ping: S
DEBUG Jan 07 18:32:35 [27153]: Current Status [GW(Airtel,Port4)] : Live
DEBUG Jan 07 18:32:35 [27153]: Sleep for 60 Seconds
NOTICE Jan 07 18:32:35 [27153]: Actiontree, Dead to Live
NOTICE Jan 07 18:32:35 [27153]: Actiontree, executing: Dead_To_Live @Airtel

  • As the nonworking logs show, the firewall fails to ping the IPs defined in the failover rule, so the WAN gateway is marked as dead.
  • Ping the gateway IP of the affected ISP link to check connectivity.
  • Make sure the IP/subnets don't clash with other interfaces.
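When the link is fluctuating, listing only the state changes makes the pattern easier to see than reading the full log. The following is a minimal sketch to run on an admin workstation against a copy of dgd.log. The regular expression is an assumption based only on the sample NOTICE lines above, and the function name is hypothetical; real log lines may differ between firmware versions.

```python
import re

# Pattern assumed from the sample NOTICE lines in this article.
TRANSITION_RE = re.compile(
    r"^NOTICE (?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \[\d+\]: "
    r"Actiontree, executing: (?P<change>Live_To_Dead|Dead_To_Live) "
    r"@(?P<gateway>\S+)"
)


def find_transitions(lines):
    """Return (timestamp, gateway, change) for each gateway state change."""
    transitions = []
    for line in lines:
        match = TRANSITION_RE.match(line.strip())
        if match:
            transitions.append(
                (match["stamp"], match["gateway"], match["change"])
            )
    return transitions


sample = [
    "DEBUG Jan 07 18:31:35 [27153]: Ping: F",
    "NOTICE Jan 07 18:31:35 [27153]: Actiontree, Live to Dead",
    "NOTICE Jan 07 18:31:35 [27153]: Actiontree, executing: Live_To_Dead @Airtel",
    "NOTICE Jan 07 18:32:35 [27153]: Actiontree, executing: Dead_To_Live @Airtel",
]

for stamp, gateway, change in find_transitions(sample):
    print(stamp, gateway, change)
```

Many transitions for the same gateway within a short period point to fluctuation rather than a single outage.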

Note:

  • As a best practice, use the gateway address or a global DNS address (8.8.8.8) in the failover rule.
  • When creating a failover rule with two IPs (ISP gateway and global DNS), prefer the "OR" condition over "AND" so that Sophos Firewall has redundancy if either IP fails.

Step: 2 Negotiation Settings

  • Review the negotiation settings of the WAN interface using the command below. Replace Port1 with the port you are checking.

SFVUNL_SO01_SFOS 19.5.2 MR-2-Build624# ethtool Port1
Settings for Port1:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Link detected: yes

SFVUNL_SO01_SFOS 19.5.2 MR-2-Build624#
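For link troubleshooting, the two fields of most interest in this output are "Auto-negotiation" and "Link detected". If the output is saved to a file, a small script can pull them out. This is a minimal sketch that assumes the "Key: value" layout shown above; the function names and warning texts are hypothetical.

```python
def parse_ethtool(text):
    """Map each 'Key: value' line of ethtool output to a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon or without a value ("Settings for Port1:").
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def link_warnings(fields):
    """Return a list of findings worth following up with the ISP."""
    warnings = []
    if fields.get("Auto-negotiation", "").lower() == "off":
        warnings.append("auto-negotiation is off")
    if fields.get("Link detected", "").lower() == "no":
        warnings.append("no link detected")
    return warnings


sample = """Settings for Port1:
Speed: Unknown!
Duplex: Unknown! (255)
Auto-negotiation: off
Link detected: yes
"""

print(link_warnings(parse_ethtool(sample)))  # ['auto-negotiation is off']
```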

  • If auto-negotiation is disabled, check with the ISP whether that is their best practice or whether there is a specific reason to keep it disabled. We suggest enabling it so Sophos Firewall can detect a negotiation change and adjust accordingly.
  • If "Link detected" shows "no", check how the ISP link is connected to the Sophos WAN interface. We suggest removing any unmanaged switch between the ISP modem and the Sophos WAN interface so that negotiation changes reach both devices.
  • To further cross-check the physical connectivity of the Sophos ports and cabling:

Run the command below twice, a couple of minutes apart, to check for cabling issues. Replace Port1 with the port you are checking.

Instance 1:

SFVUNL_SO01_SFOS 19.5.2 MR-2-Build624# ifconfig Port1
Port1 Link encap:Ethernet HWaddr FA:16:3E:4E:11:2D
inet addr:192.168.53.165 Bcast:192.168.53.255 Mask:255.255.255.0
inet6 addr: fe80::f816:3eff:fe4e:112d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:688036 errors:548 dropped:88746 overruns:0 frame:0
TX packets:24 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:71195321 (67.8 MiB) TX bytes:2696 (2.6 KiB)

Instance 2:

SFVUNL_SO01_SFOS 19.5.2 MR-2-Build624# ifconfig Port1
Port1 Link encap:Ethernet HWaddr FA:16:3E:4E:11:2D
inet addr:192.168.53.165 Bcast:192.168.53.255 Mask:255.255.255.0
inet6 addr: fe80::f816:3eff:fe4e:112d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:688036 errors:567 dropped:88863 overruns:0 frame:0
TX packets:24 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:71195321 (67.8 MiB) TX bytes:2696 (2.6 KiB)

  • As the two outputs show, the error and drop counters are increasing. This suggests a faulty cable, and replacing the cable may help.
  • Make sure the MTU/MSS settings of the interface are configured as required or as suggested by the ISP.
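Comparing the two snapshots by eye is easy to get wrong, so the counter difference can be computed instead. This is a minimal sketch that assumes the RX line layout shown in the ifconfig output above; the function names are hypothetical.

```python
import re

# Matches the "RX packets:... errors:... dropped:..." line of ifconfig output.
RX_RE = re.compile(
    r"RX packets:\s*(\d+)\s+errors:\s*(\d+)\s+dropped:\s*(\d+)"
)


def rx_counters(ifconfig_text):
    """Extract RX packets, errors and drops from one ifconfig snapshot."""
    match = RX_RE.search(ifconfig_text)
    if match is None:
        raise ValueError("no RX counter line found")
    packets, errors, dropped = (int(group) for group in match.groups())
    return {"packets": packets, "errors": errors, "dropped": dropped}


def rx_delta(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before}


first = "RX packets:688036 errors:548 dropped:88746 overruns:0 frame:0"
second = "RX packets:688036 errors:567 dropped:88863 overruns:0 frame:0"

print(rx_delta(rx_counters(first), rx_counters(second)))
# {'packets': 0, 'errors': 19, 'dropped': 117}
```

With the sample values from Instance 1 and Instance 2, errors grew by 19 and drops by 117 between the two runs.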

Step: 3 ARP

  • The ISP gateway may be losing the ARP entry for the firewall's WAN IP, which can cause the WAN link to be marked dead.
  • At the time of the issue:
    • Either run an ARP ping from the console using the steps below:
      • Open an SSH session.
      • Select option 4 > Console.
      • Run the command: system diagnostics utilities arp ping source <WAN_IP> interface <WAN Interface> <gateway IP>
      • (e.g., system diagnostics utilities arp ping source 192.168.10.10 interface Port4 192.168.10.1)
      • This forces an ARP update on the WAN interface.

    • Or run an ARP ping from the Advanced Shell using the steps below:
      • Open an SSH session.
      • Select option 5 > option 3 > Advanced Shell.
      • Run the command: arping -I Port1 <gateway IP address> (replace Port1 with the WAN port)
  • Alternatively, open the interface and click Save (without making any changes) to trigger the ARP request.
  • If the issue still occurs after triggering the ARP ping manually, log a support case with all of the above information at support.sophos.com so the issue can be investigated further.
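If you keep the WAN details from the "Required information" list in a script or inventory, the two ARP ping commands can be generated from them to avoid typing mistakes during an outage. This is a minimal sketch: the helper name is hypothetical, and the command strings are copied from the steps above, not taken from any other reference.

```python
import ipaddress


def arp_ping_commands(wan_ip, interface, gateway_ip):
    """Build the console and Advanced Shell ARP ping commands."""
    # ip_address() raises ValueError if either address is malformed.
    wan = ipaddress.ip_address(wan_ip)
    gateway = ipaddress.ip_address(gateway_ip)
    console = (
        f"system diagnostics utilities arp ping source {wan} "
        f"interface {interface} {gateway}"
    )
    shell = f"arping -I {interface} {gateway}"
    return console, shell


console_cmd, shell_cmd = arp_ping_commands(
    "192.168.10.10", "Port4", "192.168.10.1"
)
print(console_cmd)
print(shell_cmd)
```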


[edited by: Erick Jan at 2:15 PM (GMT -7) on 26 Sep 2023]
Parents
  • In the Step 1 notes: the gateway is not a good destination for determining Internet functionality. It is only the next hop, an on-prem IP, so it is not a guarantee of Internet access.
    Choosing OR is insufficient for determining status.

    ex.
    8.8.8.8
    or
    Gw IP

    Just because 8.8.8.8 doesn't respond does not mean internet is down - it is a single (usually reliable) IP.  1 IP should not be considered an absolute answer.
    GW IP going down - generally speaking, if it doesn't respond, yes, your internet is down (but I have seen cases where it is not set to respond to ICMP packets)


    8.8.8.8
    or
    1.1.1.1

    Just because one does not respond - does not mean internet is down.  It means a particular route, or the IP is not responding.


    8.8.8.8
    AND
    Gw IP

    Your internet could be down, but Gateway IP will respond because it's next hop and on premise.


    8.8.8.8
    AND
    1.1.1.1

    If BOTH are not responding, then there's a significant chance your internet (through that WAN port) is not working properly.
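The combinations in this comment can be checked with a small truth-table sketch. It assumes the commenter's reading of the rule: the gateway is marked dead when the failure conditions, joined by AND or OR, evaluate to true. The function name and this model are illustrative assumptions, not the firewall's actual implementation.

```python
def gateway_is_dead(probe_ok, operator):
    """probe_ok maps each target to True if its probe got a reply."""
    failures = [not ok for ok in probe_ok.values()]
    if operator == "AND":
        return all(failures)  # dead only if every target fails
    if operator == "OR":
        return any(failures)  # dead as soon as one target fails
    raise ValueError("operator must be 'AND' or 'OR'")


# 8.8.8.8 stops answering while the Internet is otherwise reachable.
one_target_down = {"8.8.8.8": False, "1.1.1.1": True}
print(gateway_is_dead(one_target_down, "OR"))   # True (false alarm)
print(gateway_is_dead(one_target_down, "AND"))  # False

# Upstream outage: the on-prem gateway still answers, 8.8.8.8 does not.
upstream_outage = {"8.8.8.8": False, "Gw IP": True}
print(gateway_is_dead(upstream_outage, "AND"))  # False (outage missed)

# Both public targets fail.
both_down = {"8.8.8.8": False, "1.1.1.1": False}
print(gateway_is_dead(both_down, "AND"))        # True
```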
