Intermittent Connectivity Loss Between Sophos XGS and Remote Machines Over IPSec S2S VPN

Description:

Since updating NSX in our remote Virtual Data Center, we have observed intermittent connectivity loss between our Sophos XGS136 (SFOS 20.0.3 MR-3-Build427) and machines on the other side of the IPSec S2S VPN tunnel.

This issue does not affect connectivity between local network computers and remote machines; only traffic originating from the firewall itself is affected.

This is a critical problem, as our remote network contains an Active Directory (AD) controller. When the firewall loses connectivity, user authentication stops working.

Tests and Observations:

To pinpoint the issue, we have conducted several tests, and here are our findings:

  • When the issue occurs, Sophos cannot reach the AD server (Authentication > Servers > Test connection fails).
  • When the issue occurs, ping from the firewall (10.0.1.1) to any machine in the remote network (including the AD server 10.0.0.110) fails, while pinging from any other local machine works fine.
  • Our remote VDC provider analyzed network logs and confirmed that return packets are being sent from their side.
  • Packet capture in the Sophos web console confirms that, during the outage, return packets are seen but are not reaching the process initiating the ping.

Log Example (Showing the Outage Window):

A ping is sent every 5 seconds from Sophos (10.0.1.1) to the remote AD server (10.0.0.110). Note the gap between sequence 445 and 1003: 557 replies are missing, covering about 46 minutes (11:36:15 to 12:22:45).

[2025-02-21 11:36:05] 64 bytes from 10.0.0.110: seq=443 ttl=127 time=9.284 ms
[2025-02-21 11:36:10] 64 bytes from 10.0.0.110: seq=444 ttl=127 time=8.971 ms
[2025-02-21 11:36:15] 64 bytes from 10.0.0.110: seq=445 ttl=127 time=9.482 ms
--- Outage ---
[2025-02-21 12:22:45] 64 bytes from 10.0.0.110: seq=1003 ttl=127 time=9.232 ms
[2025-02-21 12:22:50] 64 bytes from 10.0.0.110: seq=1004 ttl=127 time=9.373 ms
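
To quantify outage windows from a longer log of this kind, a small script can scan for jumps in the seq value. The following is only a sketch: the function names (parse_replies, find_outages) are made up for illustration, it assumes every reply line carries the same [YYYY-MM-DD HH:MM:SS] prefix as in the excerpt above, and it does not handle the sequence counter wrapping around.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2025-02-21 11:36:05] 64 bytes from 10.0.0.110: seq=443 ttl=127 time=9.284 ms
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\d+ bytes from \S+: seq=(?P<seq>\d+)"
)


def parse_replies(lines):
    """Yield (timestamp, seq) for each line that looks like a ping reply."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            yield ts, int(m["seq"])


def find_outages(lines):
    """Return (last_reply_ts, next_reply_ts, missed) for each seq gap.

    Assumption: seq increases by 1 per ping and never wraps around.
    """
    outages = []
    prev = None
    for ts, seq in parse_replies(lines):
        if prev is not None and seq - prev[1] > 1:
            outages.append((prev[0], ts, seq - prev[1] - 1))
        prev = (ts, seq)
    return outages


# The excerpt from this post; non-reply lines are simply skipped.
SAMPLE = """\
[2025-02-21 11:36:05] 64 bytes from 10.0.0.110: seq=443 ttl=127 time=9.284 ms
[2025-02-21 11:36:10] 64 bytes from 10.0.0.110: seq=444 ttl=127 time=8.971 ms
[2025-02-21 11:36:15] 64 bytes from 10.0.0.110: seq=445 ttl=127 time=9.482 ms
--- Outage ---
[2025-02-21 12:22:45] 64 bytes from 10.0.0.110: seq=1003 ttl=127 time=9.232 ms
[2025-02-21 12:22:50] 64 bytes from 10.0.0.110: seq=1004 ttl=127 time=9.373 ms
"""

if __name__ == "__main__":
    for start, end, missed in find_outages(SAMPLE.splitlines()):
        print(f"{start} -> {end}: {missed} replies missed ({end - start})")
```

Run against the excerpt, this reports a single gap of 557 missed replies lasting 0:46:30, which matches the 5-second ping interval.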

Captured Return Packet (seq 974) in Sophos Packet Capture (During the Issue):

This confirms that the remote server is responding, but the firewall does not process the reply:

Ethernet header
Source MAC address:[removed]
Destination MAC address: [removed]
Ethernet type IPv4 (0x800)
IPv4 Header
Source IP address:10.0.0.110
Destination IP address:10.0.1.1
Protocol: ICMP
Header:20 Bytes
Type of service: 0
Total length: 84 Bytes
Identification:11366
Fragment offset:0
Time to live: 127
Checksum: 63956
ICMP Header:
Type: 0
Code: 0
Echo ID: 51060

Additional Notes:

  • The issue is intermittent, with outages lasting between 15 minutes and 2 hours.
  • Most of the time, the connection works fine.
  • Restarting the IPSec S2S VPN temporarily fixes the issue, but it eventually recurs.

Request for Assistance:

  1. What could be causing Sophos to stop processing return packets from the remote network while still capturing them?
  2. Why does local network traffic remain unaffected, but the firewall itself loses connectivity?
  3. How can we debug this further to pinpoint the root cause?

I have attached screenshots of our IPSec VPN settings on both sides below for reference:

Local site:



Remote site:



Removed 'subject' from subject
[edited by: Pawel_L at 1:33 PM (GMT -8) on 21 Feb 2025]