
VPN with Drayteks constant disconnects

Hi,

I manage a number of XGS firewalls in my customers' main offices; their branch/remote offices use Draytek routers of various models. I had not paid attention to the tunnels until one customer reported intermittent issues with the Site2Site IPSec VPN. I checked the IPSec logs on a few XGS firewalls, and I can see that all tunnels disconnect and reconnect frequently, as shown in the log below.

I have set the IKE key lifetimes slightly differently on the two sides, say 28800 on one side and 30000 on the other. Could this be the issue?

Weird fact: all 8 branch offices use the same VPN settings and the same Drayteks, but only 2 of the 8 terminate constantly; the rest drop maybe once per day.

Any idea where to start resolving?

Time Log comp Status Message
16.09.2024 11:56 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 11:42 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 11:22 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 11:06 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 10:32 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 10:32 IPSec  Deny Received IKE message with invalid SPI (37D3016A) from the remote gateway.
16.09.2024 10:32 IPSec  Deny Received IKE message with invalid SPI (37D3016A) from the remote gateway.
16.09.2024 10:31 IPSec  Deny Received IKE message with invalid SPI (37D3016A) from the remote gateway.
16.09.2024 10:31 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 10:31 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 10:31 IPSec  Failed VPN_OFFICE-1 - IKE message (20003720) retransmission to 11.22.59.88 timed out. Check if the remote gateway is reachable. (Remote: 11.22.59.88)
16.09.2024 10:31 IPSec  Failed VPN_OFFICE-1 - IKE message (20003720) retransmission to 11.22.59.88 timed out. Check if the remote gateway is reachable. (Remote: 11.22.59.88)
16.09.2024 10:14 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 09:50 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 09:37 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 09:13 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 09:00 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 08:39 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 08:23 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 08:02 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 07:49 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 07:24 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 07:12 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 06:47 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 06:34 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 05:57 IPSec  Established VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 established. (Remote: 11.22.59.88)
16.09.2024 05:57 IPSec  Deny Received IKE message with invalid SPI (E759F7AD) from the remote gateway.
16.09.2024 05:57 IPSec  Deny Received IKE message with invalid SPI (E759F7AD) from the remote gateway.
16.09.2024 05:56 IPSec  Deny Received IKE message with invalid SPI (E759F7AD) from the remote gateway.
16.09.2024 05:56 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 05:56 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)
16.09.2024 05:56 IPSec  Failed VPN_OFFICE-1 - IKE message (2C50) retransmission to 11.22.59.88 timed out. Check if the remote gateway is reachable. (Remote: 11.22.59.88)
16.09.2024 05:56 IPSec  Failed VPN_OFFICE-1 - IKE message (2C50) retransmission to 11.22.59.88 timed out. Check if the remote gateway is reachable. (Remote: 11.22.59.88)
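
To see how long the tunnel actually stays up between events, the log above can be summarised with a small script. This is only a rough sketch: it assumes the export looks exactly like the lines above (date, time, `IPSec`, status, message, newest entry first), and the function names `parse_events` and `uptimes_minutes` are made up for illustration.

```python
import re
from datetime import datetime

# Matches the start of one exported log line, e.g.
# "16.09.2024 11:56 IPSec  Terminated VPN_OFFICE-1 - ..."
LINE_RE = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2})\s+IPSec\s+"
    r"(Established|Terminated|Failed|Deny)\b"
)

def parse_events(text):
    """Return (timestamp, status) tuples in chronological order."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%d.%m.%Y %H:%M")
            events.append((ts, m.group(2)))
    # Assumption: the export is newest-first. Reverse it, then use a
    # stable sort so entries within the same minute keep that order.
    events.reverse()
    events.sort(key=lambda e: e[0])
    return events

def uptimes_minutes(events):
    """Minutes between each 'Established' and the next 'Terminated'."""
    up_since = None
    result = []
    for ts, status in events:
        if status == "Established":
            up_since = ts  # a repeated 'Established' restarts the clock
        elif status == "Terminated" and up_since is not None:
            result.append(int((ts - up_since).total_seconds() // 60))
            up_since = None
    return result
```

Feeding the whole log text to `uptimes_minutes(parse_events(text))` gives one number per up/down cycle, which makes it easier to tell a fixed interval (a timer such as rekey or DPD) from random drops (line quality).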


  • Hi Andrej,

    Thank you for reaching out to Sophos Community.

    Kindly check the following similar post 

    community.sophos.com/.../site-to-site-vpn-problem-invalid-spi

    Erick Jan
    Community Support Engineer | Sophos Technical Support

  • Hi Erick,

    Thank you, but that post was my basis for setting up the VPN tunnels. I used the recommended key lifetimes and retransmit times from the built-in BranchOffice and MainOffice examples in XGS. As this follows the same logic as described in that document, I am fairly sure it is set up properly.

    But from the logs I can see the VPN tunnel dropping every few minutes, so it's not the key lifetime.

    EDIT: Well, weird things are going on here with my XGS. I should probably open another discussion: all pings show frequent dropouts, be it from LAN to the Sophos LAN adapter, or from LAN to WAN or the VPN tunnel. Meanwhile, everything within the LAN, except the Sophos LAN adapter, pings indefinitely without a single dropped packet, so the switches are OK: no loops, no duplicate IPs.

    There must be something wrong with this XGS, either a configuration error or a hardware failure (weird for a new box).

  • Hi again,

    Two days later, there are still weird problems with the VPN.

    The packet drops were an issue on the XGS: it was a port failure, and we replaced the whole unit with a new XGS.

    The issue with the IPSec VPN tunnels remains:

    • Some of the Site2Site IPSec VPN tunnels keep disconnecting and reconnecting every 1-2 minutes, while other tunnels are rock stable, without a single disconnect. All are configured the same.
    • Another problem is one server within the LAN, which dials out its own OpenVPN tunnel. It also keeps dropping every now and then. And I can see a ton of these errors in the IPSec logs. I have no idea which VPN they relate to: is it one of the VPN tunnels configured on the Sophos, or the one that only traverses it?
      messageid="18050" log_type="Event" log_component="IPSec" log_subtype="System" status="Deny" user="" con_name="" con_type="0" src_ip="" gw_ip="" local_network="" dst_ip="" remote_network="" additional_information="" message="Received IKE message with invalid SPI (CC85BCAE) from the remote gateway." 
      What could this be caused by?
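
To check which fields of such an event are actually filled in, the key="value" pairs can be split out with a few lines of Python. This is a minimal sketch that assumes every field has the `name="value"` form shown in the line above; the helper names `parse_fields` and `invalid_spi` are hypothetical.

```python
import re

# One key="value" pair; values in the sample never contain a double quote.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')
# The SPI is printed in parentheses inside the message text.
SPI_RE = re.compile(r"invalid SPI \(([0-9A-Fa-f]+)\)")

def parse_fields(line):
    """Turn one log line into a dict of field name -> value."""
    return dict(FIELD_RE.findall(line))

def invalid_spi(fields):
    """Return the SPI from an 'invalid SPI' message, or None."""
    m = SPI_RE.search(fields.get("message", ""))
    return m.group(1) if m else None
```

In the line quoted above, `con_name`, `src_ip` and `gw_ip` all come back empty, so the event itself does not name any configured tunnel or peer address.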
  • Hi  ,

    Around the time of the log entry below, could you collect /log/charon.log and check whether a disconnect is being initiated by either SFOS or the Draytek, and what the possible reason for the disconnect is?

    16.09.2024 11:56 IPSec  Terminated VPN_OFFICE-1 - IPSec Connection VPN_OFFICE-1 between 11.22.59.88 and 33.44.33.44 for Child VPN_OFFICE-1 terminated. (Remote: 11.22.59.88)

    Take any one tunnel that is experiencing frequent disconnects, and correlate the logs on the Draytek and SFOS around the disconnect time.

    Regarding timers: if the Draytek is set as the Initiator of the IPsec tunnel (assuming each Draytek is an Initiator, i.e. a branch), keep the tunnel as 'Responder' (head office) on SFOS, and on the Draytek use lower Phase1 and Phase2 values than the Phase1 and Phase2 values on SFOS. This ensures that each IKE rekey is always started by the Draytek and avoids rekey collisions between the Draytek and SFOS, which would cause tunnel instability/disconnects.
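
The timer rule above can be written down as a quick sanity check to run against the values noted from both devices. This is only a sketch: the `Lifetimes` class, the function name, and the Phase2 numbers in the usage note are made up for illustration, and only 28800 and 30000 come from this thread.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lifetimes:
    """Key lifetimes in seconds, as entered on one side of the tunnel."""
    phase1_s: int
    phase2_s: int

def phases_not_led_by_initiator(initiator, responder):
    """List the phases where the initiator's lifetime is NOT lower.

    An empty list means the initiator (branch) should always rekey
    first, which is the arrangement recommended above.
    """
    problems = []
    if initiator.phase1_s >= responder.phase1_s:
        problems.append("phase1")
    if initiator.phase2_s >= responder.phase2_s:
        problems.append("phase2")
    return problems
```

For example, `phases_not_led_by_initiator(Lifetimes(28800, 3600), Lifetimes(30000, 3900))` returns an empty list, while swapping the two arguments flags both phases.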

    OpenVPN and the log message you mentioned, "Received IKE message with invalid SPI", are two different and unrelated things.

    You may ignore the "Received IKE message with invalid SPI" message: if some unknown sender tries to initiate an IKE session to your SFOS and there is no matching configuration on your SFOS, such incoming packets are considered invalid and are dropped.

  • Hi  ,

    thank you for the explanation!

    Yes, I have set all IKE rekey and key expiration values lower on the Sophos (Responder) and higher on the Drayteks (branch offices, Initiators). From comparing the logs, we suspect DPD might be causing trouble, so we created a test profile without DPD and will check the logs tomorrow.

  • You should use a lower phase1 rekey value on the Initiator tunnels (on the Draytek) than on the Responder tunnel (on SFOS); the same applies to the phase2 rekey value.

  • Oh, after 2 weeks we simply bought 2 new XGS routers and replaced the old Drayteks. We thought the problems were solved... until the owner of a neighbouring company knocked on our door while we were working on the routers, asking whether we had maybe done something to his surveillance system.

    The times matched up, and although we were absolutely surprised (no, we cannot influence your internet; the two companies are on different optical fibres, each with its own ISP modem), we examined the situation and found that at some point someone had somehow cross-linked the LAN ports of our optical modem with our Draytek via the surveillance system's switch in their office. So there was a loop, with occasional IP conflicts and occasional DHCP server failures, all probably due to this loop.

    In the end, we resolved the loop, rewired some of the cabling, and with the new XGS units we ended up with almost 10-fold VPN speed compared to the old Drayteks.