IPSEC Tunnels Dying After Some Time

Hi

We have around 40 XGs deployed in Azure for our customers.  We host a solution that needs to connect back to the customer datacenter/premises to pull data over.  A few tunnels on various XGs just seem to die off after a couple of days.  When reviewing strongswan.log, the CHILD_SAs rekey just fine, and then one simply disappears, along with about 75% of the others.  The ones that stay up appear to have been initiated by the other side, so it looks like the strongswan daemon just stops rekeying until we take the tunnel down and bring it back up.

 

  • DPD messages are working
  • Seems to affect mostly (or only) IKEv2 tunnels
  • Traffic initiation from the local side does not build the SA
  • Traffic initiation from the remote side seems to build the SA
  • There is no log event for the SA being deleted; it just vanishes after the last rekey line in strongswan.log
  • Have verified with the customer that all policy settings match
  • Have verified that the crypto map (local/remote subnets) objects match
  • Have had the Cisco admins disable the SA idle-timeout.  ASAs like to remove the SAs after 30 minutes of no traffic and Sophos builds them right back, possibly causing stress to strongswan.
  • Cisco admins have removed the "Bytes Lifetime" from the config (set to unlimited), since Sophos doesn't support that lifetime type
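
For anyone who wants to check the "vanishes after the last rekey line" symptom on their own logs, here is a minimal Python sketch. It assumes nothing about the exact strongswan.log format: it only looks for lines that contain both a tunnel name and a keyword. The function name, the tunnel names, and the keyword "rekey" are placeholders of mine, so adjust them to whatever your log actually prints.

```python
import sys
from typing import Dict, Iterable, Sequence, Tuple


def last_matching_line(
    lines: Iterable[str],
    tunnels: Sequence[str],
    keyword: str = "rekey",  # assumed keyword; change to match your log
) -> Dict[str, Tuple[int, str]]:
    """Return {tunnel: (line_number, text)} for the last line that
    mentions both the tunnel name and the keyword (case-insensitive)."""
    result: Dict[str, Tuple[int, str]] = {}
    needle = keyword.lower()
    for number, text in enumerate(lines, start=1):
        lowered = text.lower()
        if needle not in lowered:
            continue
        for tunnel in tunnels:
            if tunnel.lower() in lowered:
                # Later matches overwrite earlier ones, so the last one wins.
                result[tunnel] = (number, text.rstrip("\n"))
    return result


if __name__ == "__main__":
    # Usage: python last_rekey.py <path-to-log> <tunnel-name> [<tunnel-name> ...]
    if len(sys.argv) < 3:
        sys.exit("usage: last_rekey.py LOGFILE TUNNEL [TUNNEL ...]")
    with open(sys.argv[1], errors="replace") as handle:
        found = last_matching_line(handle, sys.argv[2:])
    for name in sys.argv[2:]:
        if name in found:
            print(f"{name}: line {found[name][0]}: {found[name][1]}")
        else:
            print(f"{name}: no matching line")
```

Comparing the last match for a dead tunnel against one that is still up shows roughly when rekeying stopped, which narrows down the window to capture in debug.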

I would contact "premium support" but they are never helpful.  Has anyone ever gotten decent help from creating an issue on the support portal?  Is this an issue with strongswan itself or with the implementation of it inside XG?

  • Hi NateP2,

    We need to review the strongswan logs in debug mode to identify the cause of the issue.

    Command to put the strongswan service in debug mode: service strongswan:debug -ds nosync

    Please provide the debug logs from around the time of the issue.

    Thanks,

     

  • In reply to H_Patel:

    I don't have one with debug enabled.  Every time we turn that on, the logs are so chatty that by the time we are made aware of a VPN issue, the part of the logs we need has already been rotated out.  We are working on getting syslog implemented, but we are not there yet.

    Is it possible to increase the log rotation history so there are more logs on hand?  For example, strongswan.log.0 up to, say, strongswan.log.10.  Our UTMs keep daily logs, zipped up, going back 7+ years now.

     

    Thanks
    Nate
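
Until syslog is in place, one stopgap would be to snapshot the log files on a schedule before they rotate out. Below is a minimal Python sketch, assuming it runs somewhere that can read the log directory (or a copy of it pulled off the firewall). The function name, both directory paths, and the file pattern are placeholders, not anything documented for XG.

```python
import gzip
import shutil
import time
from pathlib import Path
from typing import List, Union


def archive_logs(
    log_dir: Union[str, Path],
    archive_dir: Union[str, Path],
    pattern: str = "strongswan.log*",  # assumed naming, per the thread
) -> List[Path]:
    """Copy every file matching `pattern` in `log_dir` into `archive_dir`
    as a timestamped .gz file and return the paths that were created."""
    source = Path(log_dir)
    target = Path(archive_dir)
    target.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    created: List[Path] = []
    for log_file in sorted(source.glob(pattern)):
        if not log_file.is_file():
            continue
        destination = target / f"{log_file.name}.{stamp}.gz"
        # Stream the copy so large debug logs are not loaded into memory.
        with log_file.open("rb") as src, gzip.open(destination, "wb") as dst:
            shutil.copyfileobj(src, dst)
        created.append(destination)
    return created
```

Calling `archive_logs("/path/to/logs", "/path/to/archive")` from a scheduled job more often than the logs rotate would keep a compressed history around, at the cost of overlapping copies.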

  • In reply to NateP2:

    Do you have a screenshot of your IPsec policy?