v17 MR5: VPN still unstable!

Hi,

 

I upgraded to MR5 yesterday and all went great. Suddenly this evening, tunnels started dropping up and down, and I am being "spammed" with notifications from my SFM that tunnels are terminated.

charon.log shows a lot of these:

invalid ID_V1 payload length, decryption failed?

I have read here:
Sophos XG Firewall: Cannot handle more than 2 concurrent Quick Mode exchanges per IKE_SA when using IKEv1

that there are issues in MR5 that will be resolved in MR6, but these errors should read:
"invalid HASH_V1 payload length, decryption failed?"
as stated in the KB above.

I have 4 tunnels on my XG.

Are others seeing this?

A little more log:
2018-01-29 19:54:58 10[ENC] <622> invalid ID_V1 payload length, decryption failed?
2018-01-29 19:54:58 10[ENC] <622> could not decrypt payloads
2018-01-29 19:54:58 10[IKE] <622> message parsing failed
2018-01-29 19:54:58 10[ENC] <622> generating INFORMATIONAL_V1 request 158523599 [ HASH N(PLD_MAL) ]
2018-01-29 19:54:58 10[NET] <622> sending packet: from x.x.x.x[500] to 5.103.12.171[500] (76 bytes)
2018-01-29 19:54:58 10[IKE] <622> ID_PROT request with message ID 0 processing failed
2018-01-29 19:54:58 10[DMN] <622> [GARNER-LOGGING] (child_alert) ALERT: parsing IKE message from x.x.x.x[500] failed
2018-01-29 19:54:58 19[JOB] <622> deleting half open IKE_SA with x.x.x.x after timeout
2018-01-29 19:54:58 19[DMN] <622> [GARNER-LOGGING] (child_alert) ALERT: IKE_SA timed out before it could be established
All tunnels are unstable while this is happening. Until yesterday, on MR3, it worked great for weeks!
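To check whether a charon.log is full of ID_V1 errors (as here) or the HASH_V1 errors described in the KB, the messages can be tallied by payload type. The following is only a rough sketch: it assumes each log entry sits on a single line in the format shown above, and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter

# One entry per line: timestamp, thread[GROUP], optional <sa-id>, then the message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<thread>\d+)\[(?P<group>[A-Z]+)\] "
    r"(?:<(?P<sa>[^>]*)> )?"
    r"(?P<msg>.*)$"
)
# Captures the payload type, e.g. ID_V1 or HASH_V1.
PAYLOAD_RE = re.compile(r"invalid (\w+) payload length, decryption failed\?")


def count_invalid_payloads(lines):
    """Count 'invalid <TYPE> payload length' messages per payload type."""
    counts = Counter()
    for line in lines:
        entry = LINE_RE.match(line.strip())
        if not entry:
            continue  # skip wrapped or unrelated lines
        payload = PAYLOAD_RE.search(entry.group("msg"))
        if payload:
            counts[payload.group(1)] += 1
    return counts


# Hypothetical sample, modelled on the excerpt above.
sample = [
    "2018-01-29 19:54:58 10[ENC] <622> invalid ID_V1 payload length, decryption failed?",
    "2018-01-29 19:54:58 10[ENC] <622> could not decrypt payloads",
    "2018-01-29 19:54:58 10[IKE] <622> message parsing failed",
]

if __name__ == "__main__":
    print(count_invalid_payloads(sample))
```

A count dominated by ID_V1 rather than HASH_V1 would at least suggest the symptom differs from the one the KB describes.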

  • Hi All,

    If you are facing issues due to a matching condition as mentioned in the KBA here: https://community.sophos.com/kb/en-us/128175 then please be assured that it will be fixed in the MR 6 release. As stated in the KB article, this is not a Sophos-specific issue; it stems from the strongSwan implementation.

    If you are facing a different problem than what is stated in the referenced KB article, then please let us know.

    Thanks

  • In reply to sachingurung:

    I've been using your XG firewalls for just a month now.  I have XG-XG at 3 locations. I had to rebuild my first implementation because, unknown to me, the firmware that was installed already had faulty IPSEC.  It's been a damn headache ever since.  I lose at least one site a week: random reboots on one, one site that doesn't re-establish the tunnel after internet loss, and one RED device that randomly stops sending traffic.  Even after 17.0.5.  Do you guys have your stuff together over there, or did I make a terrible decision switching to Sophos? My last firewalls never had to be rebooted and fought with this much.

  • In reply to Adrian De Santi:

    I'm guessing they're modifying max_ikev1_exchanges in the strongSwan config.  I think the default is 3, but I don't have access to the CLI at the moment to confirm.

     

    -Scott
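For anyone who can get at a strongswan.conf-style file, here is a small sketch for reading the value Scott mentions. Upstream strongSwan documents charon.max_ikev1_exchanges with a default of 3. Everything else here is an assumption: where the file lives on an XG (or whether it is readable at all), the function name, and the sample text are hypothetical, and the parser does not check that the key actually sits inside the charon section.

```python
import re

# Upstream strongSwan default for charon.max_ikev1_exchanges.
STRONGSWAN_DEFAULT = 3

SETTING_RE = re.compile(r"\bmax_ikev1_exchanges\s*=\s*(\d+)")


def max_ikev1_exchanges(conf_text, default=STRONGSWAN_DEFAULT):
    """Return the first uncommented max_ikev1_exchanges value, else the default."""
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0]  # drop '#' comments
        match = SETTING_RE.search(line)
        if match:
            return int(match.group(1))
    return default


# Hypothetical config text, for illustration only.
sample_conf = """
charon {
    # max_ikev1_exchanges = 3
    max_ikev1_exchanges = 10
}
"""

if __name__ == "__main__":
    print(max_ikev1_exchanges(sample_conf))
```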

  • In reply to sachingurung:

    The group discussion linked from your knowledge base article raised this problem back in 2015. It does appear, though, that in MR-5 you are using an older version of strongSwan than the current 5.6 release.  Just pointing this out.

     

    -Scott

  • In reply to Scott_D_L:

    Hi Scott,

    Thanks. It appears I can't make this change myself via the device console. Fairly annoying, as I would like to try implementing it.

  • In reply to sachingurung:

    Hi Sachin,

     

    Can you please advise on the config changes and how they can be implemented on the current MR5 version? We have support ticket #7909029 open, but we have not been advised or told to implement this config fix.

     

    Thanks,

     

    Adrian

  • In reply to Nick Fritzler:

    Hi Nick,

    If you have a case logged in support then please PM me the ID and I will take a look to investigate further.

    Thanks

  • In reply to Scott_D_L:

    Has anybody else had the "fix" applied and gotten results one way or the other?

    I just had the VPN on one of our XGs go down again last night, and the SAs did not re-establish themselves automatically. It had been running for 6 days without an issue. I had to manually disable and re-enable the connection to bring it up again. (sigh) I just want accurate data points, not a bunch of bitching at Sophos. We all get it, there are issues here; I'm just trying to get this issue addressed.

    -Scott

  • In reply to Scott_D_L:

    Have any of you using XG to XG swapped to SSL VPN?  I've had some good results with using that.  Biggest issue I have is that if you modify or create a connection, it will bounce every connection as the SSL VPN service restarts or something.  But the Site-to-Site VPNs with SSL have been stable.

  • In reply to Chris Shipley:

    SSL VPN works but not everyone has a Sophos XG on the other side so this isn't always an option.  The REDs have been stable for me in my testing, but I don't have any in production.

     

    I would say if you have critical VPNs you have two options with Sophos XG:

     

    1.  Downgrade to v16, where the IPSec VPN actually works. All of the promises for MR 5 are now being delayed to MR 6.  I'm going to give Sophos the benefit of the doubt on this one and hope the pattern doesn't repeat past the 5th major release, but these strongSwan issues have been obvious since the GA release, and having so many issues 5 revisions in is not acceptable.  I'm amazed how many people stayed on v17 given how many bugs there were; look at the issues "resolved" list for MR 5 alone.

    2.  Switch to RED or SSL VPN, of course you need a Sophos XG on the other side to support this.

     

    When is the SFM going to support v17 MR-5?  To me, I consider the Sophos XG an open-source project now.  Sophos can blame everyone's "testing procedure" all they want, but when critical bugs are being missed in QA repeatedly, it's time to step up your game Sophos.

  • In reply to Chris Shipley:

    Don't think that will work for us.  We need to encrypt a GRE tunnel through to a Cisco device for encapsulation of routing protocols (BGP) and multicast.

     

    -Scott

  • In reply to Ashok Sethi:

    Ashok Sethi
    2.  Switch to RED or SSL VPN, of course you need a Sophos XG on the other side to support this.

    SSL VPN now works with UTM 9 as well (it did not before).  But also only a Sophos product.

    Ashok Sethi
    When is the SFM going to support v17 MR-5?  To me, I consider the Sophos XG an open-source project now.  Sophos can blame everyone's "testing procedure" all they want, but when critical bugs are being missed in QA repeatedly, it's time to step up your game Sophos.


    I agree with your assessment here.  The UTM 9 dev team seems to have this down pat.  What happened to them, did they get cut or are they migrating to the XG dev team?  The Sophos UTM Manager is fantastic compared to the Sophos Firewall Manager for XG/SFOS.  It supports every UTM9 release as soon as they release it.

    Why is the determination on which features to add using the voting platform from the Community posts?  We were promised feature parity in SFOS v15 and we don't have anywhere near that.  The email filtering in XG is so bad, I am installing "Software UTM 9 - 10 IPs" installs in order to get workable email filtering.  Extra license costs just to do something well that I'm being charged already to do in SFOS 15, 16, 16.05, 17...

    There are only TWO REASONS I've been moving clients to the XG platform.  I can get "easy" monthly license pricing and synchronized security.  If they just added these to UTM 9 I'd go right back.

  • In reply to Scott_D_L:

    Scott_D_L
    Don't think that will work for us.  We need to encrypt a GRE tunnel through to a Cisco device for encapsulation of routing protocols (BGP) and multicast.

    No SSL VPN for you!  Correct, it'll have to be IPSec.  I have a number of devices running v16.05 MR8 for this reason.

  • In reply to Scott_D_L:

    Hey Scott,

    My issue was caused by load balancing: IKE messages were being sent out the wrong internet connection, while traffic from the far end came in on the correct port on the XG. I'm not sure whether you have multiple internet connections, but if you do, this may be causing your issue.

  • We have a running setup with 6 XG85 firewalls in a full-mesh VPN. All firewalls run the latest XG 17.0.6 code.

    IPSEC VPN is extremely unstable; the tunnels are flapping within seconds. This drives the CPU of the XG85s up to 100%.

    SSL S2S VPN works without problems as a workaround, except that you cannot add or delete tunnel configs without disconnecting all sites. Furthermore, you cannot monitor SSL S2S from CFM.

     

     

     

    Firewall y.y.199.126
    XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    THY_TO_RHF-1[154]: ESTABLISHED 6 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[153]: ESTABLISHED 16 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[152]: ESTABLISHED 21 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[151]: ESTABLISHED 32 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_WAR-1[3]: ESTABLISHED 21 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    THY_TO_RHN-1[155]: ESTABLISHED 6 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[154]: ESTABLISHED 14 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[153]: ESTABLISHED 24 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHN-1[151]: ESTABLISHED 40 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_WAR-1[3]: ESTABLISHED 21 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    THY_TO_RHN-1[157]: ESTABLISHED 1 second ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[156]: ESTABLISHED 10 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[155]: ESTABLISHED 18 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[154]: ESTABLISHED 26 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[153]: ESTABLISHED 36 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_WAR-1[3]: ESTABLISHED 21 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
    XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    THY_TO_RHF-1[158]: ESTABLISHED 6 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[157]: ESTABLISHED 11 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[156]: ESTABLISHED 20 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[155]: ESTABLISHED 28 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_WAR-1[3]: ESTABLISHED 21 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    THY_TO_RHN-1[159]: ESTABLISHED 2 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[158]: ESTABLISHED 14 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[157]: ESTABLISHED 19 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHN-1[155]: ESTABLISHED 36 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_WAR-1[3]: ESTABLISHED 22 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6#
    XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    THY_TO_RHN-1[163]: ESTABLISHED 1 second ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHF-1[162]: ESTABLISHED 14 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
    THY_TO_RHN-1[161]: ESTABLISHED 16 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_RHN-1[159]: ESTABLISHED 32 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
    THY_TO_WAR-1[3]: ESTABLISHED 22 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
    XG85_AM02_SFOS 17.0.6 MR-6#

    Firewall x.x.172.166
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    WAR_TO_RHN-1[12583]: ESTABLISHED 14 seconds ago, x.x.172.166[x.x.172.166]...x.x.172.170[x.x.172.170]
    WAR_TO_THY-1[12494]: ESTABLISHED 25 minutes ago, x.x.172.166[x.x.172.166]...y.y.199.126[y.y.199.126]
    WAR_TO_RHF-1[12008]: ESTABLISHED 2 hours ago, x.x.172.166[x.x.172.166]...x.x.172.174[x.x.172.174]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    WAR_TO_RHN-1[12584]: ESTABLISHED 5 seconds ago, x.x.172.166[x.x.172.166]...x.x.172.170[x.x.172.170]
    WAR_TO_RHN-1[12583]: ESTABLISHED 20 seconds ago, x.x.172.166[x.x.172.166]...x.x.172.170[x.x.172.170]
    WAR_TO_THY-1[12494]: ESTABLISHED 25 minutes ago, x.x.172.166[x.x.172.166]...y.y.199.126[y.y.199.126]
    WAR_TO_RHF-1[12008]: ESTABLISHED 2 hours ago, x.x.172.166[x.x.172.166]...x.x.172.174[x.x.172.174]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    WAR_TO_RHN-1[12584]: ESTABLISHED 12 seconds ago, x.x.172.166[x.x.172.166]...x.x.172.170[x.x.172.170]
    WAR_TO_THY-1[12494]: ESTABLISHED 25 minutes ago, x.x.172.166[x.x.172.166]...y.y.199.126[y.y.199.126]
    WAR_TO_RHF-1[12008]: ESTABLISHED 2 hours ago, x.x.172.166[x.x.172.166]...x.x.172.174[x.x.172.174]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#

    Firewall x.x.172.170
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    RHN_TO_THY-1[25857]: ESTABLISHED 9 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25856]: ESTABLISHED 10 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_RHF-1[25855]: ESTABLISHED 15 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.174[x.x.172.174]
    RHN_TO_THY-1[25854]: ESTABLISHED 26 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25853]: ESTABLISHED 28 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_THY-1[25851]: ESTABLISHED 41 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    RHN_TO_THY-1[25857]: ESTABLISHED 14 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25856]: ESTABLISHED 15 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_RHF-1[25855]: ESTABLISHED 20 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.174[x.x.172.174]
    RHN_TO_THY-1[25854]: ESTABLISHED 31 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    RHN_TO_THY-1[25860]: ESTABLISHED 2 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25859]: ESTABLISHED 2 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_RHF-1[25858]: ESTABLISHED 5 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.174[x.x.172.174]
    RHN_TO_THY-1[25857]: ESTABLISHED 20 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25856]: ESTABLISHED 21 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_RHF-1[25855]: ESTABLISHED 26 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.174[x.x.172.174]
    RHN_TO_THY-1[25854]: ESTABLISHED 37 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
    RHN_TO_THY-1[25860]: ESTABLISHED 7 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25859]: ESTABLISHED 7 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_RHF-1[25858]: ESTABLISHED 9 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.174[x.x.172.174]
    RHN_TO_THY-1[25857]: ESTABLISHED 24 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    RHN_TO_WAR-1[25856]: ESTABLISHED 25 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.166[x.x.172.166]
    RHN_TO_RHF-1[25855]: ESTABLISHED 30 seconds ago, x.x.172.170[x.x.172.170]...x.x.172.174[x.x.172.174]
    RHN_TO_THY-1[25854]: ESTABLISHED 41 seconds ago, x.x.172.170[x.x.172.170]...y.y.199.126[y.y.199.126]
    XG85_XN01_SFOS 17.0.6 MR-6#
    XG85_XN01_SFOS 17.0.6 MR-6#
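Output like the above can be summarized with a short script. This is a sketch only: it assumes the "ipsec status | grep ESTA" line format shown in this post, and it uses a simple heuristic that flags a connection when it has more than one ESTABLISHED IKE_SA or when its newest SA is under a minute old. Duplicate SAs can also show up briefly during a normal rekey, so treat the result as a hint. The function names and the 60-second threshold are made up for illustration.

```python
import re
from collections import defaultdict

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

# Matches e.g. "THY_TO_RHF-1[154]: ESTABLISHED 6 seconds ago, ..."
SA_RE = re.compile(
    r"^\s*(?P<name>\S+)\[(?P<sa_id>\d+)\]: ESTABLISHED "
    r"(?P<count>\d+) (?P<unit>second|minute|hour|day)s? ago"
)


def parse_established(output):
    """Return {connection name: [(sa_id, age_in_seconds), ...]}."""
    sas = defaultdict(list)
    for line in output.splitlines():
        match = SA_RE.match(line)
        if match:  # prompt lines and blanks are ignored
            age = int(match.group("count")) * UNIT_SECONDS[match.group("unit")]
            sas[match.group("name")].append((int(match.group("sa_id")), age))
    return dict(sas)


def flapping(sas, young=60):
    """Names with several IKE_SAs, or whose newest SA is younger than `young` seconds."""
    return sorted(
        name
        for name, entries in sas.items()
        if len(entries) > 1 or min(age for _, age in entries) < young
    )


# First capture from firewall y.y.199.126 above.
sample = """\
XG85_AM02_SFOS 17.0.6 MR-6# ipsec status | grep ESTA
THY_TO_RHF-1[154]: ESTABLISHED 6 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
THY_TO_RHN-1[153]: ESTABLISHED 16 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
THY_TO_RHF-1[152]: ESTABLISHED 21 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.174[x.x.172.174]
THY_TO_RHN-1[151]: ESTABLISHED 32 seconds ago, y.y.199.126[y.y.199.126]...x.x.172.170[x.x.172.170]
THY_TO_WAR-1[3]: ESTABLISHED 21 minutes ago, y.y.199.126[y.y.199.126]...x.x.172.166[x.x.172.166]
"""

if __name__ == "__main__":
    print(flapping(parse_established(sample)))
```

Run against the first capture, this flags the RHF and RHN tunnels and leaves THY_TO_WAR-1 alone, which matches what the raw output shows.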