Sophos Firewall: v19.5 GA: Feedback and experiences

Release Post:  Sophos Firewall v19.5 is Now Available 

Old v19.0 MR1 thread:  Sophos Firewall: v19.0 MR1: Feedback and experiences 

EAP Sub thread:  SFOS v19.5 Early Access Program (Read Only) 

EAP 19.5 Thread:  Sophos Firewall: v19.5 EAP1: Feedback and experiences 



  • Does anyone else have problems with two WAN connections? We have two PPPoE connections, one active and one backup. In earlier versions there was a PPPoE problem: if you unplugged the link and plugged it back in, it didn't reconnect. They made a fix for us in the earlier version, and it worked. Now this is fixed globally in 19.5 and the connection is restored fine afterwards. But now there is another problem: the WAN link manager seems to have some issues.

    To describe it more clearly:

    - unplug the active connection

    - the backup takes over, and everything is back online via the backup connection

    - plug the active connection back in

    - the PPPoE connection is restored

    - WAN link manager shows green (both links)

    - now all connections from internal to WAN break. It seems the traffic can't be matched internally.

    Fun fact: connections from outside and VPNs are not affected; their traffic behaves as expected.

    When I go to the console and look for dropped packets, most of the packets are shown as dropped. I then went back to WAN link manager and made the backup connection active, and it worked straight away. When I set it back to backup, the same thing happened again, and so on. I then set the backup link as active with weight 2. Traffic from my workstation worked again, going out via the backup link.

    After that I opened the interface of the normally active link (with weight 1) and only saved it. What happened then was kind of horrible. Node 1 did a hard reboot (the firmware text on the screen was not even cleared, as it is on a normal reboot). After the reboot it still served connections via the backup connection (which was active with weight 2 at that time). I have now changed it back and everything is normal again. We have had this problem several times in the past while doing maintenance on the links.

    Now we are really scared of failovers (even planned ones) and maintenance on our links, as every time we have many minutes without an internet connection until the box finally gets its internal matching of the two WAN links sorted out...

  • Hello K-M,

    Can you please share the appliance access ID via PM?

    Please also share the dropped packet details if you have them handy.

    The development team will try to analyze the historical logs; however, would it be possible for you to conduct a live debug session, if required?

    Regards,

    Sanket Shah

    Director, Software Development, Sophos Firewall

  • See PM.

    No, unfortunately I didn't save the log to a file. But I think the problem was that the traffic was "matched" to Zone 0 in both directions, so it didn't fit any rule. That was the only unusual thing I could see.

    A live debug is only possible under limited conditions, as this is our main firewall in a production environment. If needed, let us see what we can do.

    Thanks for sharing the access details.

    Let us investigate the logs and decide on the next actions.

    Regards,

    Sanket Shah

    Director, Software Development, Sophos Firewall

  • Hello K-M,

    The team has investigated the reported issues using the logs we had. As we couldn't debug it live, we formed some hypotheses based on the behavior you described and the logs we analyzed from the appliance.

    You have highlighted 2 issues:

    1) All connections from internal to WAN broke on reconnecting the active gateway (failback event)

    2) The appliance rebooted while you were updating the gateway

    The 1st issue seems to be expected behavior given the configuration you had at that time.

    From the logs of 10th Jan, it looks like "Serve all connections through restored gateway" was selected, which interrupts existing connections (Image below with highlighted box).

    Generally, admins choose this option if their backup link is too costly or most of their applications run over UDP. For TCP-based traffic (which is stateful by nature), the client application (for example, a browser) has to initiate a new connection, so the user might notice a momentary interruption, or a lasting one for TLS or other secure sessions that require a new handshake.

    The default (and less disruptive) option is "Serve new connections through restored gateway". We noticed that you currently have this option selected.
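
    The difference between the two failback options can be illustrated with a toy model. This is only a sketch of the behavior described above, not how SFOS implements it: the `Connection` class, the `fail_back` function, and the policy names "all" and "new" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    name: str
    gateway: str  # gateway currently carrying this connection


def fail_back(connections, restored_gateway, policy):
    """Toy model of a failback event (hypothetical, not SFOS code).

    policy "all": every existing connection is moved to the restored
                  gateway, so flows that were on the backup are interrupted.
    policy "new": existing connections stay on their current gateway;
                  only connections opened later would use the restored one.
    Returns (kept, interrupted) as two lists of connection names.
    """
    if policy not in ("all", "new"):
        raise ValueError(f"unknown policy: {policy!r}")

    kept, interrupted = [], []
    for conn in connections:
        if policy == "all" and conn.gateway != restored_gateway:
            # The flow is re-routed mid-session; a stateful client has to reconnect.
            conn.gateway = restored_gateway
            interrupted.append(conn.name)
        else:
            kept.append(conn.name)
    return kept, interrupted


if __name__ == "__main__":
    for policy in ("all", "new"):
        conns = [Connection("browser", "backup"), Connection("ssh", "backup")]
        kept, interrupted = fail_back(conns, "primary", policy)
        print(policy, "kept:", kept, "interrupted:", interrupted)
```

    In this model, "all" interrupts both sessions that were running over the backup link, while "new" leaves them untouched until they end on their own.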

    Regarding the 2nd issue, the reboot: we observed that a kernel panic happened at that time. It seems you might be affected by a known issue related to VPN traffic (tracked internally as NC-108226). The team is actively working on a solution. Once we have a pre-fix ready, we would ask you to apply it in your setup and give us your feedback.

    I would ask you to re-test the steps from your initial report after applying this pre-fix (although the pre-fix is unrelated to the first issue, so you can also try now, during off-peak hours, since you mentioned this is a production environment).

    Let me know if you have any further questions in this regard.

    Regards,

    Sanket Shah

    Director, Software Development, Sophos Firewall



    Corrected image
    [edited by: sanket.shah at 3:08 PM (GMT -8) on 25 Jan 2023]