Dropped Connections during Pattern Updates

Since installing multiple XG Firewalls in a multi-site environment, we have been plagued with "random" outages that last between 30 and 90 seconds.

I have finally correlated this with pattern updates for ATP, AV, or IPS.  While the definitions are updating, all connectivity to the XG firewall is lost.  This brings down our wide area network and causes VoIP phones to restart while they look for the phone server.

I have an open support ticket with Sophos but I'm awaiting their response.

I have changed the updates to happen less frequently (daily); however, when there are updates, the connection still goes down (albeit less often now).

Is there a way to still have automatic updates turned on but do them on a time schedule?  I find it utterly ridiculous that the system cannot do pattern updates without bringing down the entire network.

If this is "expected" behavior what have others done as workarounds?  I cannot have 30-90 seconds of downtime every other day for pattern updates. 
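For anyone who wants to check the same correlation on their own site, here is a minimal sketch of one way to do it. It assumes you have already collected outage start times (for example from a monitoring tool) and pattern update times (for example from the firewall's update log) as timestamps. The function name `correlate`, the 120-second window, and the sample timestamps are all made up for illustration and are not taken from my logs or from any Sophos tool.

```python
from datetime import datetime, timedelta


def correlate(outages, updates, window=timedelta(seconds=120)):
    """Pair each outage with the nearest update, if one falls within `window`."""
    matches = []
    for outage in outages:
        # Nearest update by absolute time difference; None if there are no updates.
        nearest = min(updates, key=lambda u: abs(u - outage), default=None)
        if nearest is not None and abs(nearest - outage) <= window:
            matches.append((outage, nearest))
    return matches


# Hypothetical sample timestamps, for illustration only (not real log data).
outages = [datetime(2021, 6, 1, 10, 15, 5), datetime(2021, 6, 2, 14, 0, 0)]
updates = [datetime(2021, 6, 1, 10, 14, 30), datetime(2021, 6, 3, 9, 0, 0)]

for outage, update in correlate(outages, updates):
    print(f"outage at {outage} is close to update at {update}")
```

If most outages pair up with an update and few are left unmatched, that supports the pattern update theory; a proper check would also look at how many updates happened without an outage.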



Added TAGs
[edited by: emmosophos at 9:10 PM (GMT -7) on 28 Jun 2021]
Parents Reply
  • Thanks Bill.  I agree and have seen this article as well.

    But there is currently no fix and no workaround other than to turn off automatic pattern updates?  How can we have a firewall device that drops all connections during pattern updates?  How can I recommend this to enterprise customers?  How do I get more visibility for this?  I've also seen the Sophos Idea to give more control over scheduling these updates, which I have upvoted, but frankly, I don't want to lose connection, EVER.

    I'm awaiting Sophos support to get back to me on my questions above as well, but I just can't fathom how this is acceptable on any level.

    I feel like I am now forced to choose between consistent connectivity (by turning off automatic pattern updates) and security.

Children
  • I agree 100% with you, it is just completely baffling.  

  • Can you try to disable virtual fastpath? 

    console> system firewall-acceleration disable


  • I just got off the phone with support and they suggested the same.  I am told this is tracked as Bug ID NC-70896.

    I will test this and report back.

  • I have applied this to each of the firewalls experiencing the issue (8 of them).  I'm hopeful this resolves it.

  • Any feedback? 


  • LuCar Toni said:

    Any feedback? 

    No.

    https://community.sophos.com/sophos-xg-firewall/f/discussions/123652/internet-traffic-stops-every-time-xg-has-an-ips-or-atp-update

    https://community.sophos.com/sophos-xg-firewall/f/discussions/122951/connection-drops-during-av-pattern-updates

    I have spent days on this issue. Have had it escalated beyond tier two technical support to senior management and eight months later this is still an issue with no solution or even a satisfactory workaround (scheduled updates).

    Why would I waste more time on this? I don't believe there is any issue reproducing this problem so rather than asking your users to do Sophos's job, isn't it about time that Sophos's development team got their finger out, tried your suggestion themselves and fixed the problem?

  • Interesting. I have heard no more negative feedback after this switch was disabled. The point is to find out whether the issue disappears with FastPath disabled, because we can investigate the issue in more depth once we know which module is causing it.

    And if you suffer from this issue and there is a viable workaround, why not use it?


  • Can you tell me why you have to get intel from your users when Sophos can just test this themselves? This has been a serious problem for at least 9 months and I would expect Sophos to be doing everything they can to resolve it themselves.

    This is what Sophos say themselves about Fastpath - "FastPath packet optimization dramatically improves firewall throughput performance by automatically putting trusted and secure packets on the fast path". So why would I want to cripple my XG performance by disabling it?

    I already have a workaround that I have posted here in the forums. If you set updates to every 24 hours and then reboot the XG outside work hours, the updates take place 24 hours after the reboot (and every 24 hours after that). At least you can then avoid the updates happening during working hours and dropping all your internet connections/VOIP sessions when they happen. It's a bit of a fudge because if you have to restart the XG any time during the day, you have to remember to restart it again out of work hours or the updates keep taking place during the day. It also means you can't get updates ASAP but only once every twenty four hours. What would be much better is if Sophos fixed this.
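To make the timing of that reboot workaround concrete, here is a minimal sketch. It assumes the behavior described above (updates fire at a fixed 24-hour interval counted from the last reboot) and a hypothetical working day of 08:00 to 18:00. The function names and the example dates are illustrative only and are not part of any Sophos tool.

```python
from datetime import datetime, time, timedelta


def next_updates(reboot, count=3, interval=timedelta(hours=24)):
    """Update times expected after a reboot, assuming a fixed interval from boot."""
    return [reboot + interval * i for i in range(1, count + 1)]


def in_working_hours(moment, start=time(8, 0), end=time(18, 0)):
    """True if `moment` falls inside the (assumed) working day."""
    return start <= moment.time() < end


# Hypothetical reboot times: one out of hours, one in the middle of the day.
night_reboot = datetime(2021, 6, 28, 2, 0)
day_reboot = datetime(2021, 6, 28, 11, 30)

for label, reboot in [("night", night_reboot), ("day", day_reboot)]:
    for update in next_updates(reboot):
        status = "DURING working hours" if in_working_hours(update) else "outside working hours"
        print(f"{label} reboot -> update at {update}: {status}")
```

This also shows the weak point of the workaround: a single daytime restart shifts every following update into working hours until the firewall is rebooted again out of hours.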

  • I have not had the issue again since running: system firewall-acceleration disable

    I too am concerned I'm missing out on some performance gains by disabling this, but right now it is worth it to me.  I'm hopeful a real fix comes soon.

  • You miss the point. This issue does not seem to impact all customers; only a portion of them are affected, so the affected installation base seems to be smaller. The question is whether the virtual FastPath is causing this issue and, if so, on which appliances, in which situations, and why. DEV is still looking into this issue and is trying to (A) replicate it and (B) find the reason for it in the first place.

    While DEV is working on a solution, a revamp of the ATP/IPS pattern update process is also under development.

    In UTM there was an "easy workaround" for this:

    Restart policy: Select the policy for connection handling when an IPS engine restart is required, for example when the engine is updated.

    Drop (default): All incoming and outgoing connections will be dropped during engine restart.
    Bypass: All incoming and outgoing connections will bypass IPS scanning while the engine is restarting.

    The point is: customers moved to "bypass" despite the security concerns, which is actually bad practice. You could easily ask, "Why not implement a bypass option in SFOS?" But from a security perspective, there are other approaches to begin with. I would neither implement nor enable such a feature in SFOS.

    As you can see, this issue was not there before a release. Somehow the virtual FastPath seems to have an issue with the reload of the engine and drops sessions in certain edge cases, which still needs to be validated.
