Internet becomes unresponsive after several days?

This is the second time this has occurred since I started using v18 EAP. I also had this issue a couple of times when running v17, but it wasn't as frequent. With v18 EAP, after Sophos XG has been running for several days (over a week), the internet sometimes becomes unresponsive, meaning I can't access anything. For example, if I try to access a website, it just keeps trying to load and eventually times out. At first, I thought it was an ISP issue, so I would reset my cable modem, but that didn't fix it. I can still access devices on my local network just fine, such as the Sophos XG web UI. What I did notice in the web UI is that the "Sessions" count under System in the Control Center shows a very high number when I'm having these issues. It seems to fluctuate from ~800 up to 2.5k. I have about 30-40 devices on my network (one computer, mobile devices, smart home devices, etc.). Typically, my Sessions count is somewhere around 20-50. After restarting Sophos XG, the count goes back down to what I normally see and everything works fine.

Anyone else experiencing similar issues? Is there any specific log I can save when this issue occurs? Unfortunately, I'm running this on my home network so I can't just leave it in an unusable state.

Parents
  • Hi all,

    thanks to this thread I noticed that switching off ATP significantly improves the loading speed of websites. Generally, I'm more interested in improving latency and response times than in the extensively discussed bandwidth tests, which are always fine with XG anyway :-).

    Is this noticeable performance drop expected behavior when using ATP, or is it related to the known bug? (I have "sessionntbl" log entries in ips.log as well.)

    And a second question: what exactly is "untrusted content" (it seems to be a new ATP setting), and when is it relevant?

    Thanks and best regards

    Dom

Children
  • Both ATP and the DPI mode of web are implemented within Snort. ATP also has new functionality in v18; sorry, I don't know the details.

    Right now there are several issues with ATP and how it affects DPI mode. Some of these are fixed in GA and some are targeted for MR1.

    My personal recommendation right now is to turn off ATP unless you need it. After MR1 (or whatever the first major fix release is), you can turn it on again.

  • Hi Michael,

    thanks for your reply.

    How do I determine if I need it? :)

    (In fact, I have had no IPS or ATP alerts since I started using XG, except for 2-3 false positives.)

    Regards

    Dom


  • Advanced Threat Protection is enforced by three systems: DNS (e.g. domain names), Web (e.g. URLs), and Snort (e.g. signatures).
    It focuses on one type of malware: the kind that has already infected a system and is now trying to contact a controlling server.
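
    To make the three enforcement points concrete, here is a minimal Python sketch. It is only a conceptual model, not how XG implements ATP: the `Event` type, the `atp_verdict` function, and the three threat lists are all hypothetical names and data made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threat lists, invented for this sketch. A real product would
# use vendor-maintained feeds rather than hard-coded values.
BAD_DOMAINS = {"c2.example.net"}
BAD_URLS = {"http://files.example.org/payload.bin"}
BAD_SIGNATURES = [b"\xde\xad\xbe\xef"]


@dataclass
class Event:
    """One observed piece of outbound traffic (hypothetical model)."""
    dns_query: Optional[str] = None
    url: Optional[str] = None
    payload: bytes = b""


def atp_verdict(event: Event) -> str:
    """Return which layer would block the event, or 'allow'."""
    # Layer 1: DNS lookups for known command-and-control domains.
    if event.dns_query and event.dns_query.lower().rstrip(".") in BAD_DOMAINS:
        return "block:dns"
    # Layer 2: web requests to known malicious URLs.
    if event.url and event.url in BAD_URLS:
        return "block:web"
    # Layer 3: byte signatures found in the traffic itself.
    if any(sig in event.payload for sig in BAD_SIGNATURES):
        return "block:signature"
    return "allow"
```

    The point of the layering is that an infected machine calling home can be caught at whichever stage it reaches first: the name lookup, the web request, or the raw traffic.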

    The most common time it comes into play is if people have a laptop without an AV scanner which gets infected while not behind the XG, and then is added to your network.

    For example, a corporate laptop that does not have AV on it. Someone brings it home and gets it infected, then brings it back to the office. Once in the office, it tries contacting its Command and Control server, which ATP blocks.

    Another example would be a coffee shop with a guest wifi network. Someone connects an infected laptop.

    Home networks where the number of new devices connecting is low, or where all computers have AV software installed are at lower risk and don't need ATP as much.

    Most of what ATP does (the FQDNs and URLs) is also blocked by web categories, mostly "Spyware and Malware" but also "Spam URLs" and "Phishing and Fraud". If you are using a Web Policy that blocks those categories, you are almost as protected as with ATP.

    One of the benefits of ATP is that it works even if the Web Policy is None or Allow All. However that is also one of the drawbacks of ATP - you cannot create a rule that turns it off, it is always enforced. That is one of the problems we are facing with this release.

    Because the new DPI mode does port-agnostic HTTP detection, ATP can now enforce in a new port-agnostic way. But that in turn means we enforce strict HTTP specification compliance on non-standard ports, which causes problems when apps use HTTP-like connections that do not conform to the spec. When a company writes both the client and the server, and runs them on non-standard ports (anything other than 80/443), it sometimes does things against the spec. By default, DPI mode then blocks the connection because it is unscannable.
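
    As a rough illustration of why an HTTP-like app can end up "unscannable", here is a minimal Python sketch. It is not the Snort or XG logic: `looks_like_http`, `classify`, the method list, and the strictness rules are all assumptions made for this example. The idea is that detection keys on the payload rather than the port, and anything that looks like HTTP but fails a strict parse is treated as unscannable.

```python
import re

# Method tokens used for the port-agnostic guess (an assumption of this sketch).
HTTP_METHODS = (b"GET", b"POST", b"PUT", b"DELETE", b"HEAD",
                b"OPTIONS", b"PATCH", b"CONNECT", b"TRACE")

# Strict request line: METHOD SP request-target SP HTTP/x.y
STRICT_REQUEST_LINE = re.compile(rb"^[A-Z]+ [^ \r\n]+ HTTP/\d\.\d$")


def looks_like_http(payload: bytes) -> bool:
    """Port-agnostic guess: does the payload start with an HTTP method token?"""
    return any(payload.startswith(method + b" ") for method in HTTP_METHODS)


def classify(payload: bytes) -> str:
    """Return 'not-http', 'scannable', or 'unscannable' for a first payload."""
    if not looks_like_http(payload):
        return "not-http"
    # The header block must be terminated by an empty line using CRLF.
    head, sep, _body = payload.partition(b"\r\n\r\n")
    if not sep:
        return "unscannable"
    lines = head.split(b"\r\n")
    if not STRICT_REQUEST_LINE.match(lines[0]):
        return "unscannable"
    # Every header line needs a 'name: value' shape with a clean name.
    for line in lines[1:]:
        name, colon, _value = line.partition(b":")
        if not colon or not name or name != name.strip():
            return "unscannable"
    return "scannable"


if __name__ == "__main__":
    print(classify(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"))
    print(classify(b"GET /status\nX-Custom\n\n"))
```

    In this sketch, a home-grown client that sends bare `\n` line endings or omits the HTTP version is recognised as HTTP on any port but fails the strict parse, which is the situation where the default action of blocking kicks in.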