Sophos Firewall v18, v19, v20: Troubleshooting problems with the DPI engine

Disclaimer: This information is provided as-is for the benefit of the Community. Please contact Sophos Professional Services if you require assistance with your specific environment.


Overview

Here are some guidelines for getting things working with DPI mode. This was written just after v18.0 GA and focuses on issues that people experienced when switching to DPI mode. Many issues were fixed in the 18.0 and 18.5 Maintenance Releases, and from 19.0 onward, few reported issues have been caused by errors on the Sophos Firewall itself. However, this article still contains good information on best practices for working around problems with websites or applications.

You may also want to read these:
Sophos Firewall v18: XStream - the new DPI Engine for web proxy explained
HTTPS decrypt and scan FAQ
 

General settings:

IPS - Everything should work with IPS (Intrusion Prevention System) on or off; however, IPS also uses Snort for detection. If something isn’t working, you may want to try turning IPS off to see if it makes a difference. Please let us know if something only works when IPS is off.

ATP - Everything should work with ATP (Advanced Threat Protection) globally on or off. If something isn’t working, you may want to try turning ATP off to see if it makes a difference. ATP detects active malware in your network that is making connections to external systems. Because the new DPI mode does port-agnostic HTTP detection, we now enforce ATP in a port-agnostic way. That, in turn, means we enforce strict HTTP specification compliance on non-standard ports, which causes problems when apps use HTTP-like connections that don’t conform to the spec. When an organization writes both the client and the server and uses non-standard ports (anything other than 80/443), they sometimes do things against the spec. If there are problems that go away when ATP is disabled, you should separate the problematic traffic onto its own firewall rule. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

Web > General settings > Block unrecognized SSL protocols. This should be at its default value of unchecked.

Console command line (not ssh shell). 'show http_proxy' should have relay_invalid_http_traffic at its default value of off.


IoT devices:

One of the v18 improvements is our ability to scan traffic from IoT (Internet of Things) devices, especially those using non-standard ports. However, IoT devices may also be more fragile when being scanned.  IoT devices often connect to specific servers and don’t always conform to the specifications or work like web browsers.

If you have an IoT device that does not work, I recommend first getting it working without any filtering, scanning, or decryption. Once it works, administrators can make changes that improve security around these devices.

Firewall Rule - The IoT device should hit a rule with no web policy and no malware scanning. When you open the rule, the Web Filtering section should be unexpanded. The Other security features should be set to None.  Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

SSL/TLS inspection rules - The IoT device traffic should hit a Don't decrypt rule that uses the Maximum compatibility profile, or it should match no rule at all. If you have TLS decryption rules for other traffic, you can create a rule above them that doesn't decrypt and uses your device as the source, similar to your firewall rule.

If you have an IoT device not working with this configuration, please raise a support Case or forum post. Include the device name, whether it worked in 17.5, your configuration, and error messages on the device and in the Log Viewer, Web filter, and SSL/TLS Inspection.


Phone Apps not working:

If an application isn’t working on a phone, you likely don’t want to turn off all scanning for the entire phone.
HTTPS decryption can cause problems on phones. You can install the CA (Certificate Authority), but depending on the phone/OS/application, it may or may not respect the CA.
As Sophos hears from customers about applications that don’t work, we’ll make improvements with every release.

If an application uses port 80/443, the "Use proxy instead of DPI engine" setting controls whether the traditional web proxy from 17.5 or the DPI engine is used. Problems that occur in both modes should not be reported as a DPI problem.

The DPI engine handles SSL connections on other ports, even if they’re not HTTPS (we don't know if they’re HTTPS unless we decrypt). Applications that make SSL connections on other ports to specific servers may not adhere to the specifications that the Sophos Firewall enforces. Even when the traffic isn’t being decrypted, the DPI engine must look at it, which can cause issues.

Ensure the traffic isn’t being decrypted with TLS Inspection rules or a Web Exception.

If it’s still not working, separate the traffic onto its own firewall rule (using destination FQDN) with Web Policy None and no malware scanning. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

If you have an application that works in proxy mode but not in DPI mode, and you’re sure it isn’t being decrypted, please raise a support Case or forum post. Include the phone OS, application name, whether it worked in 17.5, your configuration, and error messages in the app and in the Log Viewer (Web filter and SSL/TLS Inspection modules).


Websites not working:

If a website isn’t working and you’re doing HTTPS decryption, the first thing to try is turning off decryption for that site.

Websites typically use port 80/443, and the "Use proxy instead of DPI engine" setting controls whether the traditional web proxy from 17.5 or the DPI engine is used. Problems that occur in both modes should not be reported as a DPI problem.

Ensure the traffic isn’t being decrypted with TLS Inspection rules or a Web Exception.
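One rough way to check from a client whether a connection is being decrypted is to look at who issued the certificate the client actually receives: when a firewall decrypts TLS, the certificate is re-signed by the firewall's CA rather than by a public CA. The sketch below is a minimal illustration using Python's standard library, not a Sophos tool. The function names and the CA name in the usage note are hypothetical. It also assumes the client trusts the firewall CA; if it does not, the handshake fails with a certificate verification error, which is itself a hint that something is re-signing the traffic.

```python
import socket
import ssl


def issuer_common_name(cert: dict) -> str:
    """Return the issuer commonName from the dict that SSLSocket.getpeercert() returns."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return ""


def fetch_issuer(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and report who issued the certificate we were shown."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_common_name(tls.getpeercert())
```

For example, if `fetch_issuer("www.example.com")` returns the name of your firewall's signing CA (something like "Example Firewall CA", a made-up name here) instead of a public CA, the connection is still being decrypted.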

If it is still not working, separate the traffic onto its own firewall rule (using destination FQDN) with Web Policy None and no malware scanning. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

If you have a website that works in proxy mode but not in DPI mode, and you are sure it isn’t being decrypted, please raise a support Case or forum post. Include the OS, browser, whether it worked in 17.5, your configuration, and error messages in the browser and in the Log Viewer (Web filter and SSL/TLS Inspection modules).


Decryption - Web Exceptions and TLS exclusions:

WebAdmin > Log Viewer. Look at the Web filter and SSL/TLS Inspection modules. Note any errors that appear. Some entries have an Exclude link allowing you to add domains to the Local TLS exclusion list.

WebAdmin > Control Center. Click on the SSL/TLS connections widget. You should see many statistics, including a list of errors in the last seven days. Click Fix Errors.
In the pop-up, you’ll see a list of websites, users, and errors. Select Exclude from decryption to add domains to the Local TLS exclusion list.

WebAdmin > Policy Tester. Put in the details and run the test.  From the results, which include whether the traffic should be decrypted, you can link to the rules that caused that decision and modify them.

You can also exclude by manually editing the Local TLS exclusion list (supports plaintext domain names) or Web Exceptions (supports RegEx and applies to proxy mode).
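If you write a RegEx for a Web Exception by hand, remember that an unescaped dot matches any character and an unanchored pattern can match anywhere in the string. The sketch below uses Python's `re` module as a stand-in to show the difference. It assumes the pattern is tested against a URL without the scheme, and the firewall's own regex dialect and matching input may differ, so treat it as an illustration of the pitfall rather than a description of how the firewall evaluates exceptions. The `example.com` names are placeholders.

```python
import re

# Loose: dots are unescaped and nothing is anchored.
LOOSE = re.compile(r"example.com")

# Stricter: anchored at the start, dots escaped, optional subdomains,
# and the host must end right before the first "/".
STRICT = re.compile(r"^([A-Za-z0-9.-]*\.)?example\.com\.?/")


def matches(pattern: re.Pattern, url_without_scheme: str) -> bool:
    """True if the pattern is found in the given 'host/path' string."""
    return pattern.search(url_without_scheme) is not None


for candidate in (
    "example.com/",
    "www.example.com/login",
    "examplexcom.net/",
    "example.com.evil.test/",
):
    print(candidate, matches(LOOSE, candidate), matches(STRICT, candidate))
```

The loose pattern matches all four candidates, including the two unrelated hosts, while the stricter one matches only the first two.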

If the traffic is port 80/443 and matches a firewall rule with "Use proxy instead of DPI engine," then the traditional web proxy is used, and the SSL/TLS Inspection rules are not applied. If "Use proxy instead of DPI engine"  isn’t checked, or if the traffic is on any other port, the DPI engine and the SSL/TLS Inspection rules are used.
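The rule in the previous paragraph can be written down as a tiny decision function. This is only a sketch of the logic as described in this post; `FirewallRule`, `select_engine`, and `tls_inspection_rules_apply` are hypothetical names for illustration, not anything in the product or its API.

```python
from dataclasses import dataclass

# Ports on which the traditional web proxy can take over, per the text above.
WEB_PROXY_PORTS = {80, 443}


@dataclass
class FirewallRule:
    name: str
    use_proxy_instead_of_dpi: bool  # the "Use proxy instead of DPI engine" checkbox


def select_engine(dest_port: int, rule: FirewallRule) -> str:
    """Return which engine handles the connection, following the description above."""
    if dest_port in WEB_PROXY_PORTS and rule.use_proxy_instead_of_dpi:
        return "web proxy"
    return "DPI engine"


def tls_inspection_rules_apply(dest_port: int, rule: FirewallRule) -> bool:
    """SSL/TLS Inspection rules are only used when the DPI engine handles the traffic."""
    return select_engine(dest_port, rule) == "DPI engine"
```

Note that checking the box does nothing for traffic on other ports: a rule with the box checked still sends port 8443 traffic to the DPI engine.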

Consider a scenario where DPI sees a new TCP connection, on any port, that starts with a TLS ClientHello. At this point, we have no idea if it’s HTTPS; we only know it is TLS.

The ClientHello in the TLS connection has an optional plaintext field called SNI (Server Name Indication), which should be filled in with the FQDN the client is connecting to. I am guessing, but SNI is present in 99.9% of HTTPS traffic and >90% of non-HTTPS TLS traffic. That FQDN is sent to the categorizer. DPI then knows the source IP, destination IP, FQDN, user, and category. It uses those things to determine which TLS Inspection Rules are matched, including the Local TLS exclusion list. Then, it uses those things to determine which exceptions apply, including any HTTPS decryption exceptions. So when using the DPI engine, both TLS exclusion rules and web exceptions apply, and they apply to all TLS connections (not just HTTPS).
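To make the "plaintext field" point concrete, here is a minimal sketch that builds a real ClientHello in memory with Python's `ssl` module and then reads the SNI back out of the raw bytes, without any keys or decryption. It is an illustration of the TLS wire format only, not the DPI engine's code. The helper names are hypothetical, and the parser assumes the whole ClientHello fits in a single TLS record.

```python
import ssl
from typing import Optional


def build_client_hello(server_name: str) -> bytes:
    """Start a TLS handshake against in-memory buffers and return the ClientHello bytes."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: there is no server, so the handshake stops after ClientHello
    return outgoing.read()


def extract_sni(data: bytes) -> Optional[str]:
    """Return the SNI host name from a raw ClientHello record, or None if absent."""
    try:
        if data[0] != 0x16 or data[5] != 0x01:  # handshake record, ClientHello message
            return None
        pos = 5 + 4                 # record header + handshake type and 3-byte length
        pos += 2 + 32               # client version + random
        pos += 1 + data[pos]        # session id
        pos += 2 + int.from_bytes(data[pos:pos + 2], "big")  # cipher suites
        pos += 1 + data[pos]        # compression methods
        end = min(len(data), pos + 2 + int.from_bytes(data[pos:pos + 2], "big"))
        pos += 2
        while pos + 4 <= end:
            ext_type = int.from_bytes(data[pos:pos + 2], "big")
            ext_len = int.from_bytes(data[pos + 2:pos + 4], "big")
            body = data[pos + 4:pos + 4 + ext_len]
            if ext_type == 0:       # server_name extension
                # body: list length (2), name type (1, 0 = host_name), name length (2), name
                if len(body) >= 5 and body[2] == 0:
                    name_len = int.from_bytes(body[3:5], "big")
                    return body[5:5 + name_len].decode("ascii", "replace")
                return None
            pos += 4 + ext_len
    except IndexError:
        pass                        # truncated or malformed input
    return None


print(extract_sni(build_client_hello("iot.example.com")))
```

Anything on the path can read this name the same way, which is why the firewall can categorize and match rules on the FQDN before deciding whether to decrypt.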




Edit Word "Ticket" to "Case"
[edited by: GlennSen at 5:20 AM (GMT -8) on 24 Jan 2024]
  • Thank you for posting this nice summary.  I noticed that my 'show http_proxy' had the 'relay_invalid_http_traffic' set to 'on'.  I've never heard of this setting and I certainly never set it to 'on' if the default setting is 'off'.  I set it to 'off' to see if it makes any noticeable differences.  Is there a list somewhere of what the default settings should be?

  • There is no documented list of default settings, and the defaults sometimes change.  If you really want, install a new box and you can compare the defaults there.

    To my recollection the default for relay_invalid_http_traffic was always off.  There is no way to turn it on in WebAdmin, only using console.  Of course it is also in Backup/Restore and in the XML API.  This controls the behavior that occurs when we detect/expect HTTP but it does not conform to the specification (which means we cannot scan it).

    Thanks for letting me know...  I will update the text in the original post.

  • I am troubleshooting DPI engine problems, and I can confirm that my relay_invalid_http_traffic is set to on. I have never changed it, nor had I heard of it before this article.  I'll turn it off to see what happens.