Sophos Firewall v18, v19, v20: Troubleshooting problems with the DPI engine

Disclaimer: This information is provided as-is for the benefit of the Community. Please contact Sophos Professional Services if you require assistance with your specific environment.


Overview

This Recommended Read focuses on issues experienced after switching to DPI mode.

You may also want to read these:
Sophos Firewall: XStream - the new DPI Engine for web proxy explained
HTTPS decrypt and scan FAQ
 

General settings:

IPS - Everything should work with IPS (Intrusion Prevention System) on or off; however, IPS also uses Snort for detection. If something isn’t working, try turning IPS off to see if it makes a difference. Please let us know if something only works when IPS is off.

ATP - Everything should work with ATP (Advanced Threat Protection) globally on or off. If something isn’t working, try turning ATP off to see if it makes a difference. ATP detects active malware in your network that is making connections to external systems. Because the new DPI mode does port-agnostic HTTP detection, we now enforce ATP in a port-agnostic way. That, in turn, causes us to enforce strict HTTP specification compliance on non-standard ports, which causes problems when apps use HTTP-like connections that don’t conform to the spec. When an organization writes both the client and the server and runs them on non-standard ports (anything other than 80/443), it sometimes does things that violate the spec. If there are problems that go away when ATP is disabled, separate the problematic traffic onto its own firewall rule. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

Web > General settings > Block unrecognized SSL protocols. This should be left at its default value of unchecked.

Console command line (not ssh shell): 'show http_proxy' should show relay_invalid_http_traffic at its default value of off.


IoT devices:

One of the v18 improvements is our ability to scan traffic from IoT (Internet of Things) devices, especially those using non-standard ports. However, IoT devices may also be more fragile when being scanned. IoT devices often connect to specific servers and don’t always conform to the specifications or work like web browsers.

If you have an IoT device that does not work, I recommend first getting it to work without filtering/scanning/decryption. Once it works, administrators can make changes that improve security around these devices.

Firewall rule - The IoT device should hit a rule with no web policy and no malware scanning. When you open the rule, the Web filtering section should be unexpanded, and Other security features should be set to None. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

SSL/TLS inspection rules - The IoT device traffic should hit a Don't decrypt rule that uses the Maximum compatibility profile, or have no matching rule. If you already have TLS decryption rules for other traffic, you can create a rule above them that doesn't decrypt and uses your device as the source, similar to your firewall rule.
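
The advice to place a non-decrypting rule above the existing decryption rules relies on rules being evaluated top-down, with the first match deciding the action. The following is a minimal Python sketch of that ordering, not the firewall's actual implementation; the `TlsRule` fields, rule names, and addresses are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class TlsRule:
    name: str
    source: str  # CIDR the rule applies to (hypothetical field)
    action: str  # "decrypt" or "dont_decrypt"


def first_match(rules: List[TlsRule], src_ip: str) -> Optional[TlsRule]:
    """Return the first rule, top-down, whose source network contains src_ip."""
    addr = ip_address(src_ip)
    for rule in rules:
        if addr in ip_network(rule.source):
            return rule
    return None  # no matching rule, so nothing asks for decryption


# The narrow IoT rule sits above the broad decryption rule.
rules = [
    TlsRule("IoT devices - no decrypt", "192.168.50.0/24", "dont_decrypt"),
    TlsRule("LAN - decrypt", "192.168.0.0/16", "decrypt"),
]

iot_rule = first_match(rules, "192.168.50.10")
lan_rule = first_match(rules, "192.168.1.5")
```

If the two rules were swapped, the broad LAN rule would match the IoT device first and the exclusion would never be reached, which is why the position of the new rule matters.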

If you have an IoT device that does not work with this configuration, please create a support case or forum post. Include the device name, whether it worked in 17.5, your configuration, and any error messages on the device and in the Log Viewer (web filter and SSL/TLS Inspection modules).


Phone Apps not working:

If an application isn’t working on a phone, you likely don’t want to turn off all scanning for all phones.
HTTPS decryption can cause problems on phones. You can install the CA (Certificate Authority), but depending on the phone/OS/application, it may or may not respect the CA.
As Sophos hears from customers about applications that don’t work, we’ll make improvements with every release.

If an application uses port 80/443, the "Use proxy instead of DPI engine" setting determines whether the traditional web proxy from 17.5 or the DPI engine is used. Problems that occur in both modes should not be reported as DPI problems.

The DPI engine handles SSL connections on other ports, even if they’re not HTTPS (we don't know if they’re HTTPS unless we decrypt). Applications that make SSL connections on other ports to specific servers may not adhere to the specifications that the Sophos Firewall enforces. Even when the traffic isn’t being decrypted, the DPI engine must look at it, which can cause issues.

Use TLS Inspection rules or a web Exception to ensure the traffic isn’t being decrypted.

If it’s still not working, separate the traffic onto its own firewall rule (using destination FQDN) with Web Policy None and no malware scanning. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

If you have an application that works in proxy mode but not in DPI mode, and you are sure it isn’t being decrypted, please raise a support case or create a forum post. Include the phone OS, application name, whether it worked in 17.5, your configuration, and any error messages in the app and in the Log Viewer (web filter and SSL/TLS Inspection modules).


Websites not working:

If a website isn’t working and you’re doing HTTPS decryption, the first thing to try is turning off decryption for that site.

Websites typically use port 80/443, and the "Use proxy instead of DPI engine" setting controls whether the traditional web proxy from 17.5 or the DPI engine is used. Problems that occur in both modes should not be reported as DPI problems.

Use TLS Inspection rules or a web Exception to ensure the traffic isn’t being decrypted.

If it still isn’t working, separate the traffic onto its own firewall rule (using destination FQDN) with Web Policy None and no malware scanning. Then, follow the instructions in Sophos Firewall: Bypass a specific firewall rule for application classification and ATP to create an ATP exception for that firewall rule.

If you have a website that works in proxy mode but not in DPI mode, and you are sure it isn’t being decrypted, please raise a support case or create a forum post. Include the OS, browser, whether it worked in 17.5, your configuration, and any error messages in the browser and in the Log Viewer (web filter and SSL/TLS Inspection modules).


Decryption - web Exceptions and TLS exclusions:

WebAdmin > Log Viewer. Look at the web filter and SSL/TLS Inspection modules. Note any errors that appear. Some entries have an Exclude link allowing you to add domains to the Local TLS exclusion list.

WebAdmin > Control Center. Click the SSL/TLS connections widget. You should see many statistics, including a list of errors from the last seven days. Click Fix Errors.
In the pop-up, you’ll see a list of websites, users, and errors. Select Exclude from decryption to add domains to the Local TLS exclusion list.

WebAdmin > Policy Tester. Enter the details and run the test. The results include whether the traffic would be decrypted, and from them you can link to the rules that caused that decision and change them.

You can also exclude by manually editing the Local TLS exclusion list (which supports plaintext domain names) or web Exceptions (which supports RegEx and applies to proxy mode).
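
The practical difference between a plaintext domain entry and a RegEx entry can be sketched in Python as below. This is only an illustration: the function names are hypothetical, and the assumption that a plaintext entry also covers subdomains is mine and may not match the firewall's exact matching behavior.

```python
import re


def matches_plain_domain(host: str, entry: str) -> bool:
    """Plaintext entry: match the domain itself or any subdomain (assumed semantics)."""
    host = host.lower().rstrip(".")
    entry = entry.lower().rstrip(".")
    return host == entry or host.endswith("." + entry)


def matches_regex(host: str, pattern: str) -> bool:
    """RegEx entry: the pattern is searched anywhere in the host name."""
    return re.search(pattern, host, re.IGNORECASE) is not None


# An unanchored pattern matches more than intended.
loose = r"example\.com"
# Anchoring both ends restricts the match to the domain and its subdomains.
strict = r"^([A-Za-z0-9.-]*\.)?example\.com$"

plain_sub = matches_plain_domain("api.example.com", "example.com")    # True
plain_other = matches_plain_domain("notexample.com", "example.com")   # False
loose_other = matches_regex("notexample.com", loose)                  # True
strict_other = matches_regex("notexample.com", strict)                # False
```

The point of the sketch is that a RegEx exception needs anchors and escaped dots to be as precise as a plaintext domain entry.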

If the traffic is port 80/443 and matches a firewall rule with "Use proxy instead of DPI engine," then the traditional web proxy is used, and the SSL/TLS Inspection rules aren’t applied. If "Use proxy instead of DPI engine"  isn’t checked, or if the traffic is on any other port, the DPI engine and the SSL/TLS Inspection rules are used.
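
The selection logic described above can be summarized in a few lines of Python. This is a sketch of the decision as stated in this post, with hypothetical names (`Handling`, `select_engine`); it is not code from the firewall.

```python
from typing import NamedTuple


class Handling(NamedTuple):
    engine: str                     # "web proxy" or "DPI engine"
    tls_inspection_rules_apply: bool


def select_engine(dst_port: int, use_proxy_checked: bool) -> Handling:
    """Decide which engine handles a connection, per the description above."""
    if dst_port in (80, 443) and use_proxy_checked:
        # Traditional web proxy: SSL/TLS Inspection rules are not applied.
        return Handling("web proxy", False)
    # Setting unchecked, or any other port: DPI engine with SSL/TLS Inspection rules.
    return Handling("DPI engine", True)


web_with_proxy = select_engine(443, use_proxy_checked=True)
web_without_proxy = select_engine(443, use_proxy_checked=False)
other_port = select_engine(8443, use_proxy_checked=True)
```

Note that the "Use proxy instead of DPI engine" checkbox has no effect on ports other than 80/443 in this model, which is why `other_port` ends up on the DPI engine even with the setting checked.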

Consider a scenario where DPI sees a new TCP connection on a port with a TLS ClientHello. At this point, we have no idea if it’s HTTPS. We only know it’s TLS.

The ClientHello in the TLS connection has an optional plaintext field called SNI (Server Name Indication), which should contain the FQDN the client is connecting to. This is only my guess, but SNI is present in 99.9% of HTTPS traffic and >90% of non-HTTPS TLS traffic. That FQDN is sent to the categorizer. DPI then knows the source IP, destination IP, FQDN, user, and category. It uses those things to determine which TLS Inspection rules are matched, including the Local TLS exclusion list. Then, it uses those things to determine which exceptions apply, including any HTTPS decryption exceptions. So when using the DPI engine, both TLS exclusion rules and web exceptions apply, and they apply to all TLS connections (not just HTTPS).
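
To show why the SNI is available without decryption, here is a minimal Python sketch that reads it from the first TLS record of a connection. It follows the ClientHello layout from the TLS specifications, assumes the whole ClientHello fits in one record, and is not the DPI engine's code. `extract_sni` and `build_minimal_client_hello` are hypothetical names; the second function only produces a synthetic ClientHello for trying the parser.

```python
import struct
from typing import Optional


def extract_sni(data: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello record, or None."""
    try:
        if data[0] != 0x16:          # record type 0x16 = handshake
            return None
        pos = 5                      # skip record header: type(1) version(2) length(2)
        if data[pos] != 0x01:        # handshake type 0x01 = ClientHello
            return None
        pos += 4                     # skip handshake header: type(1) length(3)
        pos += 2 + 32                # client_version + random
        pos += 1 + data[pos]         # session_id
        (suites_len,) = struct.unpack_from("!H", data, pos)
        pos += 2 + suites_len        # cipher_suites
        pos += 1 + data[pos]         # compression_methods
        if pos >= len(data):
            return None              # no extensions, so no SNI
        (ext_total,) = struct.unpack_from("!H", data, pos)
        pos += 2
        end = min(pos + ext_total, len(data))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:        # extension 0 = server_name
                # Layout: list_len(2) name_type(1) name_len(2) name
                if data[pos + 2] != 0:   # name_type 0 = host_name
                    return None
                (name_len,) = struct.unpack_from("!H", data, pos + 3)
                return data[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                  # truncated or malformed input


def build_minimal_client_hello(host: str) -> bytes:
    """Build a synthetic ClientHello carrying only an SNI extension (test helper)."""
    name = host.encode("ascii")
    sni = struct.pack("!HBH", len(name) + 3, 0, len(name)) + name
    exts = struct.pack("!HH", 0, len(sni)) + sni
    body = (
        b"\x03\x03" + bytes(32)      # client_version + random
        + b"\x00"                    # empty session_id
        + b"\x00\x02\x13\x01"        # one cipher suite
        + b"\x01\x00"                # null compression
        + struct.pack("!H", len(exts)) + exts
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


sni = extract_sni(build_minimal_client_hello("iot.example.com"))
```

For traffic that is not TLS at all, or for a ClientHello without the extension, the function returns `None`; in that case there is no FQDN from the handshake to match rules against.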




Corrected Grammar added TAG
[edited by: Erick Jan at 2:51 AM (GMT -7) on 10 Oct 2024]
  • Thank you for posting this nice summary.  I noticed that my 'show http_proxy' had the 'relay_invalid_http_traffic' set to 'on'.  I've never heard of this setting and I certainly never set it to 'on' if the default setting is 'off'.  I set it to 'off' to see if it makes any noticeable differences.  Is there a list somewhere of what the default settings should be?

  • There is no documented list of default settings, and the defaults sometimes change.  If you really want, install a new box and you can compare the defaults there.

    To my recollection the default for relay_invalid_http_traffic was always off.  There is no way to turn it on in WebAdmin, only using console.  Of course it is also in Backup/Restore and in the XML API.  This controls the behavior that occurs when we detect/expect HTTP but it does not conform to the specification (which means we cannot scan it).

    Thanks for letting me know...  I will update the text in the original post.

  • I am troubleshooting DPI engine problems and I can confirm that my relay_invalid_http_traffic is set to on, and I had never changed it or heard of it before this article.  I'll turn it off to see what happens.