Advanced threat protection breaking Home Assistant functionality with nothing in the logs

Question

I run a smart home platform called Home Assistant, which offers remote connectivity through a web service called Nabu Casa. Home Assistant runs on a Raspberry Pi 3. When remote control is enabled, this is what occurs during the authentication process:
 
 The SniTun server creates a SHA256 hash from a random 40-bit value. The hash is encrypted and sent to the client. The client decrypts the value, computes SHA256 over it again, and sends the result back to SniTun encrypted. If the response is valid, the connection switches into multiplexer mode.
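 The exchange described above can be sketched roughly as follows. This is only an illustration of the challenge/response idea, not SniTun's actual code: the function names are made up for this example, and `xor_stream` is a toy stand-in for whatever cipher SniTun really uses, chosen only so the sketch runs with the standard library.

```python
import hashlib
import hmac
import os


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric 'cipher' for illustration only (NOT secure, NOT SniTun's cipher).

    XORs the data with a keystream derived from SHA-256, so applying it
    twice with the same key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def server_make_challenge(key: bytes) -> tuple[bytes, bytes]:
    """Server side: hash a random 40-bit value and encrypt the hash for the client."""
    nonce = os.urandom(5)  # 5 bytes = 40 bits, as in the description above
    challenge = hashlib.sha256(nonce).digest()
    return challenge, xor_stream(key, challenge)


def client_answer(key: bytes, encrypted_challenge: bytes) -> bytes:
    """Client side: decrypt the challenge, hash it again, send it back encrypted."""
    challenge = xor_stream(key, encrypted_challenge)
    response = hashlib.sha256(challenge).digest()
    return xor_stream(key, response)


def server_verify(key: bytes, challenge: bytes, encrypted_response: bytes) -> bool:
    """Server side: accept only if the response is SHA-256 of the original challenge."""
    response = xor_stream(key, encrypted_response)
    expected = hashlib.sha256(challenge).digest()
    return hmac.compare_digest(response, expected)
```

 In a handshake like this, any device in the path that alters or drops part of the exchange would make the final comparison fail, which would be consistent with a challenge/response error on the client.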
 
 With Advanced Threat Protection (ATP) enabled, the remote control functionality is broken and the Home Assistant logs show a "challenge/response error with SniTun server". When I disable ATP, everything works fine. Unfortunately, with ATP enabled, nothing in the logs shows that ATP is actually blocking anything, but I have tested this multiple times and am positive that ATP is what stops remote control from working. 
 When remote control is enabled, I can see in the Home Assistant logs exactly what it's doing and the first step is: 
 
 2019-12-09 18:44:36 DEBUG (MainThread) [hass_nabucasa.cloud_api] Fetched remote-sni-api.nabucasa.com/snitun_token (200) 
 
 I've tried adding both nabucasa.com and remote-sni-api.nabucasa.com to the Network/Host Exceptions in ATP, but it still doesn't work. It only works when ATP is completely disabled. 
 Any suggestions? It's a bit frustrating the Sophos XG logs don't show anything. 
 Edit: I also tried changing ATP to "Log" only, and it still doesn't work. ATP must be completely disabled. 
 Edit: It's not just ATP; using any Web Policy appears to cause the same issue. However, if I select "Use web proxy instead of DPI engine", it works fine. There is nothing in the logs for this either. So it appears the problem is caused by the DPI engine, which I'm assuming ATP uses?

NayanManvar · Accepted Answer

Hi shred, 
 Thank you for sharing device access. We are tracking this issue with ID NC-54818.

Paul C2 · Answer

Hi Shred, 
 
 I took a look at the packet capture and system logs collected by Nayan. Specifically looking at the connection with src ip = 172.16.16.15 and dst ip = 3.94.237.33 that Nayan said was problematic, I can see that an unrecognized protocol is being sent over port 443. 
 
 Unrecognized/invalid traffic will be blocked by default when the web DPI engine is enabled. 
 As you have noticed, the web DPI engine is enabled if the web proxy is disabled AND any of the following are enabled: 
 
 AV scanning 
 ATP 
 Web policy is configured (i.e. web policy is not "None" in the firewall rule) 
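 
 Put as a small truth function, the rule above looks roughly like this. It is only a restatement of the conditions listed in this post, not firewall code; the function and parameter names are invented for the example, and a web policy of `None` stands for "None" in the firewall rule.

```python
from typing import Optional


def web_dpi_engine_active(
    web_proxy_enabled: bool,
    av_scanning: bool,
    atp_enabled: bool,
    web_policy: Optional[str],
) -> bool:
    """Return True when, per the rule described above, the web DPI engine handles the traffic."""
    if web_proxy_enabled:
        # "Use web proxy instead of DPI engine" is selected.
        return False
    # With the proxy off, any one of these features switches the DPI engine on.
    return bool(av_scanning or atp_enabled or web_policy is not None)
```

 For example, `web_dpi_engine_active(False, False, True, None)` is true (ATP alone is enough), while setting `web_proxy_enabled=True` makes it false regardless of the other settings.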
 
 I don't have device access to your box, so I can't see the firewall rule you have configured, but perhaps you can confirm that this is consistent with the behavior you have seen. 
 
 Do you know what domain resolves to 3.94.237.33? It seems to be an Amazon AWS server. 
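 
 One way to check is to resolve the hostnames you already know about and see whether any of them currently map to that address. The sketch below uses only the Python standard library; the helper names are made up for this example. Cloud-hosted services often rotate addresses, so a miss here would not prove that a hostname is unrelated.

```python
import socket
from typing import Callable, Iterable, List, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the set of IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def hosts_resolving_to(
    target_ip: str,
    candidates: Iterable[str],
    resolve: Callable[[str], Set[str]] = resolve_ipv4,
) -> List[str]:
    """Return the candidate hostnames whose current DNS answers include target_ip."""
    matches = []
    for host in candidates:
        try:
            addresses = resolve(host)
        except OSError:
            # Lookup failed (socket.gaierror is an OSError); skip this candidate.
            continue
        if target_ip in addresses:
            matches.append(host)
    return matches
```

 Calling `hosts_resolving_to("3.94.237.33", ["nabucasa.com", "remote-sni-api.nabucasa.com"])` from a machine on the same network would show whether either hostname from this thread points at that server at the moment.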
 
 Here are three ways to work around the issue: 
 1. Create a firewall rule with an FQDN host (if you know the domain) or an IP host in the destination network field, and ensure that web policy is "None", AV scanning is unchecked, web proxy is disabled, and ATP is disabled globally. Only then will the problematic parser be bypassed. 
 2. Create a firewall rule with an FQDN host (if you know the domain) or an IP host in the destination network field, choose the "Allow All" web policy, and check "Use web proxy instead of DPI engine". 
 3. Enable relay_invalid_http_traffic via cish by following the instructions described here: https://community.sophos.com/kb/en-us/123730 . This is the least recommended approach. 
 
 Please let me know once you have had a chance to try out the workarounds. 
 
 Thanks, 
 Paul