Advanced threat protection breaking Home Assistant functionality with nothing in the logs

I run a "smart home" platform called Home Assistant which has a feature that allows for remote connectivity through their web service called Nabu Casa. Home Assistant runs on a Raspberry Pi 3. When remote control is enabled, this is what occurs during their authentication process:

The SniTun server creates a SHA256 hash from a random 40-bit value, encrypts it, and sends it to the client. The client decrypts the value, computes a SHA256 of it again, and sends the result back to SniTun encrypted. If the response is valid, the server switches into multiplexer mode.
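
To make the quoted handshake easier to follow, here is a rough Python sketch of the same steps. It is only an illustration and not SniTun's code: all function names are hypothetical, and `xor_cipher` is a toy, insecure stand-in for whatever cipher and key exchange SniTun actually uses.

```python
import hashlib
import secrets


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric 'cipher' (assumption, NOT secure); the same call encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def server_make_challenge(key: bytes) -> tuple[bytes, bytes]:
    """Server side: hash a random 40-bit value and encrypt the hash for the client."""
    nonce = secrets.token_bytes(5)  # 5 bytes = 40 bits, as in the quoted description
    digest = hashlib.sha256(nonce).digest()
    return digest, xor_cipher(digest, key)


def client_answer(encrypted_challenge: bytes, key: bytes) -> bytes:
    """Client side: decrypt the challenge, hash it again, send it back encrypted."""
    digest = xor_cipher(encrypted_challenge, key)
    response = hashlib.sha256(digest).digest()
    return xor_cipher(response, key)


def server_verify(digest: bytes, encrypted_response: bytes, key: bytes) -> bool:
    """Server side: accept only if the decrypted response equals SHA256(challenge)."""
    expected = hashlib.sha256(digest).digest()
    return secrets.compare_digest(xor_cipher(encrypted_response, key), expected)
```

If anything on the path alters or drops these messages, the verification step fails, which would be consistent with the "challenge/response error" described below.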

With Advanced Threat Protection (ATP) enabled, the remote control functionality is broken and the logs in Home Assistant show a "challenge/response error with SniTun server". When I disable ATP, everything works fine. Unfortunately, with ATP enabled, nothing in the logs shows that ATP is actually blocking anything, but I have tested this multiple times and I'm positive ATP is what stops remote control from working.

When remote control is enabled, I can see in the Home Assistant logs exactly what it's doing and the first step is:


2019-12-09 18:44:36 DEBUG (MainThread) [hass_nabucasa.cloud_api] Fetched remote-sni-api.nabucasa.com/snitun_token (200)

I've tried adding both nabucasa.com and remote-sni-api.nabucasa.com to the Network/Host Exceptions in ATP, but it still doesn't work. It only works when ATP is completely disabled.

Any suggestions? It's a bit frustrating the Sophos XG logs don't show anything.

Edit: I also tried changing ATP to "Log" only, and it doesn't work. ATP must be completely disabled.

Edit: It's not just ATP; it appears that using any Web Policy causes the same issue. However, if I select "Use web proxy instead of DPI engine", it works fine. Nothing in the logs either. So it appears it's being caused by the DPI engine, which I'm assuming ATP is using?

  • Hi Shred,

    I think you will find that ATP uses Snort.

    Ian

  • In reply to rfcat_vk:

    It's really strange that it works fine with an app control and IPS policy, but not with ATP or a web policy. I believe all four of those functions use Snort.

  • In reply to shred:

    As far as I know, there is currently a bug in ATP: if you enable it, it will break stuff.

    Maybe this breaks your connection in DPI too? Do you have ATP enabled?

  • In reply to LuCar Toni:

    Thanks for the suggestion, but I do not think so.

    I had 2 devs connected to my XG and they found the issue. I have uploaded all the possible logs and they are still investigating.

    I can try your suggestion, but the developers should already know if that was the issue, shouldn't they?

    Thanks

  • In reply to lferrara:

    I would give it a try. As far as I know, the issue shows up as "blank pages" for no reason in ATP.

  • In reply to LuCar Toni:

    Tried yesterday, and no change in my case.

    Regards

  • Hi  

    Thank you for the feedback. Let me send you a PM for more details and further investigation.

     

    Thanks,

    Nayan Manvar

  • In reply to shred:

    Hi shred,

    Thank you for sharing device access. We are tracking this issue under ID NC-54818.

  • Hi Shred, 

     

    I took a look at the packet capture and system logs collected by Nayan. Specifically looking at the connection with src ip = 172.16.16.15 and dst ip = 3.94.237.33 that Nayan said was problematic, I can see that an unrecognized protocol is being sent over port 443.

     

    Unrecognized/invalid traffic will be blocked by default when the web DPI engine is enabled.

    As you have noticed, the web DPI engine is enabled if the web proxy is disabled AND any/all of the following are enabled:

    • AV scanning
    • ATP
    • A web policy is configured (i.e. web policy is not "None" in the firewall rule)
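
    The condition above can be written as a small predicate. This is only a Python sketch of the logic as described in this post: `RuleSettings` and `dpi_engine_active` are hypothetical names for illustration, not part of any Sophos API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleSettings:
    """Hypothetical model of the settings named above (not a Sophos API)."""
    use_web_proxy: bool        # "Use web proxy instead of DPI engine" is checked
    av_scanning: bool          # AV scanning checked on the firewall rule
    atp_enabled: bool          # ATP enabled (a global setting)
    web_policy: Optional[str]  # None models web policy "None"


def dpi_engine_active(rule: RuleSettings) -> bool:
    """Web DPI engine runs when the proxy is off AND any inspection feature is on."""
    needs_inspection = (
        rule.av_scanning or rule.atp_enabled or rule.web_policy is not None
    )
    return (not rule.use_web_proxy) and needs_inspection
```

    For example, `dpi_engine_active(RuleSettings(False, False, True, None))` is true (ATP alone is enough), while checking "Use web proxy instead of DPI engine" makes it false regardless of the other settings.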

    I don't have device access to your box to see the firewall rule configured on it, but perhaps you can confirm that this is consistent with the behavior you have seen.

     

    Do you know what domain resolves to 3.94.237.33? It seems to be an Amazon AWS server.
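
    One way to check is a forward lookup of a candidate hostname (for example, the one from the Home Assistant log line earlier in this thread) and a comparison against the IP from the capture. A minimal Python sketch; `resolves_to` is a hypothetical helper name, and since cloud-hosted services often rotate addresses, a mismatch does not rule the hostname out.

```python
import socket


def resolves_to(hostname: str, ip: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if `ip` is among the addresses `hostname` currently resolves to."""
    try:
        # Each result is (family, type, proto, canonname, sockaddr).
        results = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False  # name did not resolve at all
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return ip in {sockaddr[0] for *_, sockaddr in results}


if __name__ == "__main__":
    # Hostname taken from the log line in the first post; needs network access.
    print(resolves_to("remote-sni-api.nabucasa.com", "3.94.237.33"))
```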

     

    Here are three different ways to work around the issue:

    1. Create a firewall rule with an FQDN host (if you know the domain) or an IP host in the destination network field, and ensure web policy is None, AV scanning is unchecked, web proxy is disabled, and ATP is disabled globally. Only then will the problematic parser be bypassed.

    2. Create a firewall rule with an FQDN host (if you know the domain) or an IP host in the destination network field, choose the "Allow All" web policy, and check "Use web proxy instead of DPI engine".

    3. Enable relay_invalid_http_traffic via cish by following the instructions described here: https://community.sophos.com/kb/en-us/123730. This is the least recommended approach.

     

    Please let me know once you have had a chance to try out the workarounds.

     

    Thanks,

    Paul

  • In reply to Paul C2:

    Paul,

    The only two things I know were causing issues are ATP and web policies. I’m not sure what domain that IP address resolves to, but the service Home Assistant uses for remote access is Nabu Casa (nabucasa.com).

    My current workaround is a separate firewall rule for my Home Assistant device with “Use web proxy instead of DPI engine” enabled and all my typical policies and AV scanning in place.

    Is it possible to classify this as recognized/valid traffic such that the web DPI engine doesn’t block it?

  • In reply to shred:

    Hi  

    You have one more feedback thread open about ATP. Is it on the same appliance or a different one? If it is the same, I can share with Paul the support access you shared with me via PM.

    https://community.sophos.com/products/xg-firewall/sfos-eap/sfos-v18-early-access-program/f/feedback-and-issues/117690/advanced-threat-protection-atp-breaks-xbox-live-achievements-syncing

     

    Thanks,

    Rana Sharma

  • In reply to Rana Sharma:

    It’s the same network and Sophos XG device.

  • Hey all,

     

    My connection between hassio and nabucasa.com started working after installing the latest firmware that was just released. I restarted hassio after the XG firmware update, and the errors were gone and remote control worked like it should. I hope it works for you too :)

    https://community.sophos.com/products/xg-firewall/sfos-eap/sfos-v18-early-access-program/b/blog/posts/sophos-xg-firewall-v18-eap-3-refresh_2d00_1-firmware-has-been-released

  • In reply to Othou:

    After updating to Refresh1, I tried disabling my firewall rule for Home Assistant that uses the web proxy instead of the DPI engine, and I started receiving the errors again and remote control stopped working. I’ve found that if I have Sophos XG configured in a way that works with Home Assistant (i.e. using the web proxy instead of the DPI engine) and the connection between Home Assistant and Nabu Casa is established, it will continue to work even after changing back to a configuration that shouldn’t work. I’m assuming that’s because the issue is with Home Assistant establishing the initial connection, which performs a challenge/response exchange (using SniTun) that doesn’t pass when running through the DPI engine. So when you restarted Sophos XG, I bet the connection was somehow established before the DPI engine was fully initialized.

    Just to confirm, is Home Assistant running on a firewall rule that is using the DPI engine? If so, what happens when you go into the Home Assistant UI and disable remote UI then re-enable it? 

  • In reply to shred:

    shred wrote: "Just to confirm, is Home Assistant running on a firewall rule that is using the DPI engine?"

    Oh yeah, silly me, I forgot to mention the FW rule I tried last week and left on.

    No, I'm using the web proxy instead of the DPI engine. This is the firewall rule I have:

     

    Source zones: LAN           Source networks and devices: Hassio (local IP of Home Assistant host)

    Destination zone: WAN     Destination networks: NabuCasa (this is a FQDN host for *.nabu.casa )

    Web policy: Allow All        Check to Use web proxy instead of DPI engine.

     

    Remember to put it above your default LAN-to-WAN rule.

    shred wrote: "If so, what happens when you go into the Home Assistant UI and disable remote UI then re-enable it?"

    Well, I'm not using the DPI engine, but now it goes instantly from "Your instance will be available at" to "Your instance is available at".

     

    I'm not really sure I understood your post correctly or whether this even helps you, but I hope it helps someone struggling with this anyway :)