I'm having a peculiar issue where Sophos XG fails to detect that the primary RADIUS server is offline and never fails over to the secondary server for wireless authentication.
With either server individually set as the primary RADIUS authentication server, I can connect to the Wi-Fi network with a client machine no problem at all.
To simulate a failure of the on-premises RAD-1 server, I went into ESXi, suspended the machine, and then attempted to connect to the Wi-Fi using the same client machine. On the client machine I keep receiving "Can't connect to this network".
Multiple attempts yield the same result even after 10 to 15 minutes.
If I manually set the RAD-2 as the primary while the RAD-1 system is still suspended, the client machine is able to connect as expected.
I've been playing around with this for a while now and am not sure what's going on here. Packet captures show Sophos sending out ARP-NDP requests to the RAD-1 IP address but never receiving a response, which is expected.
This issue has been observed on both this XG135 and an XG310 I deployed at another site with a similar configuration for Wi-Fi.
Any ideas on what might be going on here?
I guess we are talking about this:
Just to be sure: is the XG sending the traffic but not getting a response? As far as I know, the service is monitored via TCP (port). So if the port is still reachable, it considers the service alive.
If you block the RADIUS protocol on the Windows server to prevent the communication, does it fail over?
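For anyone who wants to reproduce the check outside the firewall: RADIUS authentication normally runs over UDP port 1812, so a liveness test has to send a request and wait for a reply rather than test whether a TCP port is open. Below is a minimal Python sketch of that idea, not a description of how XG monitors the server. The function names (`build_status_server`, `is_alive`, `pick_server`) are made up for illustration, and it assumes the server answers Status-Server requests (RFC 5997), which not every RADIUS server does, and that the probing host is registered as a RADIUS client with the shared secret.

```python
import hashlib
import hmac
import os
import socket
import struct

STATUS_SERVER = 12          # RADIUS packet code for Status-Server (RFC 5997)
MESSAGE_AUTHENTICATOR = 80  # attribute type that Status-Server must carry


def build_status_server(secret: bytes, identifier: int = 0) -> bytes:
    """Build a 38-byte Status-Server packet signed with the shared secret."""
    request_auth = os.urandom(16)
    length = 20 + 18  # 20-byte header plus one 18-byte attribute
    header = struct.pack("!BBH", STATUS_SERVER, identifier, length) + request_auth
    attr_head = struct.pack("!BB", MESSAGE_AUTHENTICATOR, 18)
    # HMAC-MD5 over the whole packet with the attribute value zeroed out.
    digest = hmac.new(secret, header + attr_head + bytes(16), hashlib.md5).digest()
    return header + attr_head + digest


def is_alive(host, port, payload, timeout=2.0, retries=2) -> bool:
    """Return True if any UDP reply comes back from host:port in time."""
    for _ in range(retries):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.connect((host, port))  # only accept replies from this peer
                sock.send(payload)
                sock.recv(4096)
                return True
            except OSError:  # timeout, ICMP unreachable, and similar errors
                continue
    return False


def pick_server(servers, payload, timeout=2.0, retries=2):
    """Return the first (host, port) that answers, in priority order, or None."""
    for host, port in servers:
        if is_alive(host, port, payload, timeout, retries):
            return (host, port)
    return None
```

Calling `pick_server([("192.0.2.10", 1812), ("192.0.2.11", 1812)], build_status_server(b"shared-secret"))` with the real RAD-1 and RAD-2 addresses in place of these placeholder ones should return the secondary while the primary VM is suspended, which is the failover behavior being tested here.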
I did some more testing today. When I suspend the on-premises VM, I can see that Sophos continues to send UDP packets to it. I have three pages' worth of packets to that host using the filter "port 1812". I do not see any requests at all to the secondary server using this filter.
Following what LuCar Toni mentioned, I tried leaving the VM online and blocking the RADIUS port (1812) using Windows Firewall, but the same behavior persists.
Welp, that totally explains the issue. Any reason why this functionality works on the legacy gear and was essentially put on the back burner for the newer gear?
It's more likely a firmware limitation, as the APX has a newer firmware which does not include this feature.
Central Wireless supports one RADIUS server. XG, SG and Wireless share the "same" firmware.
Tejas Kashyap, can you provide an update on RADIUS support for failback?
Hello. Yup, secondary RADIUS server support and fallback are not supported on APXs. This is currently on the backlog but not in the short-term plans yet.
Thank you for following up! I understand it's not a super high priority with everything else you guys are working on for the platform but it would be awesome to see this functionality restored in the coming releases.
Hi David. Sure thing. We will evaluate this once the ongoing release is completed. Thanks for your patience.