One of my clients had an issue this morning.
They have a UTM HA setup (active/passive) in an ESXi environment (standalone ESXi), with each node running on a different ESXi server. Every ESXi server has two vswitches: one connected to a pair of 1Gbps NICs (carrying the OOB management and Internet VLANs, among others), and one connected to a pair of 10Gbps NICs (backend VLANs and storage connections).
The heartbeat connection is a VLAN on the 10G vswitch, the backup connection a VLAN on the 1G vswitch.
They experienced an issue with their network stack, causing the 1G vswitch in the server running the active node to crash and stop forwarding traffic to the vNICs. Traffic between VMs still worked fine.
No failover happened, I assume because the link state of the vswitch ports didn't change and there was still heartbeat connectivity over the 10G vswitch. This effectively meant a loss of all internet connectivity and of the ability to manage the virtual servers (or log in to the UTM).
Are there any ideas on how to solve this issue?
I've thought about a physical interface pass-through to handle link state detection, but since in this case the switch issue was related to forwarding, that link would also have stayed up. I assume the UTM doesn't have something like beacon probing or gateway pings to check interface availability?
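To make clear what I mean by gateway pings, here is a rough Python sketch of the kind of check I have in mind. It is not a UTM feature as far as I know: `ping_once`, `should_fail_over`, `watch` and the `on_failure` hook are all names made up for illustration, the `ping` flags assume a Linux host, and how a failover would actually be triggered on the UTM is left open.

```python
import subprocess
import time
from typing import Callable, Iterable


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo with the Linux `ping` binary; True if it was answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def should_fail_over(probe_results: Iterable[bool], threshold: int = 3) -> bool:
    """True once `threshold` probes in a row have failed."""
    consecutive = 0
    for ok in probe_results:
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= threshold:
            return True
    return False


def watch(
    gateway: str,
    on_failure: Callable[[str], None],
    probe: Callable[[str], bool] = ping_once,
    threshold: int = 3,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Probe the gateway until `threshold` consecutive probes fail, then call on_failure.

    on_failure is a hypothetical hook: whatever would demote the active node.
    """
    failures = 0
    while True:
        failures = 0 if probe(gateway) else failures + 1
        if failures >= threshold:
            on_failure(gateway)
            return
        sleep(interval_s)
```

The point is that the check tests forwarding rather than link state: with a threshold of three, one lost ping does nothing, but a vswitch that silently stops forwarding towards the gateway would be noticed within a few probe intervals even though every port still reports link up.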