I just learned a lesson and am sharing my experience, which will hopefully save others time.
We had a severe electrical storm in the middle of the night.
When I got to work, the 2 fiber LAN connections inbound from the ISP were down, but the 2 ISP Internet connections were up. We tried a backup restore and worked with the carriers. We triaged and got things partly working via a VPN tunnel.
After 12 long hours, at my wits' end, I decided to pull the plug on our secondary XGS. Shazam: connectivity between the LAN connections (OSPF) started to work.
Here's how the switch is related to this: it was using VLANs to split the connections coming from the ISP equipment.
Before storm:
Ports 1-4 were on VLAN 100, ISP Internet connection
Ports 5-7 were on VLAN 101, office 1 connection
Ports 8-10 were on VLAN 103, office 2 connection
After storm:
Ports 1-4 were on VLAN 100, ISP Internet connection - UP
Ports 5-8 were on VLAN 101, office 1 connection - misconfigured ports
Ports 9-11 were on the default VLAN (1); VLAN 103 not present
After the storm, VLAN 103 had disappeared from the switch and the port assignments were different. Big trouble. The switch config resembled settings from months ago.
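For anyone who wants to catch this kind of drift quickly, here is a minimal Python sketch that compares an expected port-to-VLAN map against what the switch currently reports. The port ranges are copied from the lists above; the helper names (expand, vlan_drift, missing_vlans) are made up for illustration, and how you pull the observed assignments off your own switch is left out because it depends on the vendor.

```python
def expand(ranges):
    """Turn {(first_port, last_port): vlan} into {port: vlan}."""
    return {
        port: vlan
        for (first, last), vlan in ranges.items()
        for port in range(first, last + 1)
    }


# Expected layout, from the "before storm" list in this post.
EXPECTED = expand({(1, 4): 100, (5, 7): 101, (8, 10): 103})

# Observed layout, from the "after storm" list (VLAN 1 is the default VLAN).
OBSERVED = expand({(1, 4): 100, (5, 8): 101, (9, 11): 1})


def vlan_drift(expected, observed):
    """Return {port: (expected_vlan, observed_vlan)} for every mismatch.

    A port missing from one side shows up with None on that side.
    """
    drift = {}
    for port in sorted(set(expected) | set(observed)):
        want, have = expected.get(port), observed.get(port)
        if want != have:
            drift[port] = (want, have)
    return drift


def missing_vlans(expected, observed):
    """Return VLAN IDs that are expected but assigned to no observed port."""
    return sorted(set(expected.values()) - set(observed.values()))


if __name__ == "__main__":
    for port, (want, have) in vlan_drift(EXPECTED, OBSERVED).items():
        print(f"port {port}: expected VLAN {want}, found VLAN {have}")
    print("VLANs missing from switch:", missing_vlans(EXPECTED, OBSERVED))
```

With the numbers from this incident, the check flags ports 8 through 11 and reports VLAN 103 as missing, which is exactly what took hours to spot by hand.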
I should have checked the switch hours earlier, but didn't because the critical ISP connections were up (a misleading sign). I believe the port misconfiguration left the trunk lines from the ISP in the wrong VLANs, and that the resulting outage was caused by a network loop created by the VLAN loss.
Question: If the switch took an electrical hit, the UPS drained, and power was reset, could the running config have been nuked and the switch reverted to a backup?
Lesson Learned: If critical circuits run through a switch, double-check its VLAN assignments, as the problem may not be with the XGS. Pulling the plug on the secondary XGS removed a presumed loop or conflict caused by the VLANs. Also, with the VLANs incorrect, the ISP equipment was seeing conflicting MAC addresses, which most likely corrupted the ARP table.
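One way to make that double-check routine is to keep a known-good export of the switch config and diff it against a fresh export after any power event. Below is a rough Python sketch using the standard difflib module. The config lines are hypothetical placeholders, not the syntax of any particular switch, and config_drift is just an illustrative helper name.

```python
import difflib


def config_drift(known_good: str, current: str) -> list[str]:
    """Return unified-diff lines between two config exports.

    An empty list means the two exports are identical.
    """
    return list(
        difflib.unified_diff(
            known_good.splitlines(),
            current.splitlines(),
            fromfile="known_good",
            tofile="current",
            lineterm="",
        )
    )


# Hypothetical exports; real ones would be read from saved files.
KNOWN_GOOD = """vlan 100
vlan 101
vlan 103
"""

CURRENT = """vlan 100
vlan 101
"""

if __name__ == "__main__":
    drift = config_drift(KNOWN_GOOD, CURRENT)
    if drift:
        print("\n".join(drift))
    else:
        print("Config matches the known-good export.")
```

Any line starting with a minus sign is something the known-good export had and the current config has lost, such as a missing VLAN definition.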
[edited by: Raphael Alganes at 2:11 PM (GMT -7) on 4 Jun 2024]