So I have a new customer with an XG 230 at HQ, where their POS server is located. They have 25 branch stores that access that POS over an IPSec VPN tunnel. The HQ XG 230 has 2 WANs, and the remote sites have an XG105, with some bigger stores having 2 WANs. They have been having issues with tunnels going down often for a long time. I have been able to make them more stable by creating new IPSec policies and making sure the actual tunnel settings are set up properly. I have been working with Sophos support to double-check and to have a second set of eyes.
HQ is set to respond only.
Branch stores initiate.
Most branch stores have two VPN connections. The first one goes to the HQ primary WAN, the second one goes to the HQ secondary WAN. Then there is a failover group set up.
After going store by store and making sure that everything is set up properly, most locations are up 24/7 with no issues, or they fail over automatically when they need to. I have 5 repeat-offender stores that go down randomly. This morning the same 5 stores were down. I went to check some settings after the stores closed, and the same stores that were down this morning were down again around 8pm tonight. I spent about 3 hours with support collecting logs and checking all the settings, but no luck.
Getting the tunnels back up is easy. All I have to do is log in to each remote site, disable the failover group and then re-enable it. It reconnects with no issues. The strange thing is that the failover group status is green for enabled and the VPN profiles show as active next to the failover group name, but the active and connected status buttons are red. Like the failover group got stuck or something?
All sites' settings are the same and all sites are on the same latest firmware.
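To make the "stuck" state described above concrete, here is a minimal sketch of how it could be flagged across stores. It is an illustration only: the `FailoverGroupStatus` record, its field names, and the store names are hypothetical, and the values would have to be filled in by hand or by whatever export is available. Nothing here calls a real Sophos API.

```python
from dataclasses import dataclass


@dataclass
class FailoverGroupStatus:
    """Hypothetical snapshot of what the VPN page shows for one site."""
    site: str
    group_enabled: bool     # failover group status button is green
    member_active: bool     # "active" button of the group's connection
    member_connected: bool  # "connected" button of the group's connection


def is_stuck(status: FailoverGroupStatus) -> bool:
    # The symptom from the post: the group itself reports enabled,
    # yet both the active and connected indicators are red.
    return (
        status.group_enabled
        and not status.member_active
        and not status.member_connected
    )


def stuck_sites(statuses: list[FailoverGroupStatus]) -> list[str]:
    """Return the sites whose failover group looks stuck."""
    return [s.site for s in statuses if is_stuck(s)]


if __name__ == "__main__":
    # Made-up example data, not taken from any real site.
    snapshot = [
        FailoverGroupStatus("store-01", True, True, True),
        FailoverGroupStatus("store-02", True, False, False),
    ]
    print(stuck_sites(snapshot))
```

A group that was disabled on purpose is not reported, since only the enabled-but-red combination matches the symptom.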
Hi Carlos Carrasquillo,
Based on the issue description, I would suggest verifying or trying the steps below:
1) Please re-validate that HQ and the BO are not both set as "Initiator", as that may lead to a race condition for the IPSec SA (it may generate duplicate SAs) and create this kind of problem. The recommendation is that one end be the responder (generally the failover group end is "Initiator" and the other end is "Respond only").
2) I would suggest testing any one group with an IKEv2 policy at both ends.
If the issue persists, it needs to be validated further based on the "strongswan" debug logs. For that, you may continue with your ongoing support case and ask the engineer to check the logs against NC-51185 to confirm whether it matches, or take the case further based on any suspect log entries.
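The two checks above can be sketched as a small script that is run over a hand-written list of tunnel settings. This is only an illustration under assumptions: the `TunnelEnd` record and the `"initiate"` / `"respond_only"` labels are hypothetical names chosen for the example and are not read from the firewall.

```python
from dataclasses import dataclass


@dataclass
class TunnelEnd:
    """Hypothetical record of one end of an IPSec connection."""
    gateway_type: str  # "initiate" or "respond_only" (labels assumed here)
    ike_version: int   # 1 or 2


def check_tunnel(hq: TunnelEnd, branch: TunnelEnd) -> list[str]:
    """Return a list of configuration problems for one HQ/branch pair."""
    problems = []
    # Step 1: both ends initiating can race and create duplicate SAs.
    if hq.gateway_type == "initiate" and branch.gateway_type == "initiate":
        problems.append("both ends initiate: risk of duplicate SAs")
    # If neither end initiates, nothing will ever start the tunnel.
    if hq.gateway_type == "respond_only" and branch.gateway_type == "respond_only":
        problems.append("both ends respond only: nothing starts the tunnel")
    # Step 2: the IKE version of the policy must be the same at both ends.
    if hq.ike_version != branch.ike_version:
        problems.append("IKE version mismatch")
    return problems


if __name__ == "__main__":
    hq = TunnelEnd("respond_only", 2)
    branch = TunnelEnd("initiate", 2)
    print(check_tunnel(hq, branch))
```

An empty list means the pair matches the recommended layout: the failover group end initiates, the other end only responds, and both use the same IKE version.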
Regards,
Vishal Ranpariya
Technical Account Manager | Sophos Technical Support
HQ is set to Respond only and the remote offices are set to Initiate.
The new IPSec policy I created is IKEv2.
Support has gotten some of the log files after the VPN was initiated. Last night I was able to call support while the VPNs at the 5 locations were down, and I left them down because it was after hours and the stores were closed.