Site 2 Site VPN Most Instant Failover which AVOIDS Flapping?

So I've read a few posts describing different scenarios, and my scenario is quite similar to 1 (of many) answered by BAlfson :)

 

2 Sites, both currently use Uplink Balancing for 2x WAN connections at each site.

 

I've read about availability groups and making sure the order matches between sites, but wanted to make sure this was still the best way to set this up?

1) Is there any way to have redundant VPNs running for the quickest failover, intelligently managed so the Sophos doesn't muck up the routes and return routes for traffic? (If I turned on 4 VPNs, e.g. Site1WAN1 --> Site2WAN1, Site1WAN2 --> Site2WAN1, Site1WAN1 --> Site2WAN2, Site1WAN2 --> Site2WAN2, and then set up 4 multipath rules on each site in the same order, would that work, or even be beneficial over using availability groups?)

2) We currently use a VPN where EACH side can initiate (quicker turn-ups). Everything I've read says that for failover to work, only 1 side can initiate. Is this still accurate?

3) When I do implement redundant VPNs, what is to prevent the Sophos from panicking and wrongly directing traffic to/from the sites, *IF* 1 or both sites have a connection instability issue?

3a) Going the opposite of "instant" per my post, is there any sort of delay that the VPN turn up could be told to follow? (like the 5 min interval persistence on the uplink balancing itself?)

Thanks!



This thread was automatically locked due to age.
  • Hey Jared,

    I suspect that you're talking about the "classic" failover approach described in Auto-Failover IPsec VPN Connections.

    Although Sophos UTM multiple S2S IPsec VPN mit Failover – Tutorial (DE) is written in German, it is well documented with pictures of the configurations with WebAdmin in English.  This approach supplies virtually instant failover.

    I don't know what you consider "flapping," but the standard timeout is 15 minutes in Uplink Balancing and Uplink Monitoring.  You will want to read #3 in Rulz (last updated 2019-04-17) for the second exception to it.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Looking over the article, and wanting to understand it fully (thanks, Google Translate):

    1) Since no routes are defined, am I forced to use "automatic firewall rules," and therefore unable to use "strict routing"?

    2) Looking at how he configured each WAN at each site to connect, it seems that if SiteAWAN1 dies and SiteBWAN4 also dies, there is no tunnel from SiteAWAN2 to SiteBWAN3, so maybe there is room for improvement?

    3) The article talks about "to use the multipath rules" the WAN links must be part of uplink balancing...

                a) Is the weight of 100 / 0 really necessary? I ask because I would like the primary VPN connection at one of the sites to be 10MBFIBER, but I want Internet traffic to rely heavily on 50MBcomcast, which seems like it can't be done if the primary VPN link needs that much weight in the balancing.

                b) The article apparently only shows setting up these weights at Site1, but I assume they'd be needed at Site2 also?

    4) As for flapping, let me reword it with a scenario. At one of the sites where I want the failover VPN set up, my uplink balancing timeout is set to 1 minute, yet I sometimes get email notifications that one of the WAN links has gone down and then, a minute or less later, that it has come back up. Would this cause any VPNs to fail to negotiate / reconnect properly?
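The coverage question in point 2 can be checked with a rough sketch like the one below. The labels (`A-WAN1`, `B-WAN3`, and so on) and the `PARTIAL` layout are assumptions made for illustration: `PARTIAL` is simply "every pairing except SiteAWAN2 to SiteBWAN3," which is the only gap named above, and it may not match the article's actual configuration. The function lists which "one WAN down at each site" combinations leave no tunnel with both endpoints alive.

```python
from itertools import product

# Hypothetical labels: Site A has WAN1/WAN2, Site B has WAN3/WAN4.
SITE_A = ("A-WAN1", "A-WAN2")
SITE_B = ("B-WAN3", "B-WAN4")

# Every possible pairing of one WAN at Site A with one WAN at Site B.
FULL_MESH = set(product(SITE_A, SITE_B))

# Assumed layout for illustration: all pairings except A-WAN2 <-> B-WAN3.
PARTIAL = FULL_MESH - {("A-WAN2", "B-WAN3")}


def uncovered_failures(tunnels):
    """Return the (dead A link, dead B link) pairs that leave no usable tunnel."""
    gaps = []
    for dead_a, dead_b in product(SITE_A, SITE_B):
        # A tunnel survives only if neither of its endpoints is a dead link.
        alive = [t for t in tunnels if dead_a not in t and dead_b not in t]
        if not alive:
            gaps.append((dead_a, dead_b))
    return gaps


print(uncovered_failures(PARTIAL))
print(uncovered_failures(FULL_MESH))
```

Under these assumptions, only a full mesh of four tunnels survives every combination of one failed WAN per site. Whether the UTM routes cleanly over such a mesh is a separate question that this sketch does not answer.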

  • 1) Automatic firewall rules are just {local networks} <--> Any <--> {remote networks}.  You can make more restrictive rules manually if you want.  I usually don't select 'Strict routing' since I want to reserve the opportunity to SNAT select traffic into the tunnel.

    2) Interesting - I hadn't thought of that.  I can't think of a way to make that work off the top of my head.

    3) I haven't used this approach personally, so I haven't experimented with those values.  If you do, please share your results here.

    4) One minute is awfully short, but probably long enough for all but the slowest CPUs to establish a tunnel.  I'd think the lowest value I'd set would be 5 minutes.
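To illustrate why a longer timeout damps flapping, here is a generic hold-down sketch. It is not a UTM feature or setting; `link_is_stable`, the event list, and the 300-second value are hypothetical names and numbers chosen for the example. The idea is that a link only counts as usable again once it has stayed up for the whole hold-down period.

```python
def link_is_stable(events, now, hold_down_s):
    """events: chronological (timestamp_seconds, is_up) state changes for one link.

    The link is treated as stable only if its latest state is "up" and that
    state has lasted at least hold_down_s seconds.
    """
    if not events:
        return False
    last_change, is_up = events[-1]
    return is_up and (now - last_change) >= hold_down_s


# Hypothetical history: up at t=0, down at t=600, back up at t=650.
history = [(0, True), (600, False), (650, True)]

print(link_is_stable(history, now=700, hold_down_s=300))  # only 50 s of uptime
print(link_is_stable(history, now=950, hold_down_s=300))  # 300 s of uptime
```

With a 60-second hold-down, the same link would count as stable again at t=710, so a link that bounces every minute or so could keep triggering tunnel re-negotiation.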

    Cheers - Bob

     
  • Hmm, regarding the #2 point I made: in your original "classic" failover VPN setup with availability groups, does that handle all possible combinations of failure? If Site1WAN1 goes down and Site2WAN3 goes down, would the VPNs know to connect via Site1WAN2 and Site2WAN4?

     

    Also, what is the timeout period before the sites try to switch to the other VPN when using "classic failover"?

  • Like I said, Jared, I haven't experimented with the instant failover, but my recollection is that blending the two approaches cannot work because the instantaneous approach requires binding each IPsec Connection to a specific interface, and I suspect that that won't work with an Interface Group.  If you try it and it does, please report back to us.

    The checks occur every 15 seconds.  As soon as the UTM sees the other side is unavailable, it restarts the VPN via the other connection.  The total time depends on how long it takes the two endpoints to establish.
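As a back-of-the-envelope sketch of that timing: the 15-second interval comes from the paragraph above, while the 5-second tunnel establishment time and the assumption that a single failed check triggers the restart are hypothetical. The outage lies between "failure happens just before a check" and "failure happens just after one."

```python
CHECK_INTERVAL_S = 15  # check interval stated above


def failover_window(establish_s, check_interval_s=CHECK_INTERVAL_S):
    """Return (best_case, worst_case) outage in seconds.

    Assumes one failed check is enough to trigger the restart, so detection
    takes between 0 and check_interval_s seconds, plus the time the two
    endpoints need to establish the new tunnel.
    """
    best = establish_s
    worst = check_interval_s + establish_s
    return best, worst


# Hypothetical establishment time of 5 seconds.
print(failover_window(5))
```

If the UTM needs several consecutive failed checks before it acts, the worst case grows by one interval per additional check.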

    Cheers - Bob

     