Uplink balancing vs Active Standby, dual ISP questions

I've got a large number of UTM devices at sites with dual ISPs and we're trying to resolve a 'best practices' question.

We typically have both ISPs active, with multipath rules and weights set up to put our 'priority' traffic (VoIP and RED tunnels) on the better ISP and everything else on the secondary.  This works great until the primary fails, at which point the tunnels fail over to the secondary.  That's not a problem in itself, but when the primary comes back up, the tunnels never fail back to the primary interface on their own. They can sit on the secondary (weaker) connection for hours, days, or weeks until we manually deactivate and reactivate them.

We're considering going to an Active / Standby setup with dual ISPs to address this issue. In that configuration, however, our PRTG service can't properly monitor the backup connection (since it's essentially off).

For those of you on dual ISP setups:

1) How do you make sure RED tunnels (or whatever tunnels) fail back to a primary interface when an outage is resolved?

2) If you're running Active / Standby instead of multipathing, how do you monitor your standby ISP?
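
For question 2, one generic way to check a second uplink from a monitoring host is to open a test connection bound to that uplink's local address. The following is only a minimal sketch under stated assumptions: `probe_via_source` is a hypothetical helper, not a PRTG or UTM feature, and it assumes the host running it has a local address on the standby path and that routing actually sends traffic sourced from that address out the standby link. It says nothing about how the UTM itself treats a standby interface.

```python
import socket


def probe_via_source(source_ip, target_host, target_port, timeout=3.0):
    """Return True if a TCP connection to the target succeeds when the
    socket is bound to source_ip (hypothetical helper for illustration).

    Binding the source address only selects the local address; whether the
    packets really leave via the standby ISP depends on the routing setup.
    """
    try:
        with socket.create_connection(
            (target_host, target_port),
            timeout=timeout,
            source_address=(source_ip, 0),  # port 0 lets the OS pick a port
        ):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

A monitoring job could call this on a schedule with the standby-side address and a known reachable target, and alert when it returns False several times in a row.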

 

Thanks for the guidance.



This thread was automatically locked due to age.
  • I think you want to stay with Active-Active and your Multipath rules.  See the second exception in #3 in Rulz (last updated 2019-04-17).  Any better luck now?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • This doesn't do exactly what we want - the problem is the persistence of the tunnels on the 'lesser' interface after the primary comes back up.  So say we have two connections, Fiber and Cable.

     

    We use multipathing to set the tunnels to go out Fiber (by specifying all traffic to the RED destination use the Fiber interface).  This works fine.

    Fiber goes down, the tunnel fails over to Cable.  This works fine.

    Fiber comes back up, but due to the persistence of the connection, the tunnel stays on Cable for days, weeks, or months, until either Cable goes down, we restart the RED interface, or we restart the entire device.

    There has to be a way to force the tunnels to re-initialize once a day or something, right? 

  • Click on the wrench beside 'Active Interfaces' and show us a picture of those settings.  Also, show picture(s) of the Edits of the relevant Multipath rule(s).

    Cheers - Bob

     
  • This shows the uplink balancing (we use our primary for ONLY multipath-specific traffic, everything else goes out eth2)

     

     

    This shows the multipath rule that forces tunnels onto the primary (the group shown, Colos, includes the IP of our RED tunnel endpoint).

     

    Again, this piece is working, it's the failing-back-over that doesn't.

  • Add a fourth Multipath rule at the bottom binding 'Any -> Any -> Any' to 'eth2 - monkeybrains'.

    For testing purposes, in 'Edit scheduler', set 'Persistence timeout' to 1 minute.  After testing, set it back to 15 minutes.

    Any better luck with that?

    Cheers - Bob

     
  • I'll make the change and test it, but can you explain how this is supposed to effect the change we want? If the issue is connection persistence, and the tunnel doesn't reinitialize unless it's brought down and back up or otherwise interrupted, how does adding this rule at the bottom change the current setup?

     

    Thanks for the info.

  • This multipath rule had no effect, as far as I can tell.  I tested by failing the primary ISP, and the tunnels came back up on the secondary as expected.  I then reactivated the primary ISP, and other routing returned to normal.  However, since the tunnel has not been reinitialized, it's still on the secondary ISP, 2+ hours and counting (I tried setting the persistence timeout to both 1 minute and 15 minutes).

     

    What's my next option?

  • Maybe a bug.  What does Sophos Support have to say about this?

    Cheers - Bob

     
  • I've opened multiple tickets on the issue and Support has never provided a viable response.  I've been pointed to the FAQ about 'Actions' in uplink monitoring, but to my knowledge there's no way to define an action that restarts a tunnel when the ISP comes back up.

     

    This seems more like a blindingly obvious design flaw than a bug - the other two firewall brands I work with regularly (Juniper and SonicWall) both handle this with no special config - the tunnel just reinitializes after a given period and is then on the right ISP.
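
The failback behaviour being asked for can be sketched as a small decision rule: once the primary has been back up for a hold-down period and the tunnel is still riding the secondary, flag the tunnel for a restart. This is only an illustration of that logic under stated assumptions. `FailbackWatcher`, `hold_down`, and `observe` are hypothetical names, the uplink and tunnel states are assumed to come from whatever monitoring is already in place, and the actual restart step is left out because the thread does not identify a UTM mechanism for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailbackWatcher:
    """Decide when a tunnel stuck on the secondary uplink should be restarted."""

    hold_down: float = 300.0  # seconds the primary must stay up first (assumed value)
    primary_up_since: Optional[float] = None

    def observe(self, now: float, primary_up: bool, tunnel_on_primary: bool) -> bool:
        """Record one status sample; return True if a restart is due."""
        if not primary_up:
            # Primary is down: forget any earlier recovery time.
            self.primary_up_since = None
            return False
        if self.primary_up_since is None:
            # First sample since the primary recovered.
            self.primary_up_since = now
        if tunnel_on_primary:
            # Tunnel is already where we want it.
            return False
        # Restart only after the primary has been stable long enough,
        # so a flapping link does not bounce the tunnel repeatedly.
        return now - self.primary_up_since >= self.hold_down
```

Fed one sample per polling interval, the watcher stays quiet during the outage and the hold-down window, then returns True on each sample until the tunnel is back on the primary.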
