SD-WAN settings not working correctly on reboot

We have two WAN links and have set up SD-WAN policy routing.

We have a rule set up so that all traffic from our server should go over one of the WAN links. This server runs software that pings a variety of external services.

When we reboot the XG Firewall (EAP3 Virtual Machine), a traceroute shows that the majority of pings are routed through the correct WAN link, but some are incorrectly routed through the other WAN link.

My rule is below

  • I've done some more work on this with a limited amount of testing.

    If I stop the software running the pings, restart the XG and then restart the pings after the XG is back up and running, the pings all route through the correct interface.

    My guess is that the SD-WAN routing takes a little time to kick in. If a ping is issued before the SD-WAN routing is in effect, it can be routed out through the wrong interface. Once a route is established, the XG keeps it in a table that is not overwritten when SD-WAN routing comes into operation, so the 'wrong' routing remains in effect.

    As I say, this is only a guess based on the observed behaviour. I really need someone from Sophos to confirm this and, if appropriate, get the problem resolved.

    Equally, I may have something wrong in my rule in which case I would be grateful if someone can point out what it is.

  • Hi Jason,

    Your understanding is correct. The PBR daemon comes up a little later in the startup process because it depends on other subsystems, and hence an interim connection may flow via default GW.

    Regards,

    Alok

  • Hi Alok

    Thanks for confirming my observations.

    The question is, what is going to be done to fix it?!

    It wouldn't matter if the correct flow started once the PBR daemon started, but unfortunately the old established flow continues. Can't you just clear any established routes once the PBR daemon starts, so that they then flow correctly?

  • Sorry but I re-read your answer after posting my reply and can't edit it. There is nothing incorrect in my last reply but I wanted to make it quite clear what the problem is.

    You said "interim connection may flow via default GW". That would not be a problem. The problem is that any "interim" flows established before PBR starts remain in effect. So if the interim connection is wrong, it stays wrong! What should happen is that, once PBR starts, it clears all the connections and establishes new ones based on the PBR rules.

  • Hi Jason,

    The approach to minimise the impact has not yet been decided, so I can't answer this right away.

    On your suggestion: clearing established routes (the session table) may have a system-wide impact.

    Regards,

    Alok
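
The behaviour discussed in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the XG Firewall is implemented: the class `ToyFirewall`, the link names `WAN1`/`WAN2`, the addresses, and the `flush_source` option are all hypothetical. It assumes the egress link is chosen once when a session is created and then reused, which is the behaviour Jason observed and Alok confirmed.

```python
from dataclasses import dataclass, field

DEFAULT_GW = "WAN1"          # hypothetical: link used before PBR is up
POLICY_GW = "WAN2"           # hypothetical: link the SD-WAN rule selects
SERVER_IP = "192.168.1.10"   # hypothetical: the server matched by the rule
OTHER_IP = "192.168.1.20"    # hypothetical: a host the rule does not match
DEST_A = "203.0.113.5"       # documentation-range destination
DEST_B = "198.51.100.7"      # documentation-range destination


@dataclass
class ToyFirewall:
    pbr_running: bool = False
    # Session table: (source, destination) -> egress link chosen at creation.
    sessions: dict = field(default_factory=dict)

    def route_lookup(self, src):
        # The policy route only applies once the PBR daemon is running.
        if self.pbr_running and src == SERVER_IP:
            return POLICY_GW
        return DEFAULT_GW

    def send(self, src, dst):
        # An existing session keeps its original link; only new sessions
        # trigger a fresh route lookup.
        key = (src, dst)
        if key not in self.sessions:
            self.sessions[key] = self.route_lookup(src)
        return self.sessions[key]

    def start_pbr(self, flush_source=None):
        self.pbr_running = True
        if flush_source is not None:
            # Targeted flush: drop only sessions from the policy's source.
            self.sessions = {
                k: v for k, v in self.sessions.items() if k[0] != flush_source
            }


def demo_stuck_session():
    fw = ToyFirewall()
    early = fw.send(SERVER_IP, DEST_A)   # created before PBR starts
    fw.start_pbr()                       # no flush
    stuck = fw.send(SERVER_IP, DEST_A)   # same session, same link
    fresh = fw.send(SERVER_IP, DEST_B)   # new session, policy applies
    return early, stuck, fresh


def demo_targeted_flush():
    fw = ToyFirewall()
    fw.send(SERVER_IP, DEST_A)
    fw.send(OTHER_IP, DEST_A)
    fw.start_pbr(flush_source=SERVER_IP)
    other_kept = (OTHER_IP, DEST_A) in fw.sessions
    return fw.send(SERVER_IP, DEST_A), other_kept
```

In this model, `demo_stuck_session()` reproduces the symptom: the session opened before PBR started stays on the default link while new sessions follow the policy. `demo_targeted_flush()` shows one way the system-wide impact Alok mentions might be limited, by dropping only the sessions whose source matches the rule. Whether the firewall can do this safely is for Sophos to say.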
