This discussion has been locked.

Policy Based routing through IPSEC VPN

Hi All- 

Please forgive me for reposting this. The previous thread I had replied to (https://community.sophos.com/utm-firewall/f/network-protection-firewall-nat-qos-ips/114954/routing-single-local-host-internet-traffic-through-remote-ipsec-tunnel-gateway) has been locked due to inactivity, and I am revisiting this to try to make it work.

To the experts from that thread: perhaps you can shed some light on my situation.

SITE A is a branch office with 192.168.68.0/23 and 10.0.1.0/24. SITE B is HQ, with 192.168.64.0/22.
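As a sanity check on the addressing above, here is a minimal sketch using Python's standard `ipaddress` module. It only confirms that the SITE A and SITE B subnets do not overlap and that 192.168.64.1 (the HQ firewall's internal IP) falls inside the SITE B range. The variable and function names are made up for illustration and are not taken from any firewall configuration.

```python
import ipaddress

# Subnets exactly as stated in the post.
SITE_A = [
    ipaddress.ip_network("192.168.68.0/23"),
    ipaddress.ip_network("10.0.1.0/24"),
]
SITE_B = [ipaddress.ip_network("192.168.64.0/22")]


def overlapping_pairs(a_nets, b_nets):
    """Return every (a, b) pair of networks whose address ranges overlap."""
    return [(a, b) for a in a_nets for b in b_nets if a.overlaps(b)]


overlaps = overlapping_pairs(SITE_A, SITE_B)

# 192.168.64.1 is the internal IP of the HQ (SITE B) firewall.
hq_firewall_in_site_b = ipaddress.ip_address("192.168.64.1") in SITE_B[0]

print("overlapping subnets:", overlaps)
print("HQ firewall inside SITE B range:", hq_firewall_in_site_b)
```

An empty overlap list means overlapping address space between the two sites can be ruled out as a source of ambiguous routing.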

All traffic from SITE A that originates from an IP in 10.0.1.0/24 should traverse the IPSEC tunnel and egress via the SITE B firewall WAN interface. 
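To make the intended behaviour concrete, here is a rough Python sketch of the forwarding decision I am after at SITE A. It is a model of the goal only, not how the firewall implements policy routes. The function name and the returned labels are hypothetical, and it assumes that site-to-site traffic from both SITE A subnets uses the tunnel.

```python
import ipaddress

# Networks as stated in the post.
SITE_A_NETS = [
    ipaddress.ip_network("192.168.68.0/23"),
    ipaddress.ip_network("10.0.1.0/24"),
]
SITE_B_NET = ipaddress.ip_network("192.168.64.0/22")

# Only sources in this subnet should reach the internet through SITE B.
TUNNEL_EGRESS_SOURCES = ipaddress.ip_network("10.0.1.0/24")


def intended_egress(src, dst):
    """Return a label (hypothetical) for where a packet should leave SITE A."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)

    # Traffic between local SITE A subnets never leaves the site.
    if any(dst_ip in net for net in SITE_A_NETS):
        return "local"
    # Assumption: all site-to-site traffic uses the IPsec tunnel.
    if dst_ip in SITE_B_NET:
        return "tunnel"
    # Internet-bound traffic from 10.0.1.0/24 should egress at SITE B.
    if src_ip in TUNNEL_EGRESS_SOURCES:
        return "tunnel-then-site-b-wan"
    # Everything else keeps using the SITE A WAN.
    return "site-a-wan"


print(intended_egress("10.0.1.2", "8.8.8.8"))
print(intended_egress("192.168.68.10", "8.8.8.8"))
```

The two calls cover the case that matters here: a 10.0.1.0/24 host should leave via the SITE B WAN, while a 192.168.68.0/23 host should keep using the SITE A WAN.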

Here is the tunnel status; as you can see, the tunnel has established correctly:

Using either a masquerading or SNAT rule, I can get traffic to exit via the WAN interface at SITE A, but remember, we want this traffic to go to SITE B first via the tunnel. Per the thread linked above, I created a policy-based route (PBR) on the SITE A (branch office) firewall. NOTE: "Remote_LAN" is the internal LAN IP of the HQ (SITE B) firewall, 192.168.64.1.

This immediately breaks internet access at SITE A. From a workstation at 10.0.1.2 I can still ping 192.168.64.1, but once the above PBR is enabled, internet access breaks. No log entries for traffic arriving from the tunnel appear at the HQ site.

Both sites have been tested with an ANY>ANY firewall rule, along with "auto firewall rules" enabled on both sides of the IPSEC tunnel; neither makes any difference.

I've also tried disabling the SNAT rule (and/or the masquerading rule) at the BRANCH site in various combinations; it makes no difference. As soon as the PBR is enabled, internet breaks.



  • Leaving this here in case anyone stumbles upon this via search down the road:

    I've been doing some extensive testing on this in a lab environment. I can get this working in a virtualized lab (as another user did in the thread linked above). However, a trial run in real life yields very different results. Each time I enable either a policy route or a standard static route, the internet connection becomes unstable and flags as "Error" in the dashboard. Outbound ping tests from any device on the network show intermittent dropouts.

    I then suspected that the clients I was trying to statically route to another site for internet egress were getting stuck in some kind of OSPF routing loop, so I tested from a network that is not involved in any OSPF routing. Same result: internet breaks for the entire site. My next step will probably be to rent some cloud server space and attempt to test this in a completely fresh, isolated real environment, since my virtual lab suffers from the same problem as in the previously linked thread, with everything eventually egressing out of the same WAN. The idea is to figure out which part of a more advanced configuration causes the internet to break when a static route is enabled.
