

SOPHOS Purposefully Designs bugs into their Firewalls: Episode 1 - VPN Failover and WAN Interfaces

I’m documenting my numerous issues with SOPHOS Firewalls so that others can be aware of what they are getting themselves into.

 

Our Background:

My business is a long-time customer of SOPHOS Firewalls (more than 10 years). We have 18 Firewalls, multiple RED devices, and many Access Points. Up until this year SOPHOS was our absolute favorite Firewall for a lot of reasons. But that changed this year. SOPHOS has been developing their new SFOS operating system, and this year they told us that we could no longer purchase their SG line of firewalls, and that we must move to their new XGS line of firewalls. SOPHOS is retiring the SG line of firewalls, which were absolutely amazing. Their new XGS (SFOS) firewalls, however, are complete trash if you have any interest in being able to configure the firewall the way you desire.

 

I was promised the new Firewalls were great, and up to now, SOPHOS Support has been great, so we sunk well over $100,000 into a complete overhaul of our environment and SOPHOS has done nothing but treat us like garbage ever since. They have dismissed all the issues we’ve had. Every problem with the new Firewalls is ‘by design’ or ‘as intended’ and no recourse options are available. I’ve opened 6 or more tech support cases with SOPHOS as well, and spent nearly 100 hours on the phone with SOPHOS in the last 4 months. Most cases get closed with no resolution because SOPHOS can’t find the source of the issue.

 

These firewalls aren’t even half baked. Any firewall in this class is supposed to be designed such that the administrator of the network can set up the firewall in the manner that they need for their business. With their new Firewalls, not so. Sure, you ‘can’ set things up the way you want, but SOPHOS builds in defects ‘by design’ to make your experience poor unless you use the firewall in the way they want you to. If you raise these issues with them, you will be dismissed and told to do it their way, or deal with the problems. It’s pretty dictatorial.

 

After I built the first of our 18 new Firewalls, I hired a SOPHOS consultant when we were forced to make the switch. The new Firewall is vastly different and I wanted to ensure the setup I did was a good baseline for all the other 17 firewalls I was deploying. I’ve had nothing but issues ever since. We’re stuck with SOPHOS now… at least for the foreseeable future, so I’m going to be using this time to look for another firewall to move to once our term is over. For the price… the lack of functionality is abhorrent, and the dismissiveness is really sad. SOPHOS used to be a great company and a great Firewall product. Now, if you want a flexible, configurable, and well-functioning firewall, this is no longer the product for you.

 

Issue # 1 - VPN Failover and WAN Interfaces

IPSEC VPNs are commonplace. Lots of businesses use them and we are no different. We’ve been using IPSEC for years on the SG firewalls and they have been great. No issues at all. I could set them up however I wanted with lots of flexibility, and they were extremely reliable.

 

How it was:

On the old model firewall, you had a WAN Interface Group where you’d specify the primary internet as well as one or more backup internet connections. It would keep the backup internet connection(s) disabled until the primary failed. Then it would activate and fail over to the backup internet until the primary came back online, and then fail back. Basic stuff…

 

Also on the old model firewall, you’d create the VPN tunnel, designate the WAN Interface Group as the internet, and the VPN tunnel would establish itself over whatever internet connection was the active one in the WAN Interface Group at the time. You could also hard set the VPN tunnel to a specific WAN interface if you preferred. It was easy to use, very configurable, and it worked very reliably.

 

How it is now:

On the new firewall, there is still a WAN Interface Group, where you specify the primary internet as well as one or more backup internet connections, and it fails over and back just like the old firewall. However, now it keeps the backup internet connection(s) enabled and active, and uses quite a lot of data to check that they are online. If you have a cellular 5G backup internet (different provider\different medium methodology)… it’s going to eat up your data whether you like it or not, and if it’s pay-per-use, it will run up your bill, even if you never fail over to it. I brought this up to SOPHOS. Answer: “it’s by design”, end of story. Very nice. So it’ll eat up data and run up charges forever. SOPHOS says pay the bills, or get a backup internet that does not have usage charges. “Do it our way or pay the penalty”.

 

Moving on to the VPN… even though there is a WAN Interface Group, which should be able to be used to float the VPN Tunnel to the active Internet connection, you can’t use it that way. I told them the old Firewall did this, and asked how to do the same… however, this is “by design”, end of story. So now we need to set up two or more VPN tunnels (one per internet connection). More time and more complicated than necessary, but it’s “by design” so it’s non-negotiable. Anyhow… after you create your multiple VPN connections (all identically set up, just with a different WAN interface), you then need to create a Failover Group, which is a new configuration that determines which VPN Tunnel is primary, and which is backup. The idea is that when the primary internet goes down in the WAN Interface Group and the backup internet becomes active, the Primary VPN Tunnel connection fails because it is hard set to the primary WAN connection, and the VPN Failover Group then brings up the backup VPN tunnel, which is hard set to the backup internet connection. All this complication for the same result, “by design”.

 

In the Failover VPN section there is a checkbox called “automatic failback”. That’s all it says, no description. It’s supposed to automatically fail back your VPN connection when the primary internet comes back online; however, I learned it only tries once. So if your ISP is doing some maintenance overnight… or there are some unexpected brief interruptions on your Primary internet, it will try once (60 seconds later) to fail the VPN Tunnel back, and if that is not successful, it will leave your VPN tunnel connected over the backup internet permanently, even if the Primary internet comes back online and is working correctly. So then, half a day later, you notice that the VPN tunnel has been eating up your cellular data, or running up your pay-per-use data charges, and you need to manually fail it back. I opened a case with Mark Esiovwa from SOPHOS, and I bet you can guess what he said… “this is by design”. He “confirmed this from the GES team (highest level of support at Sophos)”.
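To make the difference concrete, here is a minimal Python sketch of the failback behaviour as I understand it from the case. It is purely an illustrative model, not SOPHOS code: the function name `active_tunnel_timeline`, the `max_failback_attempts` parameter, and the idea of one check per 60-second interval are all my own assumptions for the sake of the example.

```python
def active_tunnel_timeline(primary_up, max_failback_attempts):
    """Return which tunnel is active after each check interval.

    primary_up: list of booleans, one per check interval (assumed ~60 s),
                True when the primary internet link is usable.
    max_failback_attempts: number of failback tries after a failover;
                None means keep retrying until the primary is back.
    """
    active = "primary"
    attempts = 0
    timeline = []
    for up in primary_up:
        if active == "primary":
            if not up:
                # Primary link lost: fail over and reset the attempt counter.
                active = "backup"
                attempts = 0
        else:
            can_try = (max_failback_attempts is None
                       or attempts < max_failback_attempts)
            if can_try:
                attempts += 1
                if up:
                    active = "primary"
        timeline.append(active)
    return timeline


# Primary drops for two intervals, then recovers.
link_states = [True, False, False, True, True]

single_try = active_tunnel_timeline(link_states, max_failback_attempts=1)
keep_retrying = active_tunnel_timeline(link_states, max_failback_attempts=None)

print(single_try)     # stays on backup after the one failed attempt
print(keep_retrying)  # returns to primary once the link is back
```

With a single attempt, one failed try while the primary is still down is enough to leave the tunnel on the backup link for good, which matches what I saw. A retrying policy would have returned to the primary as soon as it recovered.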

 

So SOPHOS has purposefully created a bug, which creates issues/costs/loss for customers who choose to use an IPSEC VPN Tunnel. They won’t fix the issue they purposefully created, because they want you to use the equipment you bought their way. This is evidenced by their next statement of “While this is by design for PBVPN, we do offer an interactive RBVPN. This allows for managing route criteria based on configured polices”. Simply put, you bought this firewall, it will allow you to use an IPSEC VPN, however, if you don’t do it our way… we will manufacture consequences and issues so that you will eventually comply out of exhaustion.

 

This is ONE scenario and there are many more like this coming. Stay tuned. If you want SOPHOS to tell you how to run your network, right down to the settings you choose, then they’ll be happy to do that and it will probably work. However if you would like to administer your network the way you want\need\choose, don’t walk, run from SOPHOS. You won’t be happy after you spend hundreds of thousands of dollars and days of phone calls, to be told over and over… “we designed it that way”.



  • So: I would actually design this with route-based VPN. Policy-based VPN (or, as you would say, the "UTM" way) is just a relic of the old world. It comes with a lot of conventional problems.

    You can use Route Based VPN and only do the checkups with SD-WAN Routes. This would mean, "pings" would eat up your internet connection. 

    Failover Groups are the "old world": "build up VPN 2 if VPN 1 is down". Better would be: build a modern route-based VPN and only use the route if you need it.

    Policy-based VPN is something you are doing because UTM did it that way. You cannot use any newer technologies on it (like SD-WAN).

    And maybe we should discuss the meaning of "bug". Simply because a product with 20+ years of development (and roughly 6-7 years of startup development) has a feature which a product in modern software development does not have does not mean it is a bug. It is something which needs to be designed and maybe adjusted or implemented.

    Astaro was a startup; they could build features at a fast pace and roll them out to their (smaller) installation base. SFOS and Sophos have grown larger and larger, meaning every feature needs to be tested in different implementations and scenarios.
    Sophos is not telling you how to build your setup, but if you are going to migrate anyway, it might be a good idea to adapt to newer technologies. Why do policy-based VPN? What are the advantages? Why not use route-based VPN + SD-WAN? You could do a zero-impact failover with route-based VPN. You could do quality-based routing. You could do packet-loss-based routing. Everything a UTM never could do.
    I am from Germany and my daily business is to discuss and migrate such setups. I can tell you, you should not narrow your view to "that is my need"; maybe discuss "where do I want to be, from a modern view of technology, in 1-3 years". Zero-Trust is coming in rapid steps and customers are telling me I "am forcing them to do that", but in the end, the outcome is what matters: a "simple" solution that gets your network secure and modernized, without a downside for user experience, which is amazing with a ZTNA product. But to get there, you have to do some refactoring.

    I am sorry to hear you feel you are not being treated well by Sophos, but you can likely get in touch with your Sales Team to discuss the situation as well.

    __________________________________________________________________________________________________________________

  • And I did the math about the WAN Link Manager and keepalive.

    UTM does "turn off the interface", which means it is "down". You cannot use it anymore in this state. In SFOS, a backup interface is always reachable and you can use it at any time, which means that even in the backup state you can use it for specific cases if you want.

    SFOS does a health check every 30 secs. 

    10:35:12.944741 PortB, OUT: IP 192.168.0.4 > 8.8.8.8: ICMP echo request, id 1, seq 1, length 192
    10:35:12.947065 PortB, IN: IP 8.8.8.8 > 192.168.0.4: ICMP echo reply, id 1, seq 1, length 76

    You have 192 + 76 bytes, which means 268 bytes x 2 per minute = 536 bytes per minute. 536 x 60 = 32160 bytes per hour. 32160 x 24 = 771840 bytes per day, or roughly 772 kilobytes per day. This means you are losing about 23 MB per month (over 30 days) of your LTE plan because of the keepalive.

    I find this acceptable in exchange for the flexibility of knowing the interface is online all the time. In UTM, you never knew whether the backup interface would actually come up or not.
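If you want to check the arithmetic above or plug in your own numbers, a minimal Python sketch follows. The function name `keepalive_bytes` is just an illustrative helper, the 30-day month is an assumption, and the calculation only counts the lengths shown in the capture lines (any IP or link-layer overhead is not included).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def keepalive_bytes(request_len, reply_len, interval_s, days):
    """Bytes used by one request/reply health check every interval_s seconds."""
    probes = days * SECONDS_PER_DAY // interval_s
    return probes * (request_len + reply_len)


# Values taken from the capture above: 192-byte request, 76-byte reply, every 30 s.
per_day = keepalive_bytes(192, 76, 30, days=1)
per_month = keepalive_bytes(192, 76, 30, days=30)  # assumes a 30-day month

print(f"{per_day} bytes per day, {per_month / 1_000_000:.1f} MB per month")
```

Changing `interval_s` or the packet lengths shows quickly how much a different health-check setup would cost on a metered link.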

    __________________________________________________________________________________________________________________

  • I agree. 23 MB/month is fine. However, for the first 2 months I had a case open with SOPHOS where I spent MANY hours troubleshooting our UTM. It was using 12-15 MB per DAY. Mysteriously it just stopped one day, and SOPHOS could never find the cause. That is like most of my SFOS cases opened with SOPHOS: typically they get closed with no solution.