

SOPHOS Purposefully Designs bugs into their Firewalls: Episode 1 - VPN Failover and WAN Interfaces

I’m documenting my numerous issues with SOPHOS Firewalls so that others can be aware of what they are getting themselves into.

 

Our Background:

My business is a long-time customer of SOPHOS Firewalls (more than 10 years). We have 18 Firewalls, multiple RED devices, and many Access Points. Up until this year SOPHOS was our absolute favorite Firewall for a lot of reasons. But that changed this year. SOPHOS has been developing their new SFOS operating system, and this year they told us that we could no longer purchase their SG line of firewalls, and that we must move to their new XGS line of firewalls. SOPHOS is retiring the SG line of firewalls, which were absolutely amazing. Their new XGS (SFOS) firewalls, however, are complete trash if you have any interest in being able to configure the firewall the way you desire.

 

I was promised the new Firewalls were great, and up to now, SOPHOS Support has been great, so we sunk well over $100,000 into a complete overhaul of our environment and SOPHOS has done nothing but treat us like garbage ever since. They have dismissed all the issues we’ve had. Every problem with the new Firewalls is ‘by design’ or ‘as intended’ and no recourse options are available. I’ve opened 6 or more tech support cases with SOPHOS as well, and spent nearly 100 hours on the phone with SOPHOS in the last 4 months. Most cases get closed with no resolution because SOPHOS can’t find the source of the issue.

 

These firewalls aren’t even half baked. Any firewall in this class is supposed to be designed such that the administrator of the network can set up the firewall in the manner that they need for their business. With their new Firewalls, not so. Sure you ‘can’ set things up the way you want, but SOPHOS builds in defects ‘by design’ to make your experience poor unless you use the firewall in the way they want you to. If you raise these issues with them, you will be dismissed and told to do it their way, or deal with the problems. It’s pretty dictatorial.

 

After I built the first of our 18 new Firewalls, I hired a SOPHOS consultant when we were forced to make the switch. The new Firewall is vastly different and I wanted to ensure the setup I did was a good baseline for all the other 17 firewalls I was deploying. I’ve had nothing but issues ever since. We’re stuck with SOPHOS now… at least for the foreseeable future, so I’m going to be using this time to look for another firewall to move to once our term is over. For the price… the lack of functionality is abhorrent, and the dismissiveness is really sad. SOPHOS used to be a great company and a great Firewall product. Now, if you want a flexible, configurable, and well-functioning firewall, this is no longer the product for you.

 

Issue # 1 - VPN Failover and WAN Interfaces

                IPSEC VPNs are commonplace. Lots of businesses use them and we are no different. We’ve been using IPSEC for years on the SG firewalls and they have been great. No issues at all. I could set them up however I wanted with lots of flexibility, and they were extremely reliable.

 

How it was:

On the old model firewall, you had a WAN Interface Group where you’d specify the primary internet as well as one or more backup internet connections. It would keep the backup internet connection(s) disabled until the primary failed. Then it would activate and fail over to the backup internet until the primary came back online, and then fail back. Basic stuff…

 

Also on the old model firewall, you’d create the VPN tunnel, designate the WAN Interface Group as the internet, and the VPN tunnel would establish itself over whatever internet connection was the active one in the WAN Interface Group at the time. You could also hard set the VPN tunnel to a specific WAN interface if you preferred. It was easy to use, very configurable, and it worked very reliably.

 

How it is now:

On the new firewall, it still has a WAN Interface Group, where you specify the primary internet as well as one or more backup internet connections, and it fails over and back just like the old firewall. However, now it keeps the backup internet connection(s) enabled and active, and uses quite a lot of data to check that they are online. If you have a cellular 5G backup internet (different provider\different medium methodology)… it’s going to eat up your data whether you like it or not, and if it’s pay-per-use, it will run up your bill. Even if you never fail over to it. I brought this up to SOPHOS. Answer: “it’s by design”, end of story. Very nice. So it’ll eat up data and run up charges forever. SOPHOS says pay the bills, or get a backup internet that does not have usage charges. “Do it our way or pay the penalty”.

 

Moving on to the VPN… even though there is a WAN Interface Group, which should be able to be used to float the VPN Tunnel to the active Internet connection, you can’t use it that way. I told them the old Firewall did this, and asked how to do the same… however, this is “by design”, end of story. So now we need to set up two or more VPN tunnels (one per internet connection). More time and more complicated than necessary, but it’s “by design” so it’s non-negotiable. Anyhow… after you create your multiple VPN connections (all identically set up, just with a different WAN interface), you then need to create a Failover Group, which is a new configuration that determines which VPN Tunnel is primary, and which is backup. The idea is that when the primary internet goes down in the WAN Interface Group and the backup internet becomes active, the Primary VPN Tunnel connection fails because it is hard set to the primary WAN connection, and the VPN Failover Group then brings up the backup VPN tunnel, which is hard set to the backup internet connection. All this complication for the same result, “by design”.

 

In the Failover VPN section there is a checkbox called “automatic failback”. That’s all it says, no description. It’s supposed to automatically fail back your VPN connection when the primary internet comes back online, however I learned it only tries once. So if your ISP is doing some maintenance overnight… or there are some unexpected brief interruptions on your Primary internet, it will try once (60 seconds later) to fail the VPN Tunnel back, and if that is not successful, it will leave your VPN tunnel connected over the backup internet permanently. Even if the Primary internet comes back online and is working correctly. So then, half a day later you notice that the VPN tunnel has been eating up your cellular data, or running up your pay-per-use data charges, and you need to manually fail it back. I opened a case with Mark Esiovwa from SOPHOS, and I bet you can guess what he said… “this is by design”. He “confirmed this from the GES team (highest level of support at Sophos)”.
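
To make the difference concrete, here is a minimal sketch of the failback behavior as I understand it, next to the behavior I expected. This is a toy model, not SOPHOS code: the function name `tunnel_path`, the `retry_failback` flag, and the idea of discrete "check steps" are all assumptions made up for illustration.

```python
from typing import List


def tunnel_path(primary_up: List[bool], retry_failback: bool) -> List[str]:
    """Return which link carries the VPN tunnel at each check step.

    primary_up[i] is True when the primary internet is usable at step i.
    retry_failback=False models a single failback attempt per failover
    (the behavior described above); True models retrying at every step.
    Hypothetical model only, not the firewall's actual logic.
    """
    path = []
    on_backup = False
    attempts_left = 0
    for up in primary_up:
        if not on_backup:
            if not up:
                # Primary failed: move the tunnel to the backup link.
                on_backup = True
                attempts_left = 1  # one failback attempt per failover
        elif retry_failback or attempts_left > 0:
            # Try to fail back; it only succeeds if the primary is up now.
            attempts_left = max(attempts_left - 1, 0)
            if up:
                on_backup = False
        path.append("backup" if on_backup else "primary")
    return path


# Primary drops for two steps, then recovers.
outage = [True, False, False, True, True]
print(tunnel_path(outage, retry_failback=False))
print(tunnel_path(outage, retry_failback=True))
```

In this model the single-attempt version uses up its one try while the primary is still down and then stays on the backup link, while the retrying version returns to the primary as soon as it is usable again.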

 

So SOPHOS has purposefully created a bug, which creates issues/costs/loss for customers who choose to use an IPSEC VPN Tunnel. They won’t fix the issue they purposefully created, because they want you to use the equipment you bought their way. This is evidenced by their next statement of “While this is by design for PBVPN, we do offer an interactive RBVPN. This allows for managing route criteria based on configured polices”. Simply put, you bought this firewall, it will allow you to use an IPSEC VPN, however, if you don’t do it our way… we will manufacture consequences and issues so that you will eventually comply out of exhaustion.

 

This is ONE scenario and there are many more like this coming. Stay tuned. If you want SOPHOS to tell you how to run your network, right down to the settings you choose, then they’ll be happy to do that and it will probably work. However if you would like to administer your network the way you want\need\choose, don’t walk, run from SOPHOS. You won’t be happy after you spend hundreds of thousands of dollars and days of phone calls, to be told over and over… “we designed it that way”.



This thread was automatically locked due to age.
  • Edited TAGs
    [edited by: emmosophos at 9:14 PM (GMT -7) on 28 Jun 2023]

    Hey Emmosophos. Why did you remove my tags? Are you trying to silence my post? This is a REAL situation, we are a REAL business, we REALLY spent over $100,000, we REALLY have had this awful experience, and these are the REAL crap answers we've been given. Put our tags back please.

  • Hello Steve,

    Thank you for your feedback; I am sorry to hear about your experience with the Sophos Firewall.

    We have connected you directly with our Product Management team through your Sales team and Account Manager. They’ll address your concerns, particularly regarding any Road Map or features improvements to be made to specific modules.

    We also have an escalation resource available on your recent support case with visibility to Support Management, and they would be happy to have a call with you to address any concerns you might have about Support.

    Regards,


     
    Emmanuel (EmmoSophos)
    Technical Team Lead, Global Community Support
  • I've talked to lots of people at SOPHOS. Many many people. It all ends up at the man behind the curtain, who says "it's as designed" and then it's case closed. However, I won't give up hope. People and businesses can change for the better. It is still possible.

    Regarding an escalation resource... I don't have any issues with the support people. They help as much as they can. The issue is the support person is told that the problem is a feature, and so the support person can't do anything at that point.

    Re-labelling problems as features is a clever way to get out of having to fix problems, however... people aren't fooled.

    That said... if they want to discuss the problem\feature issue, I'd be more than open to that. 

  • Pretty big text for pretty little problems like LTE traffic and WAN. But yes, I’m switching from UTM to XGS with intensive testing and can confirm:

    More complicated design and harder to administer. Some basic things are still missing or (it seems) not well tested (even I am finding „strange behaviors“ after just 2 weeks of testing).

    In the end there is only one sure way: Testing before buying. I never have any sales contact, they tell you what you wanna hear and in the end it is always your problem ;-)

    To be fair, the UTM is an „old“ product with a long update and bug history, the XGS is newer and I‘m sure it will be much better over time. But yes, I hope Sophos is looking back at some things in the UTM and not trying to „redesign“ everything (including things that were working perfectly and easy to administer on the UTM).

  • Pretty big text for pretty little problems like LTE traffic and WAN.

    You apparently didn't read it all because it costs customers a lot of money for a "feature" that makes no sense.  So yes, it should be big text for a complaint that is in a long line of "stupid things in XG".

    To be fair, the UTM is an „old“ product with a long update and bug history

    And exactly why people like it so much - it's been proven to work, it's easy to administer, but yes it's old.  However, for me XG simply isn't the answer to my solution.  I've recently had issues with it allowing internet traffic one day, then the very next day downloading the very same thing from the very same place blocks the traffic.  Dumb.  Makes no sense. (and I'm not talking about my NVIDIA post).

    Administration is a dumpster fire.  Sorry, but I'm not going to be nice about it anymore. Doing things multiple times to accomplish one thing is inherently time consuming and absurd. Sophos Assistant helps, yes, but overall - it irritates me to think about logging into XG.  I've given it months of existence in my network and it's time to go.  It's brought me to the point of being bitter, which I really hate because I want to help.  But voicing concerns and getting a response of "Duly noted, too bad.  That's how it works" really gets to you after you've been told that over and over.  Just like XG administration.

    OPNSense 64-bit | Intel Xeon 4-core v3 1225 3.20Ghz
    16GB Memory | 500GB SSD HDD | ATT Fiber 1GB
    (Former Sophos UTM Veteran, Former XG Rookie)

    I agree mostly with this as I find myself always reminded of how frustrated I am over doing the simplest things, such as creating static IP addresses from that darn DHCP lease table that closes and has to be reloaded every time you make a change to the DHCP Server settings. Mind boggling, then we are told "well, most businesses use other products for their DHCP server". This is not an excuse for a poorly designed product. 

    There's more, but yeah the DHCP lease table "bug" is the most mind-boggling so far. With the UTM it was a simple click of "make this a static IP" but they can't even do this in the XG yet.

    I could go on, but it's a toss up since some things are done better in the XG, like firewall and webfilter logs.

  • I agree, it's not all bad. But there are just so many things that are basics that you'd expect on any firewall, and they are just missed. And if the response was, "we are going to fix that" then we'd wait. But saying "no... this is how we designed it", is not a good response.

    I don't know if you've set up an XG in Active\Passive High Availability mode where you have a pair of them in case of failure. When you are setting them up...

    - with the SG, you could set up the first one, connect the second firewall, and the first one would detect the second one and absorb it into the HA Pair. 

    - with XG, they have that same feature, but it's only available on the virtual firewall version. If you buy the physical firewalls you have to go through the entire setup twice, then link them, and the second unit then gets wiped and synced from the first. So much more time spent for nothing.

  • So: I would actually design this with Route based VPN. Policy Based VPN (or how you say "UTM") is just a relic of the old world. It comes with a lot of conventional problems. 

    You can use Route Based VPN and only do the checkups with SD-WAN Routes. This would mean, "pings" would eat up your internet connection. 

    Failover Groups are the "old world". "Build up VPN, if the VPN 1 is down". Better would be: Build up a modern Route based VPN and only call the route, if you need it. 

    Policy Based VPN is something you are doing because UTM did it that way. You cannot use any newer technologies on it (like SD-WAN). 

    And maybe we should discuss the meaning of "bug". Simply because a product with 20+ Years development (and roughly 6-7 years Startup development) has a feature, which a product in a modern software development does not have - does not mean, it is a bug. It is something, which needs to be designed and maybe adjusted or implemented. 

    Astaro was a startup, they could build features at a fast pace and roll them out to their (smaller) installation base. SFOS and Sophos have grown larger and larger, meaning every feature needs to be tested in different implementations and scenarios. 
    Sophos is not telling you how to build your setup, but if you take the course to migrate, it might be a good idea to adapt to newer technologies. Why do Policy Based VPN? What are the advantages? Why not use Route based VPN + SD-WAN? You could do a Zero-Impact Failover with Route based VPN. You could do quality-based routing. You could do routing based on packet loss. Everything a UTM never could do. 
    I am from Germany and my daily business is to discuss and migrate such development - I can tell you, you should not close your view on the point of "That is my need", maybe discuss the "where do i want to be from a modern view of technology in 1-3 years". Zero-Trust is coming in rapid steps and customers are telling me "i am forcing them to do that", but in the end, the outcome is what matters, a "Simple" solution to get your network secure and modernized is the outcome without the downside of "user experience" which is amazing with a ZTNA product. But to get there - You have to do some refactoring. 

    I am sorry to hear, you are feeling not treated well by Sophos, but likely you can get in touch with your Sales Team to discuss the situation as well. 

    __________________________________________________________________________________________________________________

  • And i did the math about the WAN Link manager and keep alive.

    UTM does "turn off the interface", which means it is "down". You cannot use it anymore in this state. In SFOS, a Backup interface is always reachable and you can call it all the time, which means that even in the backup state you can use it for specific cases, if you want. 

    SFOS does a health check every 30 secs. 

    10:35:12.944741 PortB, OUT: IP 192.168.0.4 > 8.8.8.8: ICMP echo request, id 1, seq 1, length 192
    10:35:12.947065 PortB, IN: IP 8.8.8.8 > 192.168.0.4: ICMP echo reply, id 1, seq 1, length 76

    You have 192 + 76 bytes, which means 268 bytes x 2 per minute = 536 bytes. 536 x 60 = 32,160 bytes per hour. 32,160 x 24 = 771,840 bytes per day, or about 772 kilobytes per day. This means you are losing about 23 MB per month of your LTE plan because of the keepalive. 

    I found this to be acceptable for the situation to have the flexibility of knowing, the interface is online all the time. In UTM, you never knew, if the backup interface actually will come up or not. 
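
    If you want to re-run the numbers for a different interval, here is a small sketch of the same arithmetic. The packet sizes and the 30-second interval are taken from the capture above; the function name `keepalive_bytes` and the 30-day month are my own assumptions for illustration.

```python
# Sizes from the packet capture above (request length 192, reply length 76).
REQUEST_BYTES = 192
REPLY_BYTES = 76
CHECK_INTERVAL_S = 30  # health check interval stated above


def keepalive_bytes(days: int, interval_s: int = CHECK_INTERVAL_S) -> int:
    """Total health-check traffic in bytes over the given number of days."""
    checks_per_day = 24 * 60 * 60 // interval_s
    return (REQUEST_BYTES + REPLY_BYTES) * checks_per_day * days


per_day = keepalive_bytes(1)
per_month = keepalive_bytes(30)  # assuming a 30-day month
print(f"{per_day} bytes/day, {per_month / 1_000_000:.1f} MB per 30-day month")
# prints: 771840 bytes/day, 23.2 MB per 30-day month
```

    This only counts the bytes shown in the capture for a single health-check target; additional targets or other background traffic on the link would add to it.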

    __________________________________________________________________________________________________________________

  • I agree. 23MB/month is fine. However for the first 2 months I had a case open with SOPHOS where I spent MANY hours troubleshooting our UTM. It was using 12-15MB per DAY. Mysteriously it just stopped one day, and SOPHOS could never find the cause. Like most of my SFOS cases opened with SOPHOS. Typically they get closed with no solution.