
This discussion has been locked.

SOPHOS Purposefully Designs bugs into their Firewalls: Episode 1 - VPN Failover and WAN Interfaces

I’m documenting my numerous issues with SOPHOS Firewalls so that others can be aware of what they are getting themselves into.

Our Background:

My business is a long-time customer of SOPHOS Firewalls (more than 10 years). We have 18 Firewalls, multiple RED devices, and many Access Points. Up until this year SOPHOS was our absolute favorite Firewall for a lot of reasons. But that changed this year. SOPHOS has been developing their new SFOS operating system, and this year they told us that we could no longer purchase their SG line of firewalls, and that we must move to their new XGS line of firewalls. SOPHOS is retiring the SG line of firewalls, which were absolutely amazing. Their new XGS (SFOS) firewalls, however, are complete trash if you have any interest in being able to configure the firewall the way you desire.

I was promised the new Firewalls were great, and up to now, SOPHOS Support has been great, so we sunk well over $100,000 into a complete overhaul of our environment and SOPHOS has done nothing but treat us like garbage ever since. They have dismissed all the issues we’ve had. Every problem with the new Firewalls is ‘by design’ or ‘as intended’ and no recourse options are available. I’ve opened 6 or more tech support cases with SOPHOS as well, and spent nearly 100 hours on the phone with SOPHOS in the last 4 months. Most cases get closed with no resolution because SOPHOS can’t find the source of the issue.

These firewalls aren’t even half baked. Any firewall in this class is supposed to be designed such that the administrator of the network can set up the firewall in the manner that they need for their business. With their new Firewalls, not so. Sure, you ‘can’ set things up the way you want, but SOPHOS builds in defects ‘by design’ to make your experience poor unless you use the firewall the way they want you to. If you raise these issues with them, you will be dismissed and told to do it their way, or deal with the problems. It’s pretty dictatorial.

After I built the first of our 18 new Firewalls, I hired a SOPHOS consultant when we were forced to make the switch. The new Firewall is vastly different and I wanted to ensure the setup I did was a good baseline for all the other 17 firewalls I was deploying. I’ve had nothing but issues ever since. We’re stuck with SOPHOS now… at least for the foreseeable future, so I’m going to be using this time to look for another firewall to move to once our term is over. For the price… the lack of functionality is abhorrent, and the dismissiveness is really sad. SOPHOS used to be a great company and a great Firewall product. Now, if you want a flexible, configurable, and well-functioning firewall, this is no longer the product for you.

Issue # 1 - VPN Failover and WAN Interfaces

IPSEC VPNs are commonplace. Lots of businesses use them and we are no different. We’ve been using IPSEC for years on the SG firewalls and they have been great. No issues at all. I could set them up however I wanted with lots of flexibility, and they were extremely reliable.

How it was:

On the old model firewall, you had a WAN Interface Group where you’d specify the primary internet as well as one or more backup internet connections. It would keep the backup internet connection(s) disabled until the primary failed. Then it would activate and fail over to the backup internet until the primary came back online, and then fail back. Basic stuff…

Also on the old model firewall, you’d create the VPN tunnel, designate the WAN Interface Group as the internet, and the VPN tunnel would establish itself over whatever internet connection was the active one in the WAN Interface Group at the time. You could also hard set the VPN tunnel to a specific WAN interface if you preferred. It was easy to use, very configurable, and it worked very reliably.

How it is now:

The new firewall still has a WAN Interface Group, where you specify the primary internet as well as one or more backup internet connections, and it fails over and back just like the old firewall. However, it now keeps the backup internet connection(s) enabled and active, and uses quite a lot of data to check that they are online. If you have a cellular 5G backup internet (different provider, different medium)… it’s going to eat up your data whether you like it or not, and if it’s pay-per-use, it will run up your bill. Even if you never fail over to it. I brought this up to SOPHOS. The answer: “it’s by design”, end of story. Very nice. So it’ll eat up data and run up charges forever. SOPHOS says pay the bills, or get a backup internet that does not have usage charges. “Do it our way or pay the penalty”.

Moving on to the VPN… even though there is a WAN Interface Group, which should be able to be used to float the VPN Tunnel to the active Internet connection, you can’t use it that way. I told them the old Firewall did this, and asked how to do the same… however, this is “by design”, end of story. So now we need to set up two or more VPN tunnels (one per internet connection). More time and more complication than necessary, but it’s “by design” so it’s non-negotiable. Anyhow… after you create your multiple VPN connections (all set up identically, with only the WAN interface different), you then need to create a Failover Group, which is a new configuration that determines which VPN Tunnel is primary and which is backup. This is all so that when the primary internet goes down in the WAN Interface Group and the backup internet becomes active, the Primary VPN Tunnel connection will fail, as it is hard set to the primary WAN connection, and the VPN Failover Group will then bring up the backup VPN tunnel, which is hard set to the backup internet connection. All this complication for the same result, “by design”.
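
To make the moving parts concrete, here is a minimal sketch of the failover step only, written as a toy model in Python. It is not SFOS code or configuration; the `Tunnel` class, the `active_tunnel` function, and the interface names `wan1`/`wan2` are hypothetical, and the sketch assumes the group simply picks the first tunnel, in priority order, whose pinned WAN link is up. It deliberately ignores failback timing.

```python
from dataclasses import dataclass


@dataclass
class Tunnel:
    name: str
    wan_interface: str  # each tunnel is hard set to exactly one WAN interface


def active_tunnel(failover_group, wan_up):
    """Return the first tunnel, in priority order, whose WAN link is up."""
    for tunnel in failover_group:
        if wan_up.get(tunnel.wan_interface, False):
            return tunnel
    return None  # no usable WAN link, so no tunnel can come up


# Two otherwise identical tunnels, one per internet connection.
group = [Tunnel("vpn-primary", "wan1"), Tunnel("vpn-backup", "wan2")]

print(active_tunnel(group, {"wan1": True, "wan2": True}).name)
print(active_tunnel(group, {"wan1": False, "wan2": True}).name)
```

The point of the sketch is that the tunnel no longer floats across the WAN Interface Group; the ordering of the group is what decides which pinned tunnel carries traffic.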

In the Failover VPN section there is a checkbox called “automatic failback”. That’s all it says, no description. It’s supposed to automatically fail back your VPN connection when the primary internet comes back online; however, I learned it only tries once. So if your ISP is doing some maintenance overnight… or there are some unexpected brief interruptions on your Primary internet, it will try once (60 seconds later) to fail the VPN Tunnel back, and if that is not successful, it will leave your VPN tunnel connected over the backup internet permanently. Even if the Primary internet comes back online and is working correctly. So then, half a day later, you notice that the VPN tunnel has been eating up your cellular data, or running up your pay-per-use data charges, and you need to manually fail it back. I opened a case with Mark Esiovwa from SOPHOS, and I bet you can guess what he said… “this is by design”. He “confirmed this from the GES team (highest level of support at Sophos)”.
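
The single-attempt behaviour described above can be sketched as a small simulation. This is a toy model, not SFOS code: the function name `simulate_single_failback` is hypothetical, time is counted in one-second ticks, and it assumes the one attempt happens `failback_delay` seconds after the failover (one possible reading of "60 seconds later").

```python
def simulate_single_failback(primary_up, failback_delay=60):
    """Return the active tunnel ("primary" or "backup") for each one-second tick.

    primary_up: sequence of booleans, True while the primary WAN link is up.
    failback_delay: seconds after failover at which the single attempt is made
    (an assumption, see the text above).
    """
    active = "primary"
    attempt_at = None  # tick of the pending failback attempt, if any
    timeline = []
    for tick, up in enumerate(primary_up):
        if active == "primary":
            if not up:
                # Primary link lost: fail over and schedule exactly one attempt.
                active = "backup"
                attempt_at = tick + failback_delay
        elif attempt_at is not None and tick >= attempt_at:
            if up:
                active = "primary"
            # Success or not, no further attempt is scheduled.
            attempt_at = None
        timeline.append(active)
    return timeline


short_outage = [True] * 10 + [False] * 30 + [True] * 160   # 30 s outage
long_outage = [True] * 10 + [False] * 120 + [True] * 70    # 120 s outage

print(simulate_single_failback(short_outage)[-1])  # primary
print(simulate_single_failback(long_outage)[-1])   # backup
```

In this model, an outage shorter than the delay fails back cleanly, while any outage that is still in progress when the one attempt fires leaves the tunnel on the backup link until someone fails it back by hand.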

So SOPHOS has purposefully created a bug, which creates issues/costs/loss for customers who choose to use an IPSEC VPN Tunnel. They won’t fix the issue they purposefully created, because they want you to use the equipment you bought their way. This is evidenced by their next statement of “While this is by design for PBVPN, we do offer an interactive RBVPN. This allows for managing route criteria based on configured polices”. Simply put, you bought this firewall, it will allow you to use an IPSEC VPN, however, if you don’t do it our way… we will manufacture consequences and issues so that you will eventually comply out of exhaustion.

This is ONE scenario and there are many more like this coming. Stay tuned. If you want SOPHOS to tell you how to run your network, right down to the settings you choose, then they’ll be happy to do that and it will probably work. However if you would like to administer your network the way you want\need\choose, don’t walk, run from SOPHOS. You won’t be happy after you spend hundreds of thousands of dollars and days of phone calls, to be told over and over… “we designed it that way”.



This thread was automatically locked due to age.
  • You're basically saying that SOPHOS implemented a feature... and it does not work well because of SOPHOS' design choices, so use this instead. How about SOPHOS designs all their features well, and allows the administrator to choose how they operate their firewall?

    I don't disagree SFOS has good new features, and I'm not closing my view on SD-WAN or other VPN methods. I needed to do a major hardware uplift due to being forced onto SFOS, so that ate up half my year. Now I have other projects that outweigh SD-WAN and VPN at the moment. And the features that are offered on SFOS, that I'm presently using, are poorly implemented.

  • Um, you accuse them of building bugs into their firewalls on purpose. You lose all credibility with that. Sure you're annoyed. Sure things used to work the way you were comfortable and you simply don't want to wrap your head around a different approach. But that is not the same as "purposefully designs bugs into their firewalls."

    You've gotten well-reasoned answers from LuCar Toni and others.

  • I would assume that doing a migration means you are also checking your "options" in the new product. Because every IT Admin knows the fact "If you are not doing it NOW, you will never change it again". That is the situation most admins are dealing with. Something like "I will do it like that now, and change it later" will nearly always result in "you never change it again". Therefore, when doing a migration, you should have sorted out what your options are in the new product.

    Because now it is the same situation: you are stuck with the old technologies and, as you said, you don't have the time to change it anymore. That is a frustrating situation, which could have been dealt with by checking the possibilities (together with a partner) while doing the migration.

    I am saying: IPsec Failover Groups are a relic of a bygone age. Even the premise is not good: it is the same as "UTM backup interfaces". Why would you "disable" the interface and hope it comes online when you need it? The same goes for an IPsec failover group. But IPsec failover groups have the problem of policy based VPN: you should not have the same SA (SPIs) enabled at the same time. Therefore there is a feature built to "fail over" to another IPsec tunnel in case one is failing.

    But you could work around this completely and implement it better by using modern technologies. Route based VPN is an always on solution. It is enabled and it is callable. This means you can route whenever you like. You can even do a "Zero impact failover" with route based VPN: in case of failure, you can route an existing connection to another XFRM interface without killing the existing session. True zero impact failover. The user does not notice this. And there is a built-in failback to the old connection as soon as it comes back online.
    --> That is possible with SFOS and not UTM. But customers often opt in to the old stuff and never touch it again, which is a bummer.


  • It's true, I did write that... but it's more of a deduction than an accusation. LuCar posted a good description of a bug. "flaw or fault in the design".

    My deduction:

    I find a flaw in the design -> communicate with SOPHOS -> SOPHOS says it was designed that way =  flaw created by design

                 - Effectively, it was their response that confirmed the flaw was intentional. This has been the response on multiple occasions.

    If it had been this way:

    I find a flaw in the design -> communicate with SOPHOS -> was told it would be fixed =  unintentional design flaw now being corrected

                - This would make a whole different story. If it was this way, I would not even be posting my experiences.

    If SOPHOS would be more committed and motivated to fix these flaws that customers bring up, and deal with them in a prompt manner(promptly is key, no one wants to wait 4 years), it would change the situation hugely.

  • I wasn't intending to do a migration. I intended to renew my existing firewall licenses. I also did review the new product and the product options, and it did everything I needed it to on paper. So I decided to stay with SOPHOS, since I've been happy with SOPHOS up to now. Unfortunately, the implementations of the features that SFOS offers have flaws, and were designed that way (per responses from SOPHOS).

    It's not that I don't have the time to change things. I'm just not going to invest the time now. Given the experiences I've had with SFOS, I can only imagine the new problems I'll uncover by implementing the things you mention above. I don't have time to uncover all the new problems, and there will be problems. Other projects need tending to first.

  • At this point, I cannot comment any further on your points, as I do not see them as huge problems. I would build it differently from the foundation and have a stronger feature set.

    What I do not understand: how did you do it in UTM? Did you use the old "hack" with availability groups in UTM? This one: "Sophos UTM multiple S2S IPsec VPN mit Failover – Tutorial (DE)"? Because this is the same "flawed" design of doing IPsec, which bugged me for ages. I found it "interesting" to hope the IPsec tunnel will come up eventually. And what do you do if the old session comes up again?


  • Fair enough. That's the SOPHOS line. "As designed", "not huge problems", dismiss the customer. At least it's consistent.

    I wasn't aware that the UTM feature called "Availability Groups" was a hack. I had that functioning for many years in many locations and it worked very well.

    I can't ignore the irony that someone calls a feature with a flaw a bug, and it's not good, but then we call a feature that worked very well "hack" and no issue there. 

  • Essentially you are stating every feature is a bug in SFOS, just because it does not behave like you expect it to. But there are not many people mentioning the IPsec failover groups. The hack with the availability groups was in place for decades, but it is not a feature; instead, somebody in the community discovered that it worked like that, and people started to adopt this config.

    A feature is essentially something the vendor builds and promotes; a hack is something which was never intended to work that way, but does.

    But anyway, we can now discuss this for ages - You have your contact persons, to discuss your options. 


  • Essentially you are stating every feature is a bug in SFOS, just because it does not behave like you expect it to

    I support your freedom to say this... but it does not make it true. And it's not true. A bug and a design flaw are synonymous, just like your wiki post states. Take my issue where the failback only tries once and then gives up. I have first-party experience that this is an issue for us. This "failback trying only once" was not a mandatory design choice. It could have been set to "try indefinitely", or "try 3 times", or "try once per hour", or even a box where I can input how many tries it makes until it gives up. The design choice that was made had unintended consequences which affected us (the customer), and SOPHOS should be humble enough to say "Hey, we designed it this way, however, we see that there are some issues with this" and then be open to fixing it. You don't need to toe the company line so hard that you make broad sweeping generalizations like "you are stating every feature is a bug in SFOS" and "every IT Admin knows the fact", which are immediately incorrect.
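
    The alternatives listed above ("try indefinitely", "try 3 times", "try once per hour", or a user-supplied count) could be modelled roughly as follows. This is a toy sketch of a hypothetical retry policy, not an existing SFOS setting; the function name `simulate_failback_policy` and the parameters `retry_interval` and `max_attempts` are made up for illustration, and time is counted in one-second ticks.

```python
def simulate_failback_policy(primary_up, retry_interval=60, max_attempts=1):
    """Return the active tunnel ("primary" or "backup") for each one-second tick.

    retry_interval: seconds between failback attempts (hypothetical knob).
    max_attempts: attempts before giving up; None means try indefinitely.
    """
    active = "primary"
    next_attempt = None  # tick of the next scheduled failback attempt
    attempts_used = 0
    timeline = []
    for tick, up in enumerate(primary_up):
        if active == "primary":
            if not up:
                # Fail over and start a fresh series of failback attempts.
                active = "backup"
                next_attempt = tick + retry_interval
                attempts_used = 0
        elif next_attempt is not None and tick >= next_attempt:
            attempts_used += 1
            if up:
                active = "primary"
                next_attempt = None
            elif max_attempts is not None and attempts_used >= max_attempts:
                next_attempt = None  # give up; manual failback needed
            else:
                next_attempt = tick + retry_interval
        timeline.append(active)
    return timeline


# Primary link is down for 120 s, then healthy again.
outage = [True] * 10 + [False] * 120 + [True] * 170

for label, attempts in [("once", 1), ("three times", 3), ("indefinitely", None)]:
    print(label, simulate_failback_policy(outage, max_attempts=attempts)[-1])
```

    With a 120 second outage, the "once" policy ends on the backup link in this model, while the "three times" and "indefinitely" policies both end back on the primary. "Try once per hour" would correspond to `retry_interval=3600` with `max_attempts=None`.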

    I am a customer who put a lot of real money into a product, and I have some legitimate concerns. It's true that I have a laundry list of issues, and I have not even posted them all. It's true I posted these articles in a way to garner some attention; that was the point, and it has been effective. The reason I posted this way (somewhat noisily) was that most of my issues were not being given the attention they deserved.

    As an employee of SOPHOS perhaps take a more professional approach and de-escalate a bit, rather than calling out the customer so much. Obviously I wouldn't be here if there wasn't an issue. Whether you like "IPsec Failover Groups" or not, they are a feature of your product, not a hack, and I do have a legitimate issue with it. If I (the customer) want to use a feature that your product offers, or not, I should be able to. Your pride should not prevent you from saying that I have a point here, and perhaps even that it should be addressed.

  • At this point, I cannot help you anymore. You are not willing to address anything I am stating. But you will likely have your channel to discuss this further.

    And again: I am not a product manager or developer. I am not involved in making such decisions, but I can show you how most customers take a modern approach. If you do not want to do this, it is up to you to do it your way. You can stay with the old way or adapt to modern technologies. In the end it is your network.
