DO NOT INSTALL 9.703-2!!!
My lab system was Up2Dated to 9.703-2 Thursday evening at 10 PM CDT (UTC -0500) and all connection with the outside world immediately stopped. My local connection would work normally for a few minutes at a time and then everything would lock up for a few minutes. I could not identify the problem with top, but did see a lot of zombie confd processes. I lost the entire day of Friday because my wife has a big project due next week and was working via Microsoft Teams all day with her colleagues.
I will suggest to Sophos that the file be removed from the ftp site. Grumble.
Cheers - Bob
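For anyone who wants to check for the zombie processes mentioned above, here is a minimal Python sketch. It assumes shell access to a Linux box with a procps-style `ps`; the function name `count_zombies` is made up for illustration and is not a Sophos tool. It only counts processes whose state starts with Z, grouped by command name, and does not diagnose why they appear.

```python
import subprocess
from collections import Counter


def count_zombies(ps_output: str) -> Counter:
    """Count zombie processes per command name.

    Expects the text produced by `ps -e -o stat= -o comm=`:
    one process per line, state flags first, then the command name.
    """
    zombies = Counter()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        stat, comm = parts
        if stat.startswith("Z"):  # Z = zombie (defunct) process
            # Some ps versions append "<defunct>" to the name; drop it.
            zombies[comm.replace("<defunct>", "").strip()] += 1
    return zombies


if __name__ == "__main__":
    # Separate -o options avoid ambiguity in how "name=" headers are parsed.
    out = subprocess.run(
        ["ps", "-e", "-o", "stat=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    for name, n in count_zombies(out).most_common():
        print(f"{n:5d}  {name}")
```

Running it a few times a minute apart would show whether the count for a given command keeps growing.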
A new v9.703 update is currently being tested, and is targeted for release during the week of Apr 20. Customers running either v9.702 or the previous version of v9.703 will be able to upgrade to this new version.
The Advisory KBA has been updated to provide more information regarding this incident:
This is the first time I feel I have been informed about this matter in a timely manner and directly by a Sophos source. I appreciate that very much.
Kind regards from Germany,
New Vision GmbH, Germany, Sophos Silver-Partner
I agree it is novel that we are actually told something constructive from Sophos.
But is this premature? Are we looking at another RED issue that was not fixed for six months (even though they 'said' they had fixed it twice)? Apologies, the cynic in me came out ...
I do understand that they do a lot of work in the background, but the UTM has had little or no information about updates, new builds, EAP or roadmaps. Will this change now?
Although this should not have happened in the first place!
Will I get some sort of recompense for having to go into work (on Bank Holiday Monday) to fix the issue (by rebuilding the firewall from scratch)?
I always try to look on the positive side, but the issues with the UTM and associated equipment have been unhelpful.
XG & UTM Architect (Systems: XG v18 & UTM 9.7 - Virtual, HW & SW)
Curious enough to take it apart, skilled enough to put it back together, clever enough to hide the extra parts when I'm done!
First of all, I would like to sincerely apologize to anyone who was affected by this issue. There was a gap in our testing for this 9.703 release, and the problem should've never made it into the field. Additionally we should've reacted more quickly when some of you reported this issue on the forums.
As you can see from the KBA that was posted, we have done a detailed analysis on what went wrong (not just the bug, but how we reacted & our testing process), and have/will put improvements/additional safeguards in place to ensure something like this doesn't happen again. We will learn from this, and continue to improve.
As for whether this is another RED50 issue: Fundamentally this issue and the RED50 problem are very different. The RED50 issue was related to the hardware (specifically how the driver reacts when the underlying flash storage degrades - which is normal for flash drives), and it took us a long time to reproduce the problem even after getting failed units shipped back to Engineering for analysis (when a flash drive will degrade is nondeterministic). Once we were able to reproduce/see the problem, we moved quickly to address it (the delay wasn't due to a lack of focus/trying). This 9.703 issue does not have any hardware component, and we were able to reproduce, isolate & understand the problem quickly, so we can confidently say the new 9.703 update that was re-released today has the problem addressed.
Again I would like to apologize to all who were affected by this issue. We should be better, and we will be. I also would like to thank BAlfson for reporting this issue first, and working with us on the solution.
Based on comments here in the Community and my persistence, Sophos actually removed 9.703-2 from the Up2Date servers even though they had yet to reproduce the problem in their labs. Sophos deserves a pat-on-the-back for breaking their own rules in dealing with this issue.
9.703-3 works great!
On one box that I manage that was running 9.702-1, I set Up2Date to manual when I first saw this problem in April. I didn't get back to it until 5/30, when I noticed that a 9.702-1 to 9.703-3 update was available. After applying the update I've had 100% CPU spikes every 3 hours exactly, with memory usage correspondingly high and erratic. Today I turned off web filtering and set both Up2Date settings to manual. This seems to have stopped the 1 hour CPU spikes, but this was not a problem with the previous firmware.
Has anything like this been reported elsewhere? I've not been able to find anything else in the box that may cause this.
Please let us know what Sophos Support has to say about this.
I'll follow up with you via PM to get your support case number to investigate further.
To update the Community regarding TomMorgan's CPU issue:
His client's issue was related to older hardware that unfortunately had certain settings mis-configured. This caused the CPU spike, and the problem was resolved after the device was factory reset and reconfigured properly.
I've installed the 9.703-3 update on a 9.702 system and I'm having the same issue as with the original faulty 9.703 update that was pulled. I can't access the admin webpage, not a lot of traffic is being passed, etc. I have rebooted and switched back to the slave, and it's still no good.
Hi Sophos User278
I would advise you to Open a Support Case for further investigation and PM me the case number.
Community Support Engineer | Sophos Technical Support
The issue is resolved now. It was an MTU issue disrupting OSPF, which for some reason has changed, but either way both nodes are updated and working fine again. The update is fine.
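For others who run into something similar: OSPF neighbours advertise their interface MTU in Database Description packets, and a mismatch can keep an adjacency from forming. Below is a rough Python sketch for comparing MTUs on both ends of each link. The helper names and the sample interface values are hypothetical, and `read_local_mtus` assumes a Linux system that exposes `/sys/class/net/<iface>/mtu`; it is a quick sanity check, not a Sophos procedure.

```python
from pathlib import Path


def read_local_mtus(sys_net: str = "/sys/class/net") -> dict:
    """Return {interface_name: mtu} for the local Linux host."""
    mtus = {}
    base = Path(sys_net)
    if not base.is_dir():
        return mtus  # not a Linux-style sysfs layout
    for iface in base.iterdir():
        mtu_file = iface / "mtu"
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


def mtu_mismatches(links):
    """Return the links whose two ends report different MTUs.

    `links` is an iterable of (link_name, mtu_side_a, mtu_side_b).
    """
    return [(name, a, b) for name, a, b in links if a != b]


if __name__ == "__main__":
    # Hypothetical values collected by hand from both OSPF neighbours.
    links = [
        ("eth1 <-> core-router", 1500, 1500),
        ("eth2 <-> branch-router", 1500, 1400),
    ]
    for name, a, b in mtu_mismatches(links):
        print(f"MTU mismatch on {name}: {a} vs {b}")
```

Comparing the values before and after an update would show whether an MTU changed along the way.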