Since upgrading all our customers to 9.601, a large share of them have been complaining about REDs disconnecting and reconnecting with no discernible pattern.
It started for all of them the same night we upgraded to 9.601, and they are all on different ISPs and located in different places around the country.
I spent 2 hours with Sophos support today, and they have now escalated the case.
Will return with an update....
Suspicious entries in the log - but all connected REDs do this before connection:
2019:03:06-15:15:38 fw01-2 red_server: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems
2019:03:06-15:15:46 fw01-2 red2ctl: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx
I know the last line is written before the tunnel disconnects, because there was no "PING/PONG" answer...
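To see whether the missing-keepalive lines really arrive without a pattern, the log can be scanned for them. A minimal Python sketch, assuming the timestamp layout shown in the two log lines above (`YYYY:MM:DD-HH:MM:SS`); the function name and the sample input are only illustrative:

```python
import re
from datetime import datetime

# Matches lines such as:
# 2019:03:06-15:15:46 fw01-2 red2ctl: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx
KEEPALIVE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) \S+ red2ctl: "
    r"Missing keepalive from (?P<iface>[^,\s]+), disabling peer (?P<peer>\S+)"
)

def missing_keepalives(lines):
    """Return (timestamp, interface, peer) for each missing-keepalive line."""
    events = []
    for line in lines:
        m = KEEPALIVE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y:%m:%d-%H:%M:%S")
            events.append((ts, m.group("iface"), m.group("peer")))
    return events

# Sample input copied from the two log lines quoted above.
sample = [
    "2019:03:06-15:15:38 fw01-2 red_server: SELF: Cannot do SSL handshake on socket "
    "accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems",
    "2019:03:06-15:15:46 fw01-2 red2ctl: Missing keepalive from reds3:0, "
    "disabling peer xxx.xxx.xxx.xxx",
]
events = missing_keepalives(sample)
for ts, iface, peer in events:
    print(ts.isoformat(), iface, peer)
```

Comparing the gaps between the returned timestamps for each interface should show whether the drops are random or follow something like a daily ISP reconnect.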
One customer has 2 x RED 50: one is 100% stable and the other drops at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.
Same issues here after the 9.601-5 UTM update. 2x RED50 Rev 1. They drop on multiple ISPs at varying intervals and for varying lengths of time. I was advised to re-create the RED in the UTM. I did this, but the problems persist. I was sent two replacement RED50s. The first one has been swapped in and a new config created, but the problem persists. The ISPs' modems have been replaced, although the ISPs were reluctant to do so. One of the REDs won't recognize the presence of the ISP on WAN1 at all.
We are losing a lot of productivity and business. We do a sizeable portion of our business via teleconferencing.
The tech alluded to a potential issue with REDs after the update to 9.6.01-5.
My problem is resolved. There is a known issue related to unified firmware.
As root (su -):
cc get red use_unified_firmware
If the value returned is 1:
cc set red use_unified_firmware 0
The REDs will update and reboot.
Confirm the value is 0 by rerunning the get command above.
NOT A PERMANENT FIX. The issue needs to be addressed in Sophos UTM firmware permanently.
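The check-then-set steps above can be sketched as a small script. This is only a sketch under assumptions: it assumes `cc get red use_unified_firmware` prints the value as a bare `0` or `1` (the exact output format is not shown in this thread), and the `run` parameter is a hypothetical hook so the logic can be tried without touching a UTM.

```python
import subprocess

# The two commands exactly as posted above.
GET_CMD = ["cc", "get", "red", "use_unified_firmware"]
SET_CMD = ["cc", "set", "red", "use_unified_firmware", "0"]

def run_cc(cmd):
    """Run a command in the UTM root shell and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def disable_unified_firmware(run=run_cc):
    """Turn unified firmware off if it is on; return True if a change was made."""
    # Assumption: the get command prints the value as a bare 0 or 1.
    if run(GET_CMD).strip() != "1":
        return False
    run(SET_CMD)  # per the post above, the REDs then update and reboot
    # Confirm the value is now 0 by rerunning the get command.
    if run(GET_CMD).strip() != "0":
        raise RuntimeError("use_unified_firmware is still not 0")
    return True
```

As with the manual steps, this is a workaround and not a permanent fix.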
this also worked for me ...
Today the tunnel of one RED was down again (with use_unified_firmware = 0). It was the one that had been problematic after updating to 9.601-5.
Disabling and re-enabling the RED did not fix the issue. After switching back to 'unified firmware', all REDs are UP again (use_unified_firmware = 1).
I think it has to be something different, as the issue still appears after one day (with or without unified firmware). ...
However, it is related to version 9.601, as I did not have issues before the update...
Hello somi and welcome to the UTM Community!
Based on what others have said above, I would push Sophos Support to RMA the failing RED.
Cheers - Bob
Thanks, we already replaced that RED with a brand new RED 15 - same thing.
...it seems to be a timing issue... When I disable the RED for 5 minutes on the UTM, it works fine shortly after re-enabling it.
No 'stabelizing peers' and such in the log - just 'tunnel -up' -> PING PONG, PING PONG
The tunnel is stable until the DSL line does its 24h reconnect. This brings it out of 'sync'...
From then on, until I disable and re-enable the RED, I can see -> boot -> stabelizing peers -> 5 ICMP packets go through the tunnel (yay!) -> unstable peers -> reboot (oh no!)... over and over again ...
Sorry for the delay, been on vacation for a week - my nerves could not stand it anymore ;-)
I also did the "cc set red use_unified_firmware 0" before I left, and can confirm it solved ALL MY ISSUES.
Had one customer with two RED 50s: one was very unstable and the other was completely offline. We set up temporary SG115s with IPsec just to keep the customer running.
After I disabled the new unified firmware, both RED 50s are back and 100% stable!
Sophos Support claims that there are no issues with this, but please keep referring to this community thread so they can see that there actually are problems.
I have enabled RED debugging with support and inserted a USB key for debug logging into the RED 50, but nothing important was shown.
We have the unified firmware enabled for several other customers who have no issues with it, so it's odd. I think it looks like some TTL or IPS issue with the different ISPs.
Some other issues have been located, and it seems like Sophos is looking into it:
I have had the same issues described.
I was issued an RMA and got a new RED50 box.
It bricked again within 12 hours.
I am trying to connect a RED 15 to our UTM with 9.6.01 firmware (SG115) for the first time and get the same error:
SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems.
Do you know when Sophos will fix this and release new firmware for UTM and RED?
Did you try James Stoy's solution above - switch to MTU1400 and disable/enable the RED? If that didn't work Sophos may want to RMA the 15.
Yes, I have now tried it out with MTU 1400.
It's a different error in the log now, but it still does not connect:
2019:04:26-08:27:33 fw red_server: SELF: (Re-)loading device configurations
2019:04:26-08:27:36 fw red2ctl: Overflow happened on reds1:0
2019:04:26-08:27:36 fw red2ctl: Missing keepalive from reds1:0, disabling peer xxx.xxx.xxx.xxx
2019:04:26-08:28:41 fw red_server: SELF: New connection from xxx.xxx.xxx.xxx with ID XXXXXXXXXXXXXX (cipher AES256-GCM-SHA384), rev1
<30>Apr 26 08:28:41 red_server: XXXXXXXXXXXXXX: connected OK, pushing config
2019:04:26-08:28:45 fw red_server: XXXXXXXXXXXXXX: command 'CON_CLOSE reason=fallback_config'
2019:04:26-08:28:45 fw red_server: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="XXXXXXXXXXXXXX" forced="1"
2019:04:26-08:28:45 fw red_server: XXXXXXXXXXXXXX is disconnected.
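The structured `id="4202" ...` entry in the log above carries its details as key="value" pairs, which makes the tunnel-down events easy to pick out of a long log. A minimal Python sketch, assuming every field of interest is written as `key="value"` the way it is in that entry; the function name is illustrative:

```python
import re

# One key="value" pair, e.g. name="RED Tunnel Down"
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_event(line):
    """Return the key="value" pairs of a log line as a dict."""
    return dict(FIELD_RE.findall(line))

# The tunnel-down entry from the log above.
line = ('2019:04:26-08:28:45 fw red_server: id="4202" severity="info" '
        'sys="System" sub="RED" name="RED Tunnel Down" '
        'red_id="XXXXXXXXXXXXXX" forced="1"')
event = parse_event(line)
if event.get("name") == "RED Tunnel Down":
    print(event["red_id"], "forced =", event["forced"])
```

Lines without any key="value" pairs simply yield an empty dict, so the same function can be run over the whole log.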
I have an open ticket with the German Sophos premium support, but still no solution.
I need to get it running for a home office start in May.