
UTM 9.601 - RED issues!

Since upgrading all our customers to 9.601, a large share of them have been complaining about REDs disconnecting and reconnecting with no discernible pattern.

It started for all of them on the very night we upgraded to 9.601, and they are all on different ISPs and located in different places around the country.

I spent 2 hours with Sophos support today, and they have now escalated the case.

Will return with an update....

Suspicious entries in the log - but all connected REDs do this before connection:

2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems

2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx

I know the last line is written before the tunnel disconnects, because there was no "PING/PONG" answer...

One customer has 2 x RED 50: one is 100% stable and the other fluctuates at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.
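
For anyone trying to spot a pattern across many customers, the two messages quoted above can be tallied per peer. This is only a rough sketch: the regular expressions are derived from the two log lines in this post, the function name `summarize_red_log` is made up for illustration, and other firmware versions may word the messages differently.

```python
import re
from collections import Counter

# Patterns derived from the two log lines quoted in this thread (assumption:
# the wording is the same on your firmware version).
HANDSHAKE_RE = re.compile(
    r"red_server\[\d+\]: SELF: Cannot do SSL handshake on socket accept from '([^']+)'"
)
KEEPALIVE_RE = re.compile(
    r"red2ctl\[\d+\]: Missing keepalive from (\S+), disabling peer\s+(\S+)"
)


def summarize_red_log(lines):
    """Count handshake failures per source and missed keepalives per (interface, peer)."""
    handshake = Counter()
    keepalive = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            handshake[match.group(1)] += 1
            continue
        match = KEEPALIVE_RE.search(line)
        if match:
            keepalive[(match.group(1), match.group(2))] += 1
    return handshake, keepalive


# The two sample lines from this post, with addresses masked as in the original.
SAMPLE = [
    "2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake "
    "on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because "
    "of handshake problems",
    "2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, "
    "disabling peer xxx.xxx.xxx.xxx",
]

if __name__ == "__main__":
    failures, drops = summarize_red_log(SAMPLE)
    print("handshake failures:", dict(failures))
    print("missed keepalives:", dict(drops))
```

Feeding it the lines of an exported RED log (for example via `open(path)`) gives a quick per-peer count, which makes it easier to see whether one RED is flapping far more often than the others.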



This thread was automatically locked due to age.
  • Hi everyone,

     

    We have exactly the same behaviour with one of our RED50s; it started right after the update to 9.601-5. Strangely, the disconnects only happened during office hours; apparently, without much traffic the RED was stable...
    We found a different and hopefully supported workaround, although I do not understand why it works.

     

    Someone told us that sometimes the RED does not get the new firmware correctly, and you would have to force a firmware update. Either by changing some settings on the RED, forcing a reboot, or - his suggestion - changing the MTU on the RED interface to 1400 until the firmware is downloaded.

    I did that - our RED has two interfaces in use with different VLAN configurations (one for data, one for the VoIP network). I set the MTU on the data interface to 1400, and there were no more disconnects. I thought I could now go back to the default, so I switched the MTU back, and the disconnects started again. I then tried different MTU values down to 1450, without success. We have now been at 1400 for a week, without a single disconnect.

     

    Maybe this is not related to the RED bug at all, but since it started with the update... I will see when the patch is applied - if I still cannot go back to the default MTU, then there was another problem. But I still wanted to share this, since it is easier to change the MTU than to change something on the console...

     

    Regards,

     

    Tobias
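
The post above leaves open why 1400 works and 1450 does not. One way to gather evidence is to probe the WAN path with don't-fragment pings sized for each MTU. The sketch below only does the arithmetic and builds the command; it assumes IPv4 without IP options and the Linux iputils `ping` (`-M do`, `-s`, `-c`). The target `192.0.2.1` is a placeholder documentation address, and the function names are made up for illustration. It does not explain the RED behaviour; it only shows whether full-size packets get through.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Return the ICMP payload size that yields an IPv4 packet of exactly `mtu` bytes."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("MTU too small for an IPv4 ICMP echo")
    return payload


def ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with the Don't Fragment bit set."""
    return [
        "ping",
        "-M", "do",                          # prohibit fragmentation
        "-c", str(count),                    # number of echo requests
        "-s", str(icmp_payload_for_mtu(mtu)),
        host,
    ]


if __name__ == "__main__":
    # The three MTU values mentioned in the post above.
    for mtu in (1400, 1450, 1500):
        print(mtu, " ".join(ping_command("192.0.2.1", mtu)))
```

If the 1500-byte probe fails while the 1400-byte one succeeds from a host behind the RED's uplink, that would point at the path rather than the RED firmware; if all three succeed, the MTU setting is probably working around something else.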

  • Hi All,

    Just a quick update: I had this issue today on a RED 15w at a remote site office connected to a UTM SG330 running 9.601-5.

    The logs showed:

    2019:04:03-12:44:54 uaejltfw red_server[7052]: SELF: Cannot do SSL handshake on socket accept from 'hidden': SSL connect accept failed because of handshake problems
    2019:04:03-12:45:31 uaejltfw red_server[7066]: SELF: Cannot do SSL handshake on socket accept from 'hidden': SSL wants a read first

    I tried disabling the RED tunnel and waiting for the reconnect, then enabling it again - no success.

    I then did as Tobias stated above and changed the MTU on the interface to 1400 - no success yet.

    I then disabled the RED tunnel again and re-enabled it - SUCCESS.

    So it looks like the MTU 1400 change fixed it for me after the RED tried to re-connect after a timeout.

    Please note, I have NOT disabled the unified firmware in the shell as suggested above; that was going to be my last resort.

    Kind Regards,

    James.

  • I have had the same issues described.

     

    I was issued an RMA and got a new RED50 box.

     

    It bricked again within 12 hours.

  • Hi,

    We have the same problem, but slightly different ;)

    We have two SG135 firewalls in an H/A cluster. The RED connection only goes DOWN when we do a takeover from Master to Slave.

    Our workaround is a second takeover back to the first firewall, after which the RED connects to our SG135 again.

    We opened a case; I hope 9.602 is available soon...

    greetings from Austria
    Greg

  • Hi Tobias

     

    I had this exact problem yesterday. After updating to 9.601-5, the single RED 50 connected to the appliance started flapping. The appliance also has several RED 10s and RED 15s connected, and none of them had the problem.

     

    Fortunately I stumbled on this article, so it was a quick fix. What did it for me was setting the MTU to 1400. This solved my problem immediately. Thanks for posting.

     

    I must say I am slightly shocked that this known problem hasn't been fixed by Sophos. Very frustrating!

     

     

    Ivan

  • Hello Bob,

    I am trying to connect a RED 15 to our UTM (SG115) with firmware 9.6.01 for the first time and get the same error:

    SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems.

    Do you know when Sophos will fix this and release new firmware for UTM and RED?

    Regards, Reinhold

  • Hallo Reinhold,

    Did you try James Stoy's solution above - switch to MTU1400 and disable/enable the RED?  If that didn't work Sophos may want to RMA the 15.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hello Bob,

    Yes, I have now tried it with MTU 1400.

    There is a different error in the log now, but it still does not connect:

    2019:04:26-08:27:33 fw red_server[4454]: SELF: (Re-)loading device configurations
    2019:04:26-08:27:36 fw red2ctl[4383]: Overflow happened on reds1:0
    2019:04:26-08:27:36 fw red2ctl[4383]: Missing keepalive from reds1:0, disabling peer  xxx.xxx.xxx.xxx
    2019:04:26-08:28:41 fw red_server[8474]: SELF: New connection from xxx.xxx.xxx.xxx with ID XXXXXXXXXXXXXX (cipher AES256-GCM-SHA384), rev1<30>Apr 26 08:28:41 red_server[8474]: XXXXXXXXXXXXXX: connected OK, pushing config
    2019:04:26-08:28:45 fw red_server[8474]: XXXXXXXXXXXXXX: command 'CON_CLOSE reason=fallback_config'
    2019:04:26-08:28:45 fw red_server[8474]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="XXXXXXXXXXXXXX" forced="1"
    2019:04:26-08:28:45 fw red_server[8474]: XXXXXXXXXXXXXX is disconnected.

    I have an open ticket with the German Sophos premium support, but still no solution.

    I need to get it running for a home office start in May.

    Regards, Reinhold
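
When attaching logs like the ones above to a support case, it can help to pull out the close reason and the structured fields of the "RED Tunnel Down" event. The following is a minimal sketch based only on the two line formats shown in this post (`command 'CON_CLOSE reason=...'` and the `key="value"` event line); the function names are made up for illustration, and other firmware versions may log different fields.

```python
import re

# key="value" pairs as they appear in the id="4202" event line above.
KV_RE = re.compile(r'(\w+)="([^"]*)"')
# Close reason as it appears in the CON_CLOSE line above.
CLOSE_RE = re.compile(r"command 'CON_CLOSE reason=(\w+)'")


def parse_event_fields(line):
    """Return all key="value" pairs found in a log line as a dict."""
    return dict(KV_RE.findall(line))


def parse_close_reason(line):
    """Return the CON_CLOSE reason from a log line, or None if absent."""
    match = CLOSE_RE.search(line)
    return match.group(1) if match else None


# Two lines copied from the log excerpt in this post (IDs masked as in the original).
SAMPLE_CLOSE = (
    "2019:04:26-08:28:45 fw red_server[8474]: XXXXXXXXXXXXXX: "
    "command 'CON_CLOSE reason=fallback_config'"
)
SAMPLE_EVENT = (
    '2019:04:26-08:28:45 fw red_server[8474]: id="4202" severity="info" '
    'sys="System" sub="RED" name="RED Tunnel Down" '
    'red_id="XXXXXXXXXXXXXX" forced="1"'
)

if __name__ == "__main__":
    print(parse_close_reason(SAMPLE_CLOSE))
    print(parse_event_fields(SAMPLE_EVENT))
```

Grouping events by `red_id` and close reason across a full log shows at a glance whether every disconnect carries the same reason or whether several causes are mixed together.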

  • Hi Reinhold,

    Have you tried the "disable unified firmware" fix yet? Most users who have tried the MTU 1400 fix state that disabling the unified firmware works for them.

     

    See the answer from  for more details.

    Let us know; if that does not work, I'm afraid it is an RMA.

    Kind Regards,

    James.

  • Hi James,

    Yes, Sophos Support did this first, but it did not help.

    I find it very strange that Sophos (Premium) Support does not know an answer and does not help me in any way.

    It looks like UTM with a RED 15 is a non-working combination and they prefer XG.

    Regards, Reinhold