UTM 9.601 - RED issues!

Since upgrading all our customers to 9.601, a large share of them are complaining about REDs disconnecting and reconnecting with no discernible pattern.

It started for all of them the very night we upgraded to 9.601, and they are all on different ISPs and located in different places around the country.

I spent 2 hours with Sophos support today, and they have now escalated the case.

Will return with an update....

Suspicious entries in the log - but all connected REDs do this before connection:

2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems

2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx

I know the last line is written before the tunnel disconnects, because there was no "PING/PONG" answer...
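To check whether the drops really are pattern-free, it can help to tally these two message types across a longer log export. The following is only a minimal sketch: the regular expressions are derived from the two lines quoted above, and the labels and the `count_events` helper are names made up for illustration, not part of any Sophos tooling.

```python
import re
from collections import Counter

# Patterns derived from the two log lines quoted in the post (an assumption:
# other firmware versions may word these messages differently).
HANDSHAKE = re.compile(r"red_server\[\d+\]: SELF: Cannot do SSL handshake")
KEEPALIVE = re.compile(r"red2ctl\[\d+\]: Missing keepalive from (\S+), disabling peer")


def count_events(lines):
    """Count handshake failures and missing keepalives (per RED interface)."""
    counts = Counter()
    for line in lines:
        if HANDSHAKE.search(line):
            counts["ssl_handshake_failed"] += 1
        match = KEEPALIVE.search(line)
        if match:
            # Key keepalive drops by interface name, e.g. "reds3:0".
            counts["missing_keepalive " + match.group(1)] += 1
    return counts


sample = [
    "2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake "
    "on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems",
    "2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, "
    "disabling peer xxx.xxx.xxx.xxx",
]
print(count_events(sample))
```

Since every RED logs the handshake message before it connects, the keepalive count per interface is probably the more useful number to compare between stable and unstable sites.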

One customer has 2 x RED 50: one is 100% stable and the other drops at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.

  • So I had previously switched to the non Unified Firmware for my RED 15s (all my REDs are RED 15s) which helped noticeably but we were still experiencing random dropouts of random REDs.  As of now I have upgraded manually to 9.700-5.  My question is do I need to re-enable/switch back to Unified Firmware or not?

     

    Thanks,

    Tracy 

  • In reply to Tracy Carlton:

    Check the status by going to the command line and issuing the command "cc get red use_unified_firmware"

    If it returns a 1 then your upgrade has automatically turned the Unified firmware back on. In previous upgrades this was the default behavior. See previous posts in this thread about the best way to perform an upgrade and still retain the old RED firmware on the devices.

     

    Whether or not you "need" to be on the Unified firmware is a matter of choice. I am still seeing people here reporting big problems with the Unified firmware (note Twister's many issues). I am not using the Unified firmware until I am convinced the issues have been completely fixed.
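If you check several UTMs after each upgrade, the 0/1 answer can be interpreted by a small helper. This is only a sketch and assumes the command prints a bare 0 or 1, as described in this thread; the function name is hypothetical.

```python
def unified_firmware_enabled(output: str) -> bool:
    """Interpret the output of `cc get red use_unified_firmware`.

    Assumption: the command prints a bare "0" or "1", as reported in the
    thread. Anything else is rejected rather than guessed at.
    """
    value = output.strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected output: {output!r}")
    return value == "1"


# Example with output pasted from the shell:
print(unified_firmware_enabled("0\n"))
```

Rejecting unexpected output is deliberate: after an upgrade you want to know when the command did not answer as expected, not silently treat it as "off".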

  • In reply to garth1138:

    I checked after the upgrade and can confirm that it *DID NOT* switch back to unified firmware.  The cc get red use_unified_firmware returned a 0 on both my UTMs after the upgrade.

     

    I am going to monitor the logs closely for the next little while and see what, if anything, happens.

     

    Tracy

  • In reply to Tracy Carlton:

    Tracy Carlton

    I checked after the upgrade and can confirm that it *DID NOT* switch back to unified firmware.  The cc get red use_unified_firmware returned a 0 on both my UTMs after the upgrade.

     ...

    This is very good news for anybody struggling with the update to 9.7. Thank you for sharing that. It seems Sophos has learned a little bit from the updates since 9.601.

    Best regards

    Alex

  • In reply to Alexander Busch:

    This could also be very bad news, since users are reporting that everything is fixed in 9.700-5. It could be that they are all still running with the non unified version of the firmware and just think that it's been fixed. 

  • In reply to John Holliday:

    Just had another occurrence of the server shutting down the clients. Although it rebooted the client, it didn't bring the interface up correctly. Why is the server killing off the clients?

     

    2019:11:06-11:34:43 sophos-2 red_server[9984]: A#########ab: command '{"data":{"seq":783},"type":"PING"}'
    2019:11:06-11:34:43 sophos-2 red_server[9984]: A#########ab: Sending json message {"data":{"seq":783},"type":"PONG"}
    2019:11:06-11:34:44 sophos-2 red_server[32198]: SELF: shutdown requested, killing clients
    2019:11:06-11:34:44 sophos-2 red_server[32198]: SELF: killing client A#########ab
    2019:11:06-11:34:44 sophos-2 red_server[9984]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A#########ab" forced="0"
    2019:11:06-11:34:44 sophos-2 red_server[32198]: SELF: killing client A#########3a
    2019:11:06-11:34:44 sophos-2 red_server[9984]: A#########ab is disconnected.
    2019:11:06-11:34:45 sophos-2 red_server[32198]: SELF: exiting
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: RED10rev1 fw version set to 14
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: RED10rev2 local fw version set to 5214R2
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: RED10rev2 fw version set to 2005R2
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: RED15(w) fw version set to 1-424-7131d4e52-e9f0c31
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: RED50 fw version set to 1-424-7131d4e52-0000000
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: IO::Socket::SSL Version: 1.953
    2019:11:06-11:34:45 sophos-1 red_server[27988]: SELF: Startup - waiting 15 seconds ...
    2019:11:06-11:34:46 sophos-1 red2ctl[28000]: Starting REDv2 control daemon
    2019:11:06-11:34:49 sophos-2 red2ctl[32206]: Stopping REDv2 control daemon
    2019:11:06-11:35:00 sophos-1 red_server[27988]: SELF: Overlay-fw has been updated ...
    2019:11:06-11:35:00 sophos-1 red_server[28845]: UPLOAD: Uploader process starting
    2019:11:06-11:35:01 sophos-1 red_server[27988]: SELF: (Re-)loading device configurations
    2019:11:06-11:35:01 sophos-1 red_server[27988]: A#########3a: New device
    2019:11:06-11:35:01 sophos-1 red_server[27988]: A#########3a: Staging config for upload
    2019:11:06-11:35:01 sophos-1 red_server[27988]: A#########ab: New device
    2019:11:06-11:35:01 sophos-1 red_server[27988]: A#########ab: Staging config for upload
    2019:11:06-11:35:02 sophos-1 red_server[28845]: [A#########3a] Config has not changed, no need to upload to registry service
    2019:11:06-11:35:02 sophos-1 red_server[28845]: [A#########ab] Config has not changed, no need to upload to registry service
    2019:11:06-11:36:34 sophos-1 red_server[29146]: SELF: Cannot do SSL handshake on socket accept from 'xx.100.xx.xx': SSL connect accept failed because of handshake problems
    2019:11:06-11:36:37 sophos-1 red_server[29147]: SELF: New connection from xx.100.xx.xx with ID A#########ab (cipher AES256-GCM-SHA384), rev1
    2019:11:06-11:36:37 sophos-1 red_server[29147]: A#########ab: connected OK, pushing config
    2019:11:06-11:36:37 sophos-1 red_server[27988]: SELF: (Re-)loading device configurations
    2019:11:06-11:36:38 sophos-1 red_server[29147]: A#########ab: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2019:11:06-11:36:38 sophos-1 red_server[29147]: A#########ab: Initializing connection running protocol version 0
    2019:11:06-11:36:38 sophos-1 red_server[29147]: A#########ab: Sending json message {"data":{},"type":"WELCOME"}
    2019:11:06-11:36:39 sophos-1 red_server[29147]: A#########ab: command '{"data":{},"type":"CONFIG_REQ"}'
    2019:11:06-11:36:39 sophos-1 red_server[29147]: A#########ab: Sending json message {"data":
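To put a number on outages like the one above, the time between a "RED Tunnel Down" event and the next "connected OK" for the same RED ID can be pulled out of the log. A minimal sketch follows; the regular expressions are derived from the lines in this excerpt, and the `downtimes` helper is a hypothetical name, not a Sophos tool.

```python
import re
from datetime import datetime

STAMP = r"(\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2})"
# Patterns derived from the excerpt above; the host name is left open because
# in this HA setup the "down" and "up" events are logged by different nodes.
DOWN = re.compile(STAMP + r' \S+ red_server\[\d+\]: .*name="RED Tunnel Down" red_id="([^"]+)"')
UP = re.compile(STAMP + r" \S+ red_server\[\d+\]: (\S+): connected OK, pushing config")


def parse_stamp(text):
    """Parse the UTM log timestamp format, e.g. 2019:11:06-11:34:44."""
    return datetime.strptime(text, "%Y:%m:%d-%H:%M:%S")


def downtimes(lines):
    """Return (red_id, seconds) for each tunnel-down followed by a reconnect."""
    pending = {}   # red_id -> time the tunnel went down
    result = []
    for line in lines:
        match = DOWN.search(line)
        if match:
            pending.setdefault(match.group(2), parse_stamp(match.group(1)))
            continue
        match = UP.search(line)
        if match and match.group(2) in pending:
            went_down = pending.pop(match.group(2))
            seconds = int((parse_stamp(match.group(1)) - went_down).total_seconds())
            result.append((match.group(2), seconds))
    return result


sample = [
    '2019:11:06-11:34:44 sophos-2 red_server[9984]: id="4202" severity="info" sys="System" '
    'sub="RED" name="RED Tunnel Down" red_id="A#########ab" forced="0"',
    "2019:11:06-11:36:37 sophos-1 red_server[29147]: A#########ab: connected OK, pushing config",
]
print(downtimes(sample))
```

For the two lines taken from the excerpt, this reports 113 seconds between the tunnel going down and the RED reconnecting.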

  • In reply to Brian Stilts:

    Hello,

    Yesterday I tried to start up a brand-new RED 15 and it still got caught in a boot loop.
    But I have to say that I am still running 9.605-1 on my SG 210.
    Today I read that all problems should have been fixed with 9.700-5.

    This version is still not available for my SG 210 via Up2Date, and I would not like to upgrade to a firmware that is not officially released, because I've been shipwrecked several times doing that.

    My questions are:
    - Are there some updates/experiences? Are the problems really fixed in 9.700-5 ?
    - If YES, does anybody know when 9.700-5 will be available via Up2Date ?

    Greetings
    Bruno

  • In reply to Bruno Schley:

    Just as a note, 9.7 is officially released. See https://community.sophos.com/products/unified-threat-management/b/blog/posts/utm-up2date-9-700-released for the phases of the release.

    BR

    Alex

  • In reply to Bruno Schley:

    Thanks to everyone who posted information on this topic. It took many days of reading and re-reading to glean all the information and solutions that were provided. I would like to attach information that hopefully some will find beneficial. As a result of others' postings I was able to perform testing and come to a resolution for my environment related to using, or in my case NOT using, the unified firmware. That seemed to be the consensus: do not use unified firmware. As of this posting, version 9.701-6 is available via Up2Date and also on the UTM downloads page as an ISO file.

     

    Thanks again to all who contributed.

  • In reply to ToddCooper:

    Here is another update ...

    I performed a UTM update from 9.7 to 9.701, and this wiped out the RED50 onsite. I get the following error messages:

    Lots of these post-upgrade ...
    2020:02:09-14:07:38 gw-utm-dia red_server[7756]: SELF: Cannot do SSL handshake on socket accept from '62.30.199.87': SSL connect accept failed because of handshake problems SSL wants a read first

    2020:02:09-14:07:43 gw-utm-dia red_server[7771]: SELF: New connection from 62.30.199.87 with ID A34025CA27C151C (cipher AES256-GCM-SHA384), rev1
    <30>Feb 9 14:07:43 red_server[7771]: A34025CA27C151C: connected OK, pushing config
    2020:02:09-14:07:44 gw-utm-dia red_server[7771]: A34025CA27C151C: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2020:02:09-14:07:44 gw-utm-dia red_server[7771]: A34025CA27C151C: Initializing connection running protocol version 0
    2020:02:09-14:07:44 gw-utm-dia red_server[7771]: A34025CA27C151C: Sending json message {"data":{},"type":"WELCOME"}
    2020:02:09-14:07:46 gw-utm-dia red_server[7771]: A34025CA27C151C: command '{"data":{},"type":"CONFIG_REQ"}'
    2020:02:09-14:07:46 gw-utm-dia red_server[7771]: A34025CA27C151C: Sending json message {"data":{"pin":"","fullbr_dns":"","split_networks":"1.2.3.4","lan2_vids":"","lan4_vids":"","local_networks":"","tunnel_id":1,"manual2_netmask":24,"asg_cert":"[removed]","manual_address":"0.0.0.0","bridge_proto":"none","unlock_code":"aayc4jr8","password":"","manual2_defgw":"0.0.0.0","prev_unlock_code":"","manual_netmask":24,"lan3_vids":"","version_r2":"2005R2","mac_filter_type":"none","mac":"00:b8:f5:c9:ab:f3","dial_string":"*99#","manual2_address":"0.0.0.0","version_ng_red50":"1-433-2023f2ad6-0000000","manual_dns":"0.0.0.0","lan1_mode":"unused","username":"","activate_modem":0,"tunnel_compression_algorithm":"lzo","version_red50":"1-433-2023f2ad6-0000000","fullbr_domains":"","htp_server":"gw-utm-dia.fwa.org.uk","uplink_balancing":"failover","asg_key":"[removed]","type":"red50","deployment_mode":"online","uplink2_mode":"dhcp","version_red15":"1-433-2023f2ad6-e9f0c31","manual2...L1518
    2020:02:09-14:07:46 gw-utm-dia red_server[7771]: A34025CA27C151C: command '{"data":{"message":"Firmware update required. Trying provisioning service ..."},"type":"DISCONNECT"}'
    2020:02:09-14:07:46 gw-utm-dia red_server[7771]: A34025CA27C151C: Disconnecting: Firmware update required. Trying provisioning service ...
    2020:02:09-14:07:46 gw-utm-dia red_server[7771]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A34025CA27C151C" forced="1"
    2020:02:09-14:07:46 gw-utm-dia red_server[7771]: A34025CA27C151C is disconnected.

    Lots of these ...
    2020:02:09-14:08:06 gw-utm-dia red_server[7349]: SELF: Cannot do SSL handshake on socket accept from '62.30.199.87': SSL wants a read first
    2020:02:09-14:09:36 gw-utm-dia red_server[9203]: SELF: Cannot do SSL handshake on socket accept from '62.30.199.87': SSL connect accept failed because of handshake problems
    2020:02:09-14:09:37 gw-utm-dia red_server[9204]: SELF: Cannot do SSL handshake on socket accept from '62.30.199.87': SSL connect accept failed because of handshake problems
    2020:02:09-14:09:43 gw-utm-dia red_server[9212]: SELF: New connection from 62.30.199.87 with ID A34025CA27C151C (cipher AES256-GCM-SHA384), rev1
    <30>Feb 9 14:09:43 red_server[9212]: A34025CA27C151C: connected OK, pushing config
    2020:02:09-14:09:44 gw-utm-dia red_server[9212]: A34025CA27C151C: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2020:02:09-14:09:44 gw-utm-dia red_server[9212]: A34025CA27C151C: Initializing connection running protocol version 0
    2020:02:09-14:09:44 gw-utm-dia red_server[9212]: A34025CA27C151C: Sending json message {"data":{},"type":"WELCOME"}
    2020:02:09-14:09:45 gw-utm-dia red_server[9212]: A34025CA27C151C: command '{"data":{},"type":"CONFIG_REQ"}'
    2020:02:09-14:09:45 gw-utm-dia red_server[9212]: A34025CA27C151C: Sending json message {"data":{"pin":"","fullbr_dns":"","split_networks":"1.2.3.4","lan2_vids":"","lan4_vids":"","local_networks":"","tunnel_id":1,"manual2_netmask":24,"asg_cert":"[removed]","manual_address":"0.0.0.0","bridge_proto":"none","unlock_code":"aayc4jr8","password":"","manual2_defgw":"0.0.0.0","prev_unlock_code":"","manual_netmask":24,"lan3_vids":"","version_r2":"2005R2","mac_filter_type":"none","mac":"00:b8:f5:c9:ab:f3","dial_string":"*99#","manual2_address":"0.0.0.0","version_ng_red50":"1-433-2023f2ad6-0000000","manual_dns":"0.0.0.0","lan1_mode":"unused","username":"","activate_modem":0,"tunnel_compression_algorithm":"lzo","version_red50":"1-433-2023f2ad6-0000000","fullbr_domains":"","htp_server":"gw-utm-dia.fwa.org.uk","uplink_balancing":"failover","asg_key":"[removed]","type":"red50","deployment_mode":"online","uplink2_mode":"dhcp","version_red15":"1-433-2023f2ad6-e9f0c31","manual2...L1518
    2020:02:09-14:09:46 gw-utm-dia red_server[9212]: A34025CA27C151C: command '{"data":{"message":"Firmware update required. Trying provisioning service ..."},"type":"DISCONNECT"}'
    2020:02:09-14:09:46 gw-utm-dia red_server[9212]: A34025CA27C151C: Disconnecting: Firmware update required. Trying provisioning service ...
    2020:02:09-14:09:46 gw-utm-dia red_server[9212]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A34025CA27C151C" forced="1"
    2020:02:09-14:09:46 gw-utm-dia red_server[9212]: A34025CA27C151C is disconnected.
    2020:02:09-14:11:03 gw-utm-dia red_server[9515]: SELF: Cannot do SSL handshake on socket accept from '62.30.199.87': SSL connect accept failed because of handshake problems SSL wants a read first
    2020:02:09-14:11:04 gw-utm-dia red_server[9517]: SELF: Cannot do SSL handshake on socket accept from '62.30.199.87': SSL connect accept failed because of handshake problems
    2020:02:09-14:11:10 gw-utm-dia red_server[9540]: SELF: New connection from 62.30.199.87 with ID A34025CA27C151C (cipher AES256-GCM-SHA384), rev1
    <30>Feb 9 14:11:10 red_server[9540]: A34025CA27C151C: connected OK, pushing config
    2020:02:09-14:11:11 gw-utm-dia red_server[9540]: A34025CA27C151C: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2020:02:09-14:11:11 gw-utm-dia red_server[9540]: A34025CA27C151C: Initializing connection running protocol version 0
    2020:02:09-14:11:11 gw-utm-dia red_server[9540]: A34025CA27C151C: Sending json message {"data":{},"type":"WELCOME"}
    2020:02:09-14:11:12 gw-utm-dia red_server[9540]: A34025CA27C151C: command '{"data":{},"type":"CONFIG_REQ"}'
    2020:02:09-14:11:12 gw-utm-dia red_server[9540]: A34025CA27C151C: Sending json message {"data":{"pin":"","fullbr_dns":"","split_networks":"1.2.3.4","lan2_vids":"","lan4_vids":"","local_networks":"","tunnel_id":1,"manual2_netmask":24,"asg_cert":"[removed]","manual_address":"0.0.0.0","bridge_proto":"none","unlock_code":"aayc4jr8","password":"","manual2_defgw":"0.0.0.0","prev_unlock_code":"","manual_netmask":24,"lan3_vids":"","version_r2":"2005R2","mac_filter_type":"none","mac":"00:b8:f5:c9:ab:f3","dial_string":"*99#","manual2_address":"0.0.0.0","version_ng_red50":"1-433-2023f2ad6-0000000","manual_dns":"0.0.0.0","lan1_mode":"unused","username":"","activate_modem":0,"tunnel_compression_algorithm":"lzo","version_red50":"1-433-2023f2ad6-0000000","fullbr_domains":"","htp_server":"gw-utm-dia.fwa.org.uk","uplink_balancing":"failover","asg_key":"[removed]","type":"red50","deployment_mode":"online","uplink2_mode":"dhcp","version_red15":"1-433-2023f2ad6-e9f0c31","manual2...L1518
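The log above shows the RED connecting, receiving its config, and being disconnected again with "Firmware update required" roughly every two minutes. A minimal sketch for spotting that loop in a larger export follows. The pattern is derived from the lines in this post, and the helper names and the threshold of 2 are assumptions made for illustration. It scans the whole text rather than line by line, so it also works when several entries end up pasted onto one line.

```python
import re
from collections import Counter

# Derived from the "Disconnecting: Firmware update required" lines in the post.
FW_DISCONNECT = re.compile(
    r"\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2} \S+ red_server\[\d+\]: "
    r"(\S+): Disconnecting: Firmware update required"
)


def firmware_update_disconnects(text):
    """Count 'Firmware update required' disconnects per RED ID."""
    return Counter(match.group(1) for match in FW_DISCONNECT.finditer(text))


def stuck_in_update_loop(text, threshold=2):
    """Return RED IDs disconnected for a firmware update `threshold` times or more.

    The threshold is an arbitrary choice for this sketch.
    """
    counts = firmware_update_disconnects(text)
    return sorted(red_id for red_id, n in counts.items() if n >= threshold)


# Two entries from the post, deliberately joined on one line as in the paste.
sample = (
    "2020:02:09-14:07:46 gw-utm-dia red_server[7771]: A34025CA27C151C: Disconnecting: "
    "Firmware update required. Trying provisioning service ... "
    "2020:02:09-14:09:46 gw-utm-dia red_server[9212]: A34025CA27C151C: Disconnecting: "
    "Firmware update required. Trying provisioning service ..."
)
print(stuck_in_update_loop(sample))
```

A RED that shows up here repeatedly is not failing to reach the UTM; it connects fine and is then sent away to fetch firmware, which matches what the log shows.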

    I had an issue the other day too, and had to reset the RED50 (paper clip in the unmarked hole on the rear of the RED50), after which all was fine.

    Now I will have to reset it to defaults again!


  • In reply to Argo:

    Well, this is good: the 9.701 update seems to have caused some issues with RED devices.

    Hope yours hasn't been transformed into a paperweight?

  • In reply to Argo:

    Hi all,

    I can confirm the problems with various firmware versions. We lost 7 RED50 devices so far. Most of them during normal operating times and some even during "low load" times during the night. The last RED50 was bricked 2 days ago during the update from 9.700-5 to 9.701-6 - no connection and nothing on the display anymore. 

    This situation is not acceptable, especially as there is no clear statement from Sophos support on the cause of the problem or when it will be fixed. 

    In addition to the RED50 devices we use 4 RED15. They are not affected yet (fingers crossed!). 

     

    Best regards, 

    Falk

  • In reply to Fröhner Alexander:

    I have taken all of my RED50 out of circulation, they are horrendous - my RED15W devices seem to work OK (for now).

    At the locations where the RED50s were, I have had to replace them with UTM SG115W units, which is overkill, but they are connected via site-to-site tunnel and at least I can control the updates so that they don't automatically get junk firmware and brick themselves!

    Terrible support on this Sophos, really really bad.

  • In reply to James Stoy:

    James Stoy

     

    Terrible support on this Sophos, really really bad.

     
    Totally Agree !!
     
    Two RED50s have been bricked. We replaced them with RED15s on our own to bring the location back, but nevertheless I need a new setup soon, because we want to push VLANs to the remote location, which is not very easy with a RED15. I also tried to connect via serial connection, and there were a lot of errors while the RED50 was booting!
     
    We asked our partner what is happening with the open tickets at Sophos support regarding this issue: there is no feedback at all from Sophos!!! Now we have thrown the devices away! 
     
    We have other tickets open at Sophos support, regarding WiFi and mesh issues, e.g. one open for over half a year now! Response time is awful and they cannot or do not help: always investigating, update to the newest firmware, check again, etc. 
     
    We will definitely move away from Sophos in the future! A pity we refreshed our hardware not long ago.