This discussion has been locked.

UTM 9.601 - RED issues!

Since we upgraded all our customers to 9.601, a large share of them have been complaining that their REDs disconnect and reconnect with no discernible pattern.

It started for all of them the very night we upgraded to 9.601, and they are all on different ISPs and located in different places around the country.

I spent 2 hours with Sophos support today, and they have now escalated the case.

Will return with an update....

Suspicious entries in the log - but all connected REDs do this before connection:

2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems

2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx

I know the last line is written just before the tunnel disconnects, because the "PING/PONG" keepalive went unanswered...
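
For anyone who wants to count how often these two entries show up, here is a minimal sketch in Python. It assumes the log lines follow the layout seen above (timestamp, host, process[pid], message); the function names and the event labels are made up for illustration and are not part of any Sophos tool.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout: "<timestamp> <host> <process>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[\w-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def parse_line(line):
    """Split one log line into its fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y:%m:%d-%H:%M:%S")
    return record


def count_events(lines):
    """Count handshake failures and missing keepalives (labels are illustrative)."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if "Cannot do SSL handshake" in record["msg"]:
            counts["handshake_failed"] += 1
        elif "Missing keepalive" in record["msg"]:
            counts["keepalive_missing"] += 1
    return counts


sample = [
    "2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake "
    "on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because "
    "of handshake problems",
    "2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, "
    "disabling peer xxx.xxx.xxx.xxx",
]

print(dict(count_events(sample)))
```

Running the same counter over the log of a stable RED and a flapping RED would show whether the handshake failures really occur on both, as suspected above.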

One customer has 2 x RED 50: one is 100% stable and the other drops out at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.



  • Interesting, Argo!

    I have a client whose RED 15 was (seemingly) killed on 11 August by the 9.604-to-9.605 Up2Date.  When the replacement also wouldn't connect, I asked to work with someone onsite at the remote office 400 miles away from my usual interlocutor for this client.  I was suspicious that they had upgraded their service a month before the RED 15 stopped working and that their ISP had given instructions to another person in that office on setting a fixed public IP so that he could connect over the Internet without a functional RED.  I asked the guy to try getting a public IP on a laptop connected directly to the ISP's modem.  The laptop couldn't get an IP, so I asked the guy to call the ISP and have them enable DHCP for their connection.  Bingo!  The RED 15 came online as soon as the ISP flipped the switch.

    It turns out that a RED needs DHCP when it first downloads its configuration from the cloud, but it's not necessary after that.  This is why the original RED 15 was unaffected by the loss of DHCP on their connection.

    I'm having the original RED 15 shipped to me to examine.  My theory is that the firmware upgrade in the 9.604-to-9.605 Up2Date left the RED in an unconfigured state - making it require DHCP to get its configuration.  I expect to receive the device Monday or Tuesday and will report back here as well as to Sophos Support.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • After the disastrous 9.605-1 update we have now had to replace our 2 RED 50s and 1 RED 10 with RED 15s.

    The RED 15 that was installed first ran without any issues for more than 2 weeks on 9.605-1. Today it just stopped working:

    2019:08:27-12:27:16 vpn red_server[5074]: RED15-STOPPED-WORKING: command '{"data":{"seq":47539},"type":"PING"}'
    2019:08:27-12:27:16 vpn red_server[5074]: RED15-STOPPED-WORKING: Sending json message {"data":{"seq":47539},"type":"PONG"}
    2019:08:27-12:27:23 vpn red_server[5074]: RED15-STOPPED-WORKING: command '{"data":{"key_active":1,"key0":"OMUxKkof9EVz\/7BOjAYp7uCcsa5ybLsx9g2pZ7+jlVk="},"type":"SET_KEY_REQ"}'
    2019:08:27-12:27:23 vpn red_server[5074]: RED15-STOPPED-WORKING: Sending json message {"data":{},"type":"SET_KEY_REP"}
    2019:08:27-12:27:47 vpn red_server[5074]: RED15-STOPPED-WORKING: No ping for 30 seconds, exiting.
    2019:08:27-12:27:47 vpn red_server[5074]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="RED15-STOPPED-WORKING" forced="0"
    2019:08:27-12:27:47 vpn red_server[5074]: RED15-STOPPED-WORKING is disconnected.
    2019:08:27-12:27:47 vpn red_server[6966]: SELF: (Re-)loading device configurations
    2019:08:27-12:27:49 vpn red2ctl[4938]: Overflow happened on reds3:0
    2019:08:27-12:27:49 vpn red2ctl[4938]: Missing keepalive from reds3:0, disabling peer EXTERNAL-IP-REMOTE
    2019:08:27-12:27:52 vpn red2ctl[4938]: Received keepalive from reds3:0, enabling peer EXTERNAL-IP-REMOTE


    In the meantime we have already tried everything, first with our old RED 50/RED 10 and now with the new RED 15:

    - switching to the public IP instead of the FQDN (already 2 weeks ago)
    - disabling tunnel compression
    - setting MTU 1400
    - disabling the RED, waiting 5 minutes and enabling it again
    - removing the RED and adding it again
    - switching from static RED-WAN-IP to DHCP
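
    One way to check whether a path actually carries 1400-byte packets is a ping with the don't-fragment bit set (for example `ping -M do -s <size>` on Linux or `ping -f -l <size>` on Windows). The payload size to pass is the MTU minus the 20-byte IPv4 header (no IP options assumed) and the 8-byte ICMP header. A small sketch of that arithmetic; the function name is only illustrative:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Return the ICMP echo payload size that exactly fills a packet of `mtu` bytes."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("MTU is smaller than the IPv4 and ICMP headers")
    return payload


print(icmp_payload_for_mtu(1400))  # 1372
print(icmp_payload_for_mtu(1500))  # 1472
```

    If a 1372-byte don't-fragment ping gets through between the two sites but a 1472-byte one does not, the path MTU lies somewhere between 1400 and 1500. This only tests the path; it says nothing about how the RED itself handles fragmentation.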

    It is not reconnecting anymore. Checking the DSL modem at the office shows that the RED is not even requesting an IP address via DHCP??


    After adding the RED 15 back to our SG310 Rev2, this is all that happens:

    2019:08:27-21:25:27 vpn red_server[6966]: SELF: (Re-)loading device configurations
    2019:08:27-21:25:29 vpn red_server[6966]: SELF: (Re-)loading device configurations
    2019:08:27-21:25:29 vpn red_server[6966]: RED15-STOPPED-WORKING: New device
    2019:08:27-21:25:29 vpn red_server[6966]: RED15-STOPPED-WORKING: Staging config for upload
    2019:08:27-21:25:29 vpn red_server[6966]: SELF: (Re-)loading device configurations
    2019:08:27-21:25:31 vpn red_server[7212]: RED15-STOPPED-WORKING Uploaded config to registry service

     

    I also checked the Up2Date FTP server for a possible new fix. It seems Sophos has now stopped rolling out 9.605-1, because you can't download it manually.

    Now we have to go to the remote office again and check the RED :-(

    I am pretty angry about these ongoing issues. They are real time wasters and we don't have time for this. I'm glad the office is currently on holiday and no one is there.

    Any news on a new firmware update for UTM?

    Regards

    Peter

  • This looks to me like the same problem I have with one of our RED 15s. The device simply stops working. Only the first two LEDs are green. Rebooting the RED solves the problem for anywhere from a few days up to about 10 days.

    At the moment I am trying MTU 1400; result pending.

    It's not great to have these problems with devices in remote locations, but you all know that already...

    Best regards

    Alex


  • Hi all, 

     

    We have been experiencing this problem with 2 separate RED15s intermittently going bye-bye ever since 9.601. The temporary workarounds of setting the MTU to 1400, as well as removing and re-adding the RED in the clustered SG230 to force a clean restart have kept us up and running so far. 

    I have been monitoring this thread for close to six months in the hope that the problem would be cleared up in a subsequent update. Unfortunately, this does not seem to be the case so far. We have been keeping the SG230 at 9.601 to not risk any further damage or different issues. Judging by the reports here, that seems to be a wise decision, but I do not like keeping firmware this far behind, and am getting very concerned as to whether Sophos will be able to fix the problem at all. If anyone from Sophos is reading this: I am sure we would all appreciate an official update regarding the issue!

     

    Best, OliverW8

  • After deleting the RED 15 and adding it again with the interface MTU set to 1400, the RED came back by itself roughly 6 hours later!

  • My upgrade path has been

    9.601 to 9.604 to 9.605

    I had 1 RED device that upgraded fine with the unified firmware to 9.601 and then to 9.604, but it didn't make it to 9.605.

    I submitted an RMA on Monday the 26th. After the initial update saying that I would get a tracking number,

    the next mail I got was a customer satisfaction survey email, in Spanish.

    I am in Austria (German-speaking).

    The case is closed.

    When I queried this, I was told they have no stock at present and no ETA.

    We bought 2 spares a while ago because of all these problems.

    It's no wonder they have shortages, as for the last 3 RMAs they said you don't have to send back the defective units .....

  • 5+ hours of downtime this morning. No problem on our local network or Internet connection. The RED 15 (in Germany) was trying to handshake with an IP in the US - I assume one of Sophos' providers.

     

    2019:08:29-05:07:24 neo-2 red_server[16917]: A35xxxxxxxxxxxx: No ping for 30 seconds, exiting.
    2019:08:29-05:07:24 neo-2 red_server[16917]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A35xxxxxxxxxxxx" forced="0"
    2019:08:29-05:07:24 neo-2 red_server[16917]: A35xxxxxxxxxxxx is disconnected.
    2019:08:29-05:07:24 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-05:07:26 neo-2 red2ctl[21514]: Overflow happened on reds2:0
    2019:08:29-05:07:26 neo-2 red2ctl[21514]: Missing keepalive from reds2:0, disabling peer 195.xxx.xxx.xx
    2019:08:29-05:07:29 neo-2 red2ctl[21514]: Received keepalive from reds2:0, enabling peer 195.xxx.xxx.xx
    2019:08:29-05:08:07 neo-2 red_server[6708]: SELF: Cannot do SSL handshake on socket accept from '195.xxx.xxx.xx': SSL connect accept failed because of handshake problems
    2019:08:29-05:19:38 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-05:34:25 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-05:38:06 neo-2 red_server[11570]: SELF: Cannot do SSL handshake on socket accept from '195.xxx.xxx.xx': SSL wants a read first
    2019:08:29-05:49:23 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-06:04:27 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-06:19:27 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-06:34:24 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-06:49:38 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-07:04:32 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-07:04:46 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-07:19:21 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-07:34:21 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-07:49:22 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-07:50:00 neo-2 red_server[2023]: SELF: Cannot do SSL handshake on socket accept from '198.108.67.48': SSL accept attempt failed with unknown error SSL wants a read first
    2019:08:29-07:50:00 neo-2 red_server[2027]: SELF: Cannot do SSL handshake on socket accept from '198.108.67.48': SSL accept attempt failed with unknown error SSL wants a read first
    2019:08:29-07:50:00 neo-2 red_server[2026]: SELF: Cannot do SSL handshake on socket accept from '198.108.67.48': SSL accept attempt failed with unknown error SSL wants a read first
    2019:08:29-07:50:00 neo-2 red_server[2044]: SELF: Cannot do SSL handshake on socket accept from '198.108.67.48': SSL accept attempt failed with unknown error error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
    2019:08:29-07:50:00 neo-2 red_server[2046]: SELF: Cannot do SSL handshake on socket accept from '198.108.67.48': SSL accept attempt failed with unknown error error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
    2019:08:29-07:50:00 neo-2 red_server[2049]: SELF: Cannot do SSL handshake on socket accept from '198.108.67.48': SSL accept attempt failed with unknown error error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
    2019:08:29-07:50:01 neo-2 red_server[2051]: SELF: unable to get peer address or retrieve CN for '198.108.67.48'
    2019:08:29-07:50:01 neo-2 red_server[2052]: SELF: unable to get peer address or retrieve CN for '198.108.67.48'
    2019:08:29-07:50:01 neo-2 red_server[2053]: SELF: unable to get peer address or retrieve CN for '198.108.67.48'
    2019:08:29-08:04:24 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-08:19:24 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-08:34:28 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-08:49:33 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-09:04:27 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-09:19:26 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-09:34:24 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-09:49:23 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-10:04:26 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-10:04:40 neo-2 red_server[21506]: SELF: (Re-)loading device configurations
    2019:08:29-10:12:00 neo-2 red_server[6015]: SELF: Cannot do SSL handshake on socket accept from '195.xxx.xxx.xx': SSL connect accept failed because of handshake problems
    2019:08:29-10:12:03 neo-2 red_server[6026]: SELF: New connection from 195.xxx.xxx.xx with ID A35xxxxxxxxxxxx (cipher AES256-GCM-SHA384), rev1
    2019:08:29-10:12:03 neo-2 red_server[6026]: A35xxxxxxxxxxxx: connected OK, pushing config
    2019:08:29-10:12:04 neo-2 red_server[6026]: A35xxxxxxxxxxxx: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2019:08:29-10:12:04 neo-2 red_server[6026]: A35xxxxxxxxxxxx: Initializing connection running protocol version 0
    2019:08:29-10:12:04 neo-2 red_server[6026]: A35xxxxxxxxxxxx: Sending json message {"data":{},"type":"WELCOME"}
    2019:08:29-10:12:05 neo-2 red_server[6026]: A35xxxxxxxxxxxx: command '{"data":{},"type":"CONFIG_REQ"}'
    2019:08:29-10:12:05 neo-2 red_server[6026]: A35xxxxxxxxxxxx: Sending json message {"data":{"pin":"","fullbr_dns":"","split_networks":"1.2.3.4","lan2_vids":"","lan4_vids":"","local_networks":"","tunnel_id":2,"manual2_netmask":24,"asg_cert":"[removed]","manual_address":"195.xxx.xxx.xx","bridge_proto":"none","unlock_code":"qm7gittj","password":"","manual2_defgw":"0.0.0.0","prev_unlock_code":"qm7gittj","manual_netmask":29,"lan3_vids":"","version_r2":"2005R2","mac_filter_type":"none","mac":"00:47:9c:f3:f3:2e","dial_string":"*99#","manual2_address":"0.0.0.0","version_ng_red50":"1-330-f4c55ab8-0000000","manual_dns":"194.25.0.60","lan1_mode":"unused","username":"","activate_modem":0,"tunnel_compression_algorithm":"lzo","version_red50":"1-330-f4c55ab8-0000000","fullbr_domains":"","htp_server":"neo.geco-group.com","uplink_balancing":"failover","asg_key":"[removed]","type":"red15","deployment_mode":"online","uplink2_mode":"dhcp","version_red15":"1-330-f4c55ab8-655eb...L1538
    2019:08:29-10:12:08 neo-2 red_server[6026]: A35xxxxxxxxxxxx: command '{"data":{"key1":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","key0":"yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy","key_active":0},"type":"SET_KEY_REQ"}'
    2019:08:29-10:12:08 neo-2 red_server[6026]: A35xxxxxxxxxxxx: Sending json message {"data":{},"type":"SET_KEY_REP"}
    2019:08:29-10:12:09 neo-2 red_server[6026]: A35xxxxxxxxxxxx: command '{"data":{"seq":0},"type":"PING"}'
    2019:08:29-10:12:09 neo-2 red_server[6026]: id="4201" severity="info" sys="System" sub="RED" name="RED Tunnel Up" red_id="A35xxxxxxxxxxxx" forced="0"
    2019:08:29-10:12:09 neo-2 red_server[6026]: A35xxxxxxxxxxxx: Sending json message {"data":{"seq":0},"type":"PONG"}
    2019:08:29-10:12:10 neo-2 red_server[6026]: A35xxxxxxxxxxxx: command '{"data":{"wan1_ip":"195.xxx.xxx.xx","mobile_signal_strength":"","wan2_ip":"","uplink":"WAN1","uplink_state":"0"},"type":"STATUS"}'
    2019:08:29-10:12:11 neo-2 red2ctl[21514]: Overflow happened on reds2:0
    2019:08:29-10:12:11 neo-2 red2ctl[21514]: Missing keepalive from reds2:0, disabling peer 195.xxx.xxx.xx
    2019:08:29-10:12:14 neo-2 red2ctl[21514]: Received keepalive from reds2:0, enabling peer 195.xxx.xxx.xx
    2019:08:29-10:12:18 neo-2 red_server[21506]: SELF: (Re-)loading device configurations

     

     

    The xxxxx and yyyy strings are mine. At 10:12, the config reloaded and the RED resumed operation. 
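
    To put a number on an outage like this one, the "RED Tunnel Down" and "RED Tunnel Up" entries can be paired per RED ID. A minimal Python sketch, assuming the line format shown in the log above; the function name and the tuple layout are only illustrative:

```python
import re
from datetime import datetime

TS_FORMAT = "%Y:%m:%d-%H:%M:%S"

# Matches the structured tunnel events as they appear in the log above.
EVENT_RE = re.compile(
    r'^(?P<ts>\S+) .*name="RED Tunnel (?P<state>Up|Down)" red_id="(?P<red_id>[^"]+)"'
)


def outages(lines):
    """Pair each 'Tunnel Down' with the next 'Tunnel Up' for the same RED ID."""
    down_since = {}
    result = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match is None:
            continue
        ts = datetime.strptime(match["ts"], TS_FORMAT)
        red_id = match["red_id"]
        if match["state"] == "Down":
            # Keep the earliest Down if several are logged before the next Up.
            down_since.setdefault(red_id, ts)
        elif red_id in down_since:
            result.append((red_id, down_since.pop(red_id), ts))
    return result


sample_log = [
    '2019:08:29-05:07:24 neo-2 red_server[16917]: id="4202" severity="info" '
    'sys="System" sub="RED" name="RED Tunnel Down" red_id="A35xxxxxxxxxxxx" forced="0"',
    '2019:08:29-10:12:09 neo-2 red_server[6026]: id="4201" severity="info" '
    'sys="System" sub="RED" name="RED Tunnel Up" red_id="A35xxxxxxxxxxxx" forced="0"',
]

for red_id, down, up in outages(sample_log):
    print(red_id, "down for", up - down)  # A35xxxxxxxxxxxx down for 5:04:45
```

    For the two events in the log above this gives 5:04:45, which matches the 5+ hours mentioned at the top of this post.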

     

    Still hoping that Sophos will fix this, but urgently looking for an alternative in the meantime. Any suggestions for devices to replace the RED?

  • An alternative would be an SG1xx in RED mode, but you need a Network subscription for that, if I am right.

    But this is more expensive than a RED 15 or RED 50: currently you can get a RED 15 for 250 € (gross), while an SG105, for example, starts at 370 € (gross) without subscription.


  • Thanks for your response! This is kind of what I had in mind. Spending the extra money is slightly painful, but not as painful as the current unreliability of the RED :-/ I think we can repurpose the RED as a backup VPN solution and move to an SG or even XG device for the primary. 

  • Has anybody tried the unified firmware switch with 9.605?

    Is it still working in that version?

    Best regards

    Alex
