This discussion has been locked.

UTM 9.601 - RED issues!

Since we upgraded all our customers to 9.601, a large share of them have been complaining about REDs disconnecting and reconnecting with no discernible pattern.

It started for all of them the very night we upgraded to 9.601, and they are all on different ISPs and located in different places around the country.

I spent 2 hours with Sophos support today, and they have now escalated it to a higher level.

Will return with an update....

Suspicious entries in the log (although all connected REDs log this before connecting):

2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems

2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx

I know the last line is written just before the tunnel disconnects, because no "PING/PONG" answer was received...
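
For anyone trying to find a pattern in these drops, here is a rough sketch that counts both message types per peer address in an exported log. The regular expressions are inferred only from the two sample lines quoted above, not from any Sophos log specification, and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Patterns inferred from the two sample lines quoted in this post only;
# other firmware versions may word these messages differently.
HANDSHAKE_RE = re.compile(
    r"^(?P<ts>\S+) \S+ red_server\[\d+\]: SELF: Cannot do SSL handshake "
    r"on socket accept from '(?P<peer>[^']+)'"
)
KEEPALIVE_RE = re.compile(
    r"^(?P<ts>\S+) \S+ red2ctl\[\d+\]: Missing keepalive from "
    r"(?P<iface>\S+), disabling peer (?P<peer>\S+)"
)


def summarize(lines):
    """Count handshake failures and missed keepalives per peer address.

    Hypothetical helper: returns two Counters keyed by the peer address
    exactly as it appears in the log line.
    """
    handshake = Counter()
    keepalive = Counter()
    for line in lines:
        line = line.strip()
        match = HANDSHAKE_RE.match(line)
        if match:
            handshake[match.group("peer")] += 1
            continue
        match = KEEPALIVE_RE.match(line)
        if match:
            keepalive[match.group("peer")] += 1
    return handshake, keepalive
```

Feeding it the lines of an exported RED log (for example `summarize(open(path))`, where `path` is wherever you saved the export) gives per-peer counts, which makes it easier to see whether the drops cluster on particular sites or are spread evenly.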

One customer has 2 x RED 50: one is 100% stable and the other drops at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.



  • We were fortunate and didn't lose any REDs in the upgrade process.

     

    Hopefully this is isolated, but our upgrade process failed partway through and left one of the nodes stuck in up2date mode.  Apparently the primary node somehow lost the file and was unable to complete the update when its turn came up.

    It took a bit of doing to get the file on the passive node so it could complete, but we're back now.

    I just hope this resolves the RED issues for good. I'm going to be skeptical of these things for a while now.

  • Hi Jan,

     

    After upgrading the UTM to 9.702-1, the RED device is in a boot loop again.

     

    The only difference: it is now trying to connect to the UTM.


    2020:03:12-11:44:29 firewall red_server[13446]: SELF: Cannot do SSL handshake on socket accept from '<public IP adress>': SSL connect accept failed because of handshake problems
    2020:03:12-11:44:35 firewall red_server[13478]: SELF: New connection from <public IP adress> with ID <RED-ID> (cipher AES256-GCM-SHA384), rev1<30>Mar 12 11:44:35 red_server[13478]: <RED-ID>: connected OK, pushing config
    2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: Initializing connection running protocol version 0
    2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: Sending json message {"data":{},"type":"WELCOME"}
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: command '{"data":{},"type":"CONFIG_REQ"}'
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: Sending json message {"data":{"pin":"","fullbr_dns":"","split_networks":"1.2.3.4","lan2_vids":"","lan4_vids":"","local_networks":"","tunnel_id":2,"manual2_netmask":24,"asg_cert":"[removed]","manual_address":"0.0.0.0","bridge_proto":"none","unlock_code":"egbbg2nu","password":"","manual2_defgw":"0.0.0.0","prev_unlock_code":"","manual_netmask":24,"lan3_vids":"","version_r2":"2005R2","mac_filter_type":"none","mac":"00:f3:02:11:e7:d0","dial_string":"*99#","manual2_address":"0.0.0.0","version_ng_red50":"1-442-bdae8a94a-0000000","manual_dns":"0.0.0.0","lan1_mode":"unused","username":"","activate_modem":0,"tunnel_compression_algorithm":"lzo","version_red50":"1-442-bdae8a94a-0000000","fullbr_domains":"","htp_server":"<fqdn>","uplink_balancing":"failover","asg_key":"[removed]","type":"red50","deployment_mode":"online","uplink2_mode":"dhcp","version_red15":"1-433-2023f2ad6-e9f0c31","manual2_...L1504
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: command '{"data":{"message":"Firmware update required. Trying provisioning service ..."},"type":"DISCONNECT"}'
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: Disconnecting: Firmware update required. Trying provisioning service ...
    2020:03:12-11:44:38 firewall red_server[13478]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="<RED-ID>" forced="1"
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID> is disconnected.


    I also discovered that the RED's display shows WAN2 as WAN1 and WAN1 as WAN2.

    So if you have plugged in WAN1 and try to display the IP address of WAN1, the display shows N/A.

    However, if you let the RED show you the IP address of WAN2, the IP provided by DHCP is displayed...


     

    Right now I am in the German 'Warteschleife für UTM' (the UTM support hold queue), trying not to break my phone.

    I was so happy that you had finally fixed the problems your devs produced, and now there is the next glitch?

    WTF SOPHOS


  • ...oh wow... after enabling tunnel compression and disabling it again, the firmware update on the RED started... maybe now it will succeed?

    I'll keep you posted...

    UPDATE:

    After 4 reboots the device is finally up ...

  • Yes, I tried that and saw exactly the same on 5 devices :-O

    -----

    Best regards
    Martin

    Sophos XGS 2100 @ Home | Sophos v20 Technician

  • Hi,

    What is the currently recommended setting with UTM firmware 9.702-1?

     

    cc set red use_unified_firmware = 0

    or

    cc set red use_unified_firmware = 1

     

    Thanks,
    Klaus

  • So all REDs are running 9.702-1 and we are still seeing this SSL drop issue.

     

    red2ctl[15035]: Overflow happened

    SELF: Cannot do SSL handshake on socket accept from 'x.x.x.x': SSL connect accept failed because of handshake problems

     

    Honestly...

  • Moved the RED15 from our office in Italy to a customer in Austria.

     

    Got 

    SELF: Cannot do SSL handshake on socket accept from 'x.x.x.x': SSL connect accept failed because of handshake problems

     

     

    After reading this thread, I'll try the cc set red use_unified_firmware 0 and MTU 1400 tricks.

    Let's see tomorrow...

     

    Good night Sophos. It seems you're in trouble.

     

    G.

  • I'm also still getting this on a RED 50 running 9.702-1... quite annoying.

  • Hello everybody,

    Can somebody give me an update on the current state of the connection problems, boot loop problems and unified_firmware problems with RED?

    I was struggling with the boot loop problem myself at the end of 2019, and I'm still looking for a way to connect a branch to our main office using the same IP range.
    I got three RED15s via RMA, but none of them worked. By the time I found this post we had given all of them back, and I still have no connection.

    In the main office we're using an SG210 running 9.605-1, but I could update to 9.702-1.

    What about the RED 20... same problem?

     

    Greetings..
    Bruno