
UTM 9.601 - RED issues!

Since upgrading all our customers to 9.601, a large share of them have been complaining about REDs disconnecting and reconnecting with no discernible pattern.

It started for all of them the very night we upgraded to 9.601, and they are all on different ISPs and located in different places around the country.

I spent 2 hours with Sophos support today, and they have now escalated it.

Will return with an update....

Suspicious entries in the log - but all connected REDs do this before connection:

2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems

2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx

I know the last line is written before the tunnel disconnects, because there was no "PING/PONG" answer...
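To see whether the drops really follow no pattern, it can help to tally these two message types from an exported RED log. The following is only a minimal sketch: the regular expressions are inferred from the two lines quoted above, and `summarize` and `SAMPLE` are illustrative names, not part of any Sophos tooling.

```python
import re
from collections import Counter

# Log prefix as it appears in the lines quoted in this thread, e.g.
# "2019:03:06-15:15:46 fw01-2 red2ctl[12420]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)
KEEPALIVE_RE = re.compile(
    r"Missing keepalive from (?P<iface>[^,\s]+), disabling peer (?P<peer>\S+)"
)
HANDSHAKE_RE = re.compile(
    r"Cannot do SSL handshake on socket accept from '(?P<peer>[^']+)'"
)


def summarize(lines):
    """Count missed keepalives per RED interface and failed handshakes per peer."""
    keepalive = Counter()
    handshake = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a log line in the expected format
        msg = m.group("msg")
        k = KEEPALIVE_RE.search(msg)
        if k:
            keepalive[k.group("iface")] += 1
            continue
        h = HANDSHAKE_RE.search(msg)
        if h:
            handshake[h.group("peer")] += 1
    return keepalive, handshake


# The two lines quoted above, used as sample input.
SAMPLE = """\
2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems
2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx
"""

keepalive, handshake = summarize(SAMPLE.splitlines())
print("missed keepalives per interface:", dict(keepalive))
print("failed handshakes per peer:", dict(handshake))
```

Run against a full day of log lines, the per-interface counts should show whether one RED accounts for most of the missed keepalives or whether they are spread evenly.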

One customer has 2 x RED 50: one is 100% stable and the other drops at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.



This thread was automatically locked due to age.
  • I had previously switched to the non-Unified Firmware for my RED 15s (all my REDs are RED 15s), which helped noticeably, but we were still experiencing random dropouts of random REDs. I have now upgraded manually to 9.700-5. My question is: do I need to re-enable/switch back to the Unified Firmware or not?

     

    Thanks,

    Tracy 

  • Check the status by going to the command line and issuing the command "cc get red use_unified_firmware"

    If it returns a 1 then your upgrade has automatically turned the Unified firmware back on. In previous upgrades this was the default behavior. See previous posts in this thread about the best way to perform an upgrade and still retain the old RED firmware on the devices.

     

    Whether or not you "need" to be on the Unified firmware is a matter of choice. I am still seeing people here reporting big problems with the Unified firmware (note Twister's many issues). I am not using the Unified firmware until I am convinced the issues have been completely fixed.
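If the check above has to be repeated across several UTMs after each upgrade, the interpretation step can be written down once. This is a rough sketch only: it assumes the command prints a bare value and that 1 means enabled, as the post describes, and `unified_firmware_enabled` is a made-up helper name.

```python
def unified_firmware_enabled(cc_output: str) -> bool:
    """Interpret the text printed by `cc get red use_unified_firmware`.

    Assumption: the command prints a bare value, and 1 means the
    Unified firmware is switched on (as described in the post above).
    """
    return cc_output.strip() == "1"


# Paste the output captured from each UTM's shell after the upgrade.
for name, output in {"utm-a": "1\n", "utm-b": "0\n"}.items():
    state = "ON (re-enabled?)" if unified_firmware_enabled(output) else "off"
    print(f"{name}: Unified firmware {state}")
```

The host names `utm-a` and `utm-b` are placeholders for whatever output has been collected.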

  • Jan, regarding XG: as I posted before, I don't think XG runs the Unified Firmware the way 9.702 does. I have noticed that when I run firmware 2.0.18 on XG and upgrade to 2.0.19, the firmware revision does not change, nor does the RED 50 perform a firmware upgrade; it just reconnects. Here is the log from the XG. Is this intentional ?!?! Or an error:

     

    Firmware 2.0.18

    Wed Feb 19 12:50:57 2020 REDD INFO: server: Using RED firmware in /content/redfw/
    Wed Feb 19 12:50:57 2020 REDD INFO: server: RED10 fw version set to 10224R2
    Wed Feb 19 12:50:57 2020 REDD INFO: server: RED15(w) fw version set to 10224
    Wed Feb 19 12:50:57 2020 REDD INFO: server: RED50 fw version set to 10224

     

    Firmware 2.0.19

    Tue Mar  3 12:58:55 2020 REDD INFO: server: RED10 fw version set to 10224R2
    Tue Mar  3 12:58:55 2020 REDD INFO: server: RED15(w) fw version set to 10224
    Tue Mar  3 12:58:55 2020 REDD INFO: server: RED50 fw version set to 10224

     

    ?!?!?
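The comparison of the two logs above can be sketched in a few lines. The pattern is taken from the "fw version set to" lines as pasted; `fw_versions` and `changed_models` are illustrative names, and this is an assumption-laden sketch rather than anything shipped with XG.

```python
import re

# Matches lines such as:
# "Wed Feb 19 12:50:57 2020 REDD INFO: server: RED50 fw version set to 10224"
FW_RE = re.compile(
    r"REDD INFO: server: (?P<model>\S+) fw version set to (?P<version>\S+)"
)


def fw_versions(log_text):
    """Map each RED model to the firmware version announced in the log."""
    return {m.group("model"): m.group("version") for m in FW_RE.finditer(log_text)}


def changed_models(before, after):
    """Return {model: (old, new)} for models whose version differs."""
    return {
        model: (before.get(model), after.get(model))
        for model in sorted(set(before) | set(after))
        if before.get(model) != after.get(model)
    }


LOG_2_0_18 = """\
Wed Feb 19 12:50:57 2020 REDD INFO: server: Using RED firmware in /content/redfw/
Wed Feb 19 12:50:57 2020 REDD INFO: server: RED10 fw version set to 10224R2
Wed Feb 19 12:50:57 2020 REDD INFO: server: RED15(w) fw version set to 10224
Wed Feb 19 12:50:57 2020 REDD INFO: server: RED50 fw version set to 10224
"""

LOG_2_0_19 = """\
Tue Mar  3 12:58:55 2020 REDD INFO: server: RED10 fw version set to 10224R2
Tue Mar  3 12:58:55 2020 REDD INFO: server: RED15(w) fw version set to 10224
Tue Mar  3 12:58:55 2020 REDD INFO: server: RED50 fw version set to 10224
"""

before = fw_versions(LOG_2_0_18)
after = fw_versions(LOG_2_0_19)
print("changed:", changed_models(before, after))
```

An empty result means the per-model versions announced by the server are identical in both logs, which matches what the post observes.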

    -----

    Best regards
    Martin

    Sophos XGS 2100 @ Home | Sophos v20 Technician

  • Further funny evidence :-O

     

     

    -----

    Best regards
    Martin

    Sophos XGS 2100 @ Home | Sophos v20 Technician

  • Hi  

    Followed up with the team to confirm the behavior.

    • Users who are upgrading and are currently on the legacy firmware will not be forced onto the unified firmware (at the moment).

    Regards,


    Florentino
    Director, Global Community & Digital Support

  • Hi Martin,

    Not all XG installations are running the unified firmware, so it is expected that the new pattern will not install new firmware on the RED50 in all cases when it comes to XG.

    So there is nothing to worry about here: if your XG-connected RED50 is not installing new firmware, you are not on the unified firmware and no change is needed. If you had been on the unified firmware, the RED50 would have updated.

    Jan

  • Hi Jan,

     

    Thanks for explaining. It's off-topic here, but I cannot seem to find anywhere how to switch XG to the unified firmware. Even if I deploy a whole new XG 18, it's deployed with the legacy firmware?! It's kind of odd.

     

    One other thing.

    When upgrading to 9.702-1, if a RED 50's flash file structure is already in "bad" shape, the RED bricks during the upgrade to 9.702.

     

    I upgraded 36 UTMs yesterday and two RED 50s that had been on 9.701 died after moving to 9.702. So watch out and plan carefully!

    -----

    Best regards
    Martin

    Sophos XGS 2100 @ Home | Sophos v20 Technician

  • Yep!  Watch out!  I experienced the same thing as twister5800.  I updated to 9.702 last night, one RED50 updated successfully, the other one died.  The one that died was already acting odd.  It had a few disconnects over the last couple of weeks even though the connection (fiber) never had an outage.

    The one that died was only put into service on 1-16-2020.  So it wasn't like it was a really old RED50.  It was an RMA replacement with a mfg date of July 2019.

    BTW, this is dead RED50 number 7 for me.  Aren't I lucky?  Hopefully this disaster is finally over.

    Thanks,

               BP

  • We were fortunate and didn't lose any REDs in the upgrade process.

     

    Hopefully this is isolated, but our upgrade process failed partway through and left one of the nodes stuck in up2date mode.  Apparently the primary node somehow lost the file and was unable to complete the update when its turn came up.

    It took a bit of doing to get the file on the passive node so it could complete, but we're back now.

    I just hope this resolves the RED issues for good. I'm going to be skeptical of these things for a while now.

  • Hi Jan,

     

    After upgrading to 9.702-1 on the UTM, the RED device is in a boot loop again.

     

    The only difference: it is now trying to connect to the UTM.


    2020:03:12-11:44:29 firewall red_server[13446]: SELF: Cannot do SSL handshake on socket accept from '<public IP adress>': SSL connect accept failed because of handshake problems
    2020:03:12-11:44:35 firewall red_server[13478]: SELF: New connection from <public IP adress> with ID <RED-ID> (cipher AES256-GCM-SHA384), rev1<30>Mar 12 11:44:35 red_server[13478]: <RED-ID>: connected OK, pushing config
    2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
    2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: Initializing connection running protocol version 0
    2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: Sending json message {"data":{},"type":"WELCOME"}
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: command '{"data":{},"type":"CONFIG_REQ"}'
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: Sending json message {"data":{"pin":"","fullbr_dns":"","split_networks":"1.2.3.4","lan2_vids":"","lan4_vids":"","local_networks":"","tunnel_id":2,"manual2_netmask":24,"asg_cert":"[removed]","manual_address":"0.0.0.0","bridge_proto":"none","unlock_code":"egbbg2nu","password":"","manual2_defgw":"0.0.0.0","prev_unlock_code":"","manual_netmask":24,"lan3_vids":"","version_r2":"2005R2","mac_filter_type":"none","mac":"00:f3:02:11:e7:d0","dial_string":"*99#","manual2_address":"0.0.0.0","version_ng_red50":"1-442-bdae8a94a-0000000","manual_dns":"0.0.0.0","lan1_mode":"unused","username":"","activate_modem":0,"tunnel_compression_algorithm":"lzo","version_red50":"1-442-bdae8a94a-0000000","fullbr_domains":"","htp_server":"<fqdn>","uplink_balancing":"failover","asg_key":"[removed]","type":"red50","deployment_mode":"online","uplink2_mode":"dhcp","version_red15":"1-433-2023f2ad6-e9f0c31","manual2_...L1504
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: command '{"data":{"message":"Firmware update required. Trying provisioning service ..."},"type":"DISCONNECT"}'
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: Disconnecting: Firmware update required. Trying provisioning service ...
    2020:03:12-11:44:38 firewall red_server[13478]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="<RED-ID>" forced="1"
    2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID> is disconnected.
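To spot this "Firmware update required" disconnect quickly in a long log, the JSON messages can be pulled out of the red_server lines. This is a hedged sketch: the two line shapes (`command '...'` and `Sending json message ...`) are taken from the log above, the helper names are made up, and a message that the log truncates (like the long config line) is reported with type `None` rather than guessed at.

```python
import json
import re

# Two message shapes seen in the log above:
#   ... <RED-ID>: command '{"data":{...},"type":"..."}'
#   ... <RED-ID>: Sending json message {"data":{...},"type":"..."}
MSG_RE = re.compile(r"(?P<kind>command '|Sending json message )(?P<body>\{.*)$")


def protocol_messages(lines):
    """Yield (kind, type, data) for each JSON message found in the log lines."""
    for line in lines:
        m = MSG_RE.search(line.rstrip())
        if not m:
            continue
        kind = "command" if m.group("kind").startswith("command") else "sending"
        body = m.group("body").rstrip("'")  # drop the closing quote of command '...'
        try:
            parsed = json.loads(body)
        except json.JSONDecodeError:
            yield kind, None, {}  # truncated in the log; do not guess the rest
            continue
        yield kind, parsed.get("type"), parsed.get("data", {})


def firmware_disconnect(messages):
    """True if any DISCONNECT message mentions a required firmware update."""
    return any(
        msg_type == "DISCONNECT"
        and "Firmware update required" in data.get("message", "")
        for _, msg_type, data in messages
    )


# Lines copied from the log above; the long config line is shortened here.
SAMPLE = """\
2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: command '{"data":{"version":"0"},"type":"INIT_CONNECTION"}'
2020:03:12-11:44:36 firewall red_server[13478]: <RED-ID>: Sending json message {"data":{},"type":"WELCOME"}
2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: command '{"data":{},"type":"CONFIG_REQ"}'
2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: Sending json message {"data":{"pin":"","fullbr_dns":"","manual2_...
2020:03:12-11:44:38 firewall red_server[13478]: <RED-ID>: command '{"data":{"message":"Firmware update required. Trying provisioning service ..."},"type":"DISCONNECT"}'
"""

messages = list(protocol_messages(SAMPLE.splitlines()))
for kind, msg_type, _ in messages:
    print(kind, msg_type)
print("firmware-update disconnect seen:", firmware_disconnect(messages))
```

Counting how often `firmware_disconnect` fires per RED ID over time would show whether a device is stuck repeating this sequence or only went through it once.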


    Furthermore, I discovered that the RED's display shows WAN2 as WAN1 and WAN1 as WAN2.

    So if you have plugged in WAN1 and try to display the IP address of WAN1, the display shows N/A.

    However, if you let the RED show you the IP address of WAN2, the IP provided by DHCP is displayed...


     

    I am currently in the German hold queue for UTM support, and I am trying not to break my phone.

    I was so happy that you had finally fixed the problems your devs produced, and now there is the next glitch?

    WTF SOPHOS


  • ...oh wow... after enabling tunnel compression and disabling it again, the firmware update on the RED started... maybe this time it will succeed?

    I'll keep you posted...

    UPDATE:

    After 4 reboots the device is finally up...
