UTM 9.601 - RED issues!

Since we upgraded all our customers to 9.601, a large share of them have been complaining that their REDs disconnect and reconnect with no discernible pattern.

It started for all of them the same night we upgraded to 9.601, even though they are on different ISPs and located in different places around the country.

I spent 2 hours with Sophos Support today, and they have now escalated the case to a higher tier.

Will return with an update....

Suspicious entries in the log (although every RED that does connect also logs these before connecting):

2019:03:06-15:15:38 fw01-2 red_server[17509]: SELF: Cannot do SSL handshake on socket accept from 'xxx.xxx.xxx.xxx': SSL connect accept failed because of handshake problems

2019:03:06-15:15:46 fw01-2 red2ctl[12420]: Missing keepalive from reds3:0, disabling peer xxx.xxx.xxx.xxx

I know the last line is written just before the tunnel disconnects, because the "PING/PONG" keepalive went unanswered...
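To look for a pattern in the drops, the two message types quoted above can be counted per peer. The following is only a minimal sketch: the regular expressions are derived from the two sample lines in this post and nothing else, and the function names (`classify`, `summarize`) are made up for illustration. Other firmware versions may word these messages differently.

```python
import re
from collections import Counter

# Syslog prefix as it appears in the two sample lines above:
#   2019:03:06-15:15:38 fw01-2 red_server[17509]: <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<msg>.*)$"
)

# Message patterns, assumed from the quoted log lines only.
HANDSHAKE_RE = re.compile(
    r"Cannot do SSL handshake on socket accept from '(?P<peer>[^']+)'"
)
KEEPALIVE_RE = re.compile(
    r"Missing keepalive from (?P<iface>[^,\s]+), disabling peer (?P<peer>\S+)"
)


def classify(line):
    """Return (kind, timestamp, peer) for a line of interest, else None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    msg = m.group("msg")
    hs = HANDSHAKE_RE.search(msg)
    if hs:
        return ("handshake_failed", m.group("ts"), hs.group("peer"))
    ka = KEEPALIVE_RE.search(msg)
    if ka:
        return ("keepalive_missed", m.group("ts"), ka.group("peer"))
    return None


def summarize(lines):
    """Count events per (peer, kind) over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        event = classify(line)
        if event:
            kind, _ts, peer = event
            counts[(peer, kind)] += 1
    return counts
```

Feeding an open log file to `summarize` (for example `summarize(open(path))`, where `path` is wherever the RED log was exported to) gives a per-peer count. A peer with many missed keepalives but few handshake failures would point at the tunnel itself rather than at the initial connection.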

One customer has 2 x RED 50: one is 100% stable and the other drops at random intervals. We replaced the unstable one with a new RED 50, but the same thing occurs.



This thread was automatically locked due to age.
Parents
  • Hi All,

     

    I thought I would do some diagnosing after getting the reboot loop on a RED 50. Here is what I found.

    After replacing the customer's RED 50 through an RMA, I took the 'dead' unit back to my office to see whether I could replicate the issue and understand it better.

    I found that the source port for the initial communication was TCP/3400, where I was expecting this to be the destination port, although this may be a red herring.

    I then went through some more basic checks and found that the RED was configured to connect using an FQDN. I changed this to a public IP address, and that was the change that enabled it to connect every time, without problems.

    I only provide this as a way to (possibly) fix the issue.

    Of course my issue may be different to others.

     

    - Update: I have just tried this on an XG. Although the XG does load new firmware onto the RED 50, I can provision it with either an FQDN or an IP address, so it looks like Sophos forgot to add DNS resolution to the UTM config.

    XG & UTM Architect (Systems: XG v18 & UTM 9.7 - Virtual, HW & SW)
    Curious enough to take it apart, skilled enough to put it back together, Clever enough to hide the extra parts when I'm Done!
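Before swapping the FQDN for an IP address as described in the post above, it may be worth confirming from the remote site what the hostname actually resolves to and whether TCP/3400 answers on that address. The sketch below is a workstation-side check only; it does not reproduce what the RED itself does, and the names `resolve_host` and `can_connect` are hypothetical helpers, not part of any Sophos tool. The port number is the one mentioned in the post.

```python
import socket

RED_CONTROL_PORT = 3400  # TCP port mentioned in the post above


def resolve_host(hostname, resolver=socket.getaddrinfo):
    """Return the sorted unique addresses for hostname, or [] on failure.

    The resolver argument exists only so the lookup can be replaced
    in tests; by default the system resolver is used.
    """
    try:
        infos = resolver(hostname, None)
    except (socket.gaierror, UnicodeError):
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def can_connect(address, port=RED_CONTROL_PORT, timeout=5.0):
    """Return True if a TCP connection to address:port can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `resolve_host` with the UTM's public hostname and comparing the result with the public IP that worked should show whether DNS is returning a stale or unexpected address; `can_connect` can then be tried against each address returned. An empty list means the name did not resolve at all from that network.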

Children
  • Interesting, Argo!

    I have a client whose RED 15 was (seemingly) killed on 11 August by the 9.604-to-9.605 Up2Date.  When the replacement also wouldn't connect, I asked to work with someone onsite at the remote office 400 miles away from my usual interlocutor for this client.  I was suspicious that they had upgraded their service a month before the RED 15 stopped working and that their ISP had given instructions to another person in that office on setting a fixed public IP so that he could connect over the Internet without a functional RED.  I asked the guy to try getting a public IP on a laptop connected directly to the ISP's modem.  The laptop couldn't get an IP, so I asked the guy to call the ISP and have them enable DHCP for their connection.  Bingo!  The RED 15 came online as soon as the ISP flipped the switch.

    It turns out that a RED needs DHCP when it first downloads its configuration from the cloud, but it's not necessary after that.  This is why the original RED 15 was unaffected by the loss of DHCP on their connection.

    I'm having the original RED 15 shipped to me to examine.  My theory is that the firmware upgrade in the 9.604-to-9.605 Up2Date left the RED in an unconfigured state - making it require DHCP to get its configuration.  I expect to receive the device Monday or Tuesday and will report back here as well as to Sophos Support.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA