This discussion has been locked.

RED50 - We lose our connection sporadically - what shall we do? 'overflow, missing keepalive, self re-loading'

Hey Community,

For the past two weeks we have had an unstable connection: a few times a day our RED's connection to the UTM gets dropped. It does not happen at specific times or during any particular activity.

All we can see is that a new connection is requested, the old one is released and disconnected, and then the RED connects again. This is followed by an overflow and a missing keepalive on reds1.
Next we receive a keepalive, and everything seems fine again.

Below are two logs: the first one was a disconnect of about one second;
the second one (below the line) lasted about a minute.

We really need some advice, help, tips or tricks.

UTM is 9.403-4

2016:08:05-17:17:54 astaro red_server[18646]: SELF: New connection from 123.123.123.123 with ID A12312312312312 (cipher AES256-GCM-SHA384), rev1
2016:08:05-17:17:55 astaro red_server[18646]: A12312312312312: already connected, releasing old connection.
2016:08:05-17:17:56 astaro red_server[29422]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A12312312312312" forced="1"
2016:08:05-17:17:56 astaro red_server[29422]: A12312312312312 is disconnected.
2016:08:05-17:18:01 astaro red_server[18646]: A12312312312312: connected OK, pushing config
2016:08:05-17:17:58 astaro red2ctl[4301]: Overflow happened on reds1:0
2016:08:05-17:17:59 astaro red2ctl[4301]: Missing keepalive from reds1:0, disabling peer 123.123.123.123
2016:08:05-17:18:02 astaro red2ctl[4301]: Received keepalive from reds1:0, enabling peer 123.123.123.123
2016:08:05-17:18:04 astaro red_server[18646]: A12312312312312: command 'UMTS_STATUS value=OK'
2016:08:05-17:18:04 astaro red_server[18646]: A12312312312312: command 'PORTSTATE 1E04,1004,1004,1004,1E04'
2016:08:05-17:18:04 astaro red_server[18646]: A12312312312312: PORTSTATE LAN1: 1Gb/s,LAN2: Down,LAN3: Down,LAN4: Down
2016:08:05-17:18:06 astaro red_server[18646]: A12312312312312: command 'PING 0 uplink=WAN uplinkstate=0'
2016:08:05-17:18:06 astaro red_server[18646]: id="4201" severity="info" sys="System" sub="RED" name="RED Tunnel Up" red_id="A12312312312312" forced="0"
2016:08:05-17:18:06 astaro red_server[18646]: A12312312312312: PING remote_tx=0 local_rx=0 diff=0
2016:08:05-17:18:06 astaro red_server[18646]: A12312312312312: PONG local_tx=0
2016:08:05-17:18:07 astaro red_server[4291]: SELF: (Re-)loading device configurations
2016:08:05-17:18:20 astaro red_server[18646]: A12312312312312: command 'PORTSTATE 1E04,1004,1004,1004,1E04'
2016:08:05-17:18:20 astaro red_server[18646]: A12312312312312: PORTSTATE LAN1: 1Gb/s,LAN2: Down,LAN3: Down,LAN4: Down
2016:08:05-17:18:21 astaro red_server[18646]: A12312312312312: command 'PING 0 uplink=WAN uplinkstate=0'
2016:08:05-17:18:21 astaro red_server[18646]: A12312312312312: PING remote_tx=0 local_rx=0 diff=0
2016:08:05-17:18:21 astaro red_server[18646]: A12312312312312: PONG local_tx=0
2016:08:05-17:18:34 astaro red_server[18646]: A12312312312312: command 'PORTSTATE 1E04,1004,1004,1004,1E04'
2016:08:05-17:18:34 astaro red_server[18646]: A12312312312312: PORTSTATE LAN1: 1Gb/s,LAN2: Down,LAN3: Down,LAN4: Down
2016:08:05-17:18:34 astaro red_server[18646]: A12312312312312: command 'PING 0 uplink=WAN uplinkstate=0'
2016:08:05-17:18:34 astaro red_server[18646]: A12312312312312: PING remote_tx=0 local_rx=0 diff=0
2016:08:05-17:18:34 astaro red_server[18646]: A12312312312312: PONG local_tx=0

______________________________________________________________________________

2016:08:05-18:05:18 astaro red_server[27294]: SELF: New connection from 123.123.123.123 with ID A12312312312312 (cipher AES256-GCM-SHA384), rev1
2016:08:05-18:05:33 astaro red_server[27294]: A12312312312312: already connected, releasing old connection.
2016:08:05-18:05:35 astaro red_server[18646]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A12312312312312" forced="1"
2016:08:05-18:05:35 astaro red_server[18646]: A12312312312312 is disconnected.
2016:08:05-18:05:38 astaro red2ctl[4301]: Overflow happened on reds1:0
2016:08:05-18:05:38 astaro red2ctl[4301]: Missing keepalive from reds1:0, disabling peer 123.123.123.123
2016:08:05-18:05:41 astaro red2ctl[4301]: Received keepalive from reds1:0, enabling peer 123.123.123.123
2016:08:05-18:05:47 astaro red_server[27316]: SELF: New connection from 79.214.245.190 with ID A12312312312312 (cipher AES256-GCM-SHA384), rev1
2016:08:05-18:05:47 astaro red_server[27316]: A12312312312312: already connected, releasing old connection.
2016:08:05-18:05:48 astaro red_server[27316]: A12312312312312: seems to be still connected, exiting.
2016:08:05-18:06:06 astaro red_server[27294]: A12312312312312: connected OK, pushing config
2016:08:05-18:06:09 astaro red_server[4291]: SELF: (Re-)loading device configurations
2016:08:05-18:06:11 astaro red2ctl[4301]: Missing keepalive from reds1:0, disabling peer 123.123.123.123
2016:08:05-18:06:16 astaro red_server[4291]: SELF: (Re-)loading device configurations
2016:08:05-18:06:36 astaro red_server[27294]: A12312312312312: No ping for 30 seconds, exiting.
2016:08:05-18:06:36 astaro red_server[27294]: id="4202" severity="info" sys="System" sub="RED" name="RED Tunnel Down" red_id="A12312312312312" forced="0"
2016:08:05-18:06:36 astaro red_server[27294]: A12312312312312 is disconnected.
2016:08:05-18:06:57 astaro red_server[27567]: SELF: New connection from 123.123.123.123 with ID A12312312312312 (cipher AES256-GCM-SHA384), rev1
2016:08:05-18:06:57 astaro red_server[27567]: A12312312312312: connected OK, pushing config
2016:08:05-18:07:01 astaro red_server[27567]: A12312312312312: command 'UMTS_STATUS value=OK'
2016:08:05-18:07:01 astaro red_server[27567]: A12312312312312: command 'PORTSTATE 1E04,1004,1004,1004,1E04'
2016:08:05-18:07:01 astaro red_server[27567]: A12312312312312: PORTSTATE LAN1: 1Gb/s,LAN2: Down,LAN3: Down,LAN4: Down
2016:08:05-18:07:01 astaro red_server[27567]: A12312312312312: command 'PING 0 uplink=WAN uplinkstate=0'
2016:08:05-18:07:01 astaro red_server[27567]: id="4201" severity="info" sys="System" sub="RED" name="RED Tunnel Up" red_id="A12312312312312" forced="0"
2016:08:05-18:07:01 astaro red_server[27567]: A12312312312312: PING remote_tx=0 local_rx=0 diff=0
2016:08:05-18:07:01 astaro red_server[27567]: A12312312312312: PONG local_tx=0
2016:08:05-18:07:02 astaro red2ctl[4301]: Overflow happened on reds1:0
2016:08:05-18:07:02 astaro red2ctl[4301]: Missing keepalive from reds1:0, disabling peer 123.123.123.123
2016:08:05-18:07:05 astaro red2ctl[4301]: Received keepalive from reds1:0, enabling peer 123.123.123.123
2016:08:05-18:07:08 astaro red_server[4291]: SELF: (Re-)loading device configurations
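
To put numbers on the outages, the "RED Tunnel Down" / "RED Tunnel Up" events in logs like the ones above can be paired up. The following is only a rough sketch: it assumes the line format is exactly as in these excerpts, and the names `EVENT_RE` and `tunnel_outages` are made up for illustration, not part of any Sophos tooling.

```python
import re
from datetime import datetime

# Timestamp format as it appears in the excerpts, e.g. 2016:08:05-17:17:56
TS_FORMAT = "%Y:%m:%d-%H:%M:%S"

# Assumption: tunnel state changes are logged by red_server with
# name="RED Tunnel Up|Down" followed by red_id="...", as in the excerpts.
EVENT_RE = re.compile(
    r'^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) \S+ red_server\[\d+\]: '
    r'.*name="RED Tunnel (?P<state>Up|Down)" red_id="(?P<red_id>[^"]+)"'
)

def tunnel_outages(lines):
    """Return (red_id, down_time, up_time, seconds) for each Down -> Up pair."""
    down_since = {}
    outages = []
    for line in lines:
        m = EVENT_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], TS_FORMAT)
        red_id = m["red_id"]
        if m["state"] == "Down":
            # Keep the earliest Down if several arrive before the next Up.
            down_since.setdefault(red_id, ts)
        elif red_id in down_since:
            start = down_since.pop(red_id)
            outages.append((red_id, start, ts, (ts - start).total_seconds()))
    return outages
```

Fed with the lines of an exported RED log (for example `tunnel_outages(open(path))`, where `path` is wherever you saved the log), this lists each outage with its length in seconds. Note that it measures the control-channel Down/Up events only; the red2ctl "disabling peer" / "enabling peer" lines may give a shorter window.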



  • I am actually seeing this same exact issue with a RED 15w that was just installed.  I am imagining it is an issue that has been brought with one of the most recent firmware revisions because I have several other clients running older firmware with the REDs and no disconnects at all.  I will be opening a support case for this tomorrow and I will advise back here if I get any resolution.

    Thanks,
    Hugh

  • hjherron6 said:

    I am actually seeing this same exact issue with a RED 15w that was just installed.  I am imagining it is an issue that has been brought with one of the most recent firmware revisions because I have several other clients running older firmware with the REDs and no disconnects at all.  I will be opening a support case for this tomorrow and I will advise back here if I get any resolution.

    Thanks,
    Hugh

    Any news?

    The time between the disconnects is getting shorter and shorter...
    We still have no idea what the reason is, and no ideas for a workaround or similar either.

    Any help is welcome.
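
    To check whether the disconnects really are getting closer together, the gaps between consecutive "RED Tunnel Down" events can be computed from the log. This is a minimal sketch that assumes the same line format as the excerpts in the first post; `DOWN_RE` and `gaps_between_disconnects` are hypothetical names chosen for this example.

```python
import re
from datetime import datetime

# Timestamp format as in the log excerpts, e.g. 2016:08:05-18:05:35
TS_FORMAT = "%Y:%m:%d-%H:%M:%S"

# Assumption: every disconnect produces a red_server line containing
# name="RED Tunnel Down", as seen in the excerpts.
DOWN_RE = re.compile(
    r'^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) \S+ red_server\[\d+\]: '
    r'.*name="RED Tunnel Down"'
)

def gaps_between_disconnects(lines):
    """Return the seconds between each pair of consecutive Tunnel Down events."""
    times = []
    for line in lines:
        m = DOWN_RE.match(line)
        if m:
            times.append(datetime.strptime(m["ts"], TS_FORMAT))
    times.sort()  # the excerpts show that log lines are not always in time order
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

    A list whose values shrink over several days would back up the impression that the drops are becoming more frequent, which is useful evidence to attach to a support case.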

  • Hi there.  I came across your post and wanted to share my workaround for the issue you describe.  I worked on this for several weeks, the last two of them with Sophos tech support.

    It looks like there is an issue with the 9.4 software and the RED 50 specifically, because our RED 15 boxes did not have this issue.  Tech support has noted the bug and I'm awaiting a fix.  They stated that 9.405 was supposed to fix it; it didn't.


    The only way I could get this to stabilize was to downgrade to 9.356 (the latest 9.3 build) and restore my config file from backup.  Since doing this the system has been up and running with no issues whatsoever.


    Hope this helps.  It's not a solution, but at least it's a way to get your system back to normal without a remote site dropping sporadically all the time.

    Good luck.

  • One last thing: I also lambasted them for not posting advisories and/or posting notices here, on the bulletin board they provide, when issues arise.  I wasted far too much of my time on this issue, on and off, with both the ISP and Sophos to narrow down where the problem was.


    Hopefully this will result in more proactive notifications from Sophos...or at least one can hope.

  • Hi,

    I currently have a case open for this and it has been escalated to the engineering team.  After reviewing my logs, which are similar to yours, they have confirmed that something is wrong.  I will update as soon as I have made more progress.


    Thanks,
    Hugh

  • Sophos support identified this issue as a problem with the built-in wifi on the RED15w I was having issues with.  Unfortunately this seems to be a different scenario from yours, as yours is not a wireless unit, correct?  I am replacing the 15w unit with a separate RED15 and AP15 to see if that resolves the issue, and I will report back here once I know.

  • Yes, mine is a RED50 that is having the issue, no wifi.  But good to know it's more than one unit.  I'm just extremely frustrated with half-baked software that is supposed to be for enterprise use!