18.5.1 problems with XG to XG RED tunnels

We have 3 primary sites with XG125 appliances and custom-hardware boxes (4 cores, 6 GB RAM), all running 18.5.1.

There are 7 branch sites, each with 3 RED tunnels, one to each of the 3 main sites above.

Ever since we upgraded to 18.5.1, it seems that whenever the WAN gateway at a branch or main site flickers (ISP issues), the RED tunnel shows status "online", but there is NO communication with the web GUI over the RED tunnel and, sometimes, no communication with any branch local IP across the tunnel.

The only thing I've found that re-establishes the tunnel is restoring the branch site XG from a backup.

Has anyone experienced this? Like I said, everything was running for a year without issue.



Added TAGs
[edited by: emmosophos at 5:07 PM (GMT -7) on 14 Oct 2021]
  • Here's the system log viewer on the FRESNO branch site. You can see the RED data just stops flowing, with no reason given.

    2021-10-13 14:58:48  xx/LNK_FRESNO_RED transfered bytes TX: 57600 RX: 0

    2021-10-13 14:58:48  xx/SD_FRESNO_RED transfered bytes TX: 57696 RX: 0

    2021-10-13 14:58:48  xx/PHX_FRESNO_RED transfered bytes TX: 57696 RX: 0

    2021-10-13 12:17:43  xx/LNK_FRESNO_RED transfered bytes TX: 57616 RX: 38852

    2021-10-13 12:17:43  xx/SD_FRESNO_RED transfered bytes TX: 81792 RX: 65288

    2021-10-13 12:17:43  xx/PHX_FRESNO_RED transfered bytes TX: 66304 RX: 54212

  • Can you check the counterpart? Does it show the same, or does it have RX data?


  • Here's the main LNK site. You can see the disconnect, then the re-connect, after which it appears to be transferring data.

    But the logs at FRESNO show RX: 0.

    2021-10-13 15:05:55  xx/LNK_FRESNO_RED transfered bytes TX: 58080 RX: 40800

    2021-10-13 15:00:53  xx/LNK_FRESNO_RED transfered bytes TX: 40128 RX: 29444

    2021-10-13 14:57:14  xx/LNK_FRESNO_RED is now re-connected after 24000 ms

    2021-10-13 14:57:00  xx/LNK_FRESNO_RED is now disconnected

    Here's the main SD site. No disconnect, and traffic appears to be flowing.

    2021-10-13 15:01:34  xx/SD_FRESNO_RED transfered bytes TX: 57952 RX: 41180

    2021-10-13 14:56:32  xx/SD_FRESNO_RED transfered bytes TX: 57696 RX: 40924

    Here's the main PHX site. No disconnect, and traffic appears to be flowing.

    2021-10-13 14:59:50 xx/PHX_FRESNO_RED transfered bytes TX: 57696 RX: 40924

    2021-10-13 14:54:48 xx/PHX_FRESNO_RED transfered bytes TX: 57888 RX: 40992
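
For anyone comparing both ends of a tunnel the same way, the check done by eye above (TX still counting while RX sits at 0) can be sketched in Python. This is only a rough sketch: it assumes the log viewer lines have been copied out as plain text in exactly the format quoted in this thread, and the function name `stalled_tunnels` and the `SAMPLE` string are hypothetical names made up for the illustration, not part of any Sophos tooling.

```python
import re

# Matches the log lines quoted in this thread. The spelling "transfered"
# is kept exactly as the firewall logs it.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<tunnel>\S+) transfered bytes TX: (?P<tx>\d+) RX: (?P<rx>\d+)"
)


def stalled_tunnels(log_text):
    """Return tunnels whose most recent sample has TX > 0 but RX == 0."""
    latest = {}
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip connect/disconnect messages and blank lines
        ts = m["ts"]
        # Timestamps in this format sort correctly as plain strings.
        if m["tunnel"] not in latest or ts > latest[m["tunnel"]][0]:
            latest[m["tunnel"]] = (ts, int(m["tx"]), int(m["rx"]))
    return sorted(
        name for name, (_, tx, rx) in latest.items() if tx > 0 and rx == 0
    )


# The FRESNO branch lines from the thread, newest first as in the log viewer.
SAMPLE = """\
2021-10-13 14:58:48  xx/LNK_FRESNO_RED transfered bytes TX: 57600 RX: 0
2021-10-13 14:58:48  xx/SD_FRESNO_RED transfered bytes TX: 57696 RX: 0
2021-10-13 14:58:48  xx/PHX_FRESNO_RED transfered bytes TX: 57696 RX: 0
2021-10-13 12:17:43  xx/LNK_FRESNO_RED transfered bytes TX: 57616 RX: 38852
2021-10-13 12:17:43  xx/SD_FRESNO_RED transfered bytes TX: 81792 RX: 65288
2021-10-13 12:17:43  xx/PHX_FRESNO_RED transfered bytes TX: 66304 RX: 54212
"""

for tunnel in stalled_tunnels(SAMPLE):
    print(f"{tunnel}: sending but receiving nothing")
```

Run against the FRESNO lines, this flags all three tunnels. Run against the main-site lines, it flags none, which matches the one-sided picture in the logs above.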
