WAN Gateway Keeps Going Down

I keep getting "Gateway Came Up" notices from XG for the WAN port. Sometimes these are preceded by "Gateway Down" but only rarely. From what I can tell, internet access goes down when that happens. 

I recall something similar happening with UTM, which I used before XG. I was able to go through the very detailed logs that UTM had, located the specific log entry with more detail on the nature of the problem, then eventually found the solution in the forums - it had to do with a known problem with the Intel NIC I have in my box and the underlying OS for UTM and could be remedied by changing a setting through the CLI. I was going to just try the solution I had found for UTM, but unfortunately can't seem to find the post in the forums that described it (and don't recall what the setting was). I am relatively certain it is not a problem with my ISP or the modem - when I was using UTM (and after I fixed this issue) the ISP and the modem rarely went down.

In any event, I suspect it might be the same issue here, but am having difficulty figuring out how to diagnose it. The log viewer doesn't seem to be of much help, as log details are somewhat limited. In fact, after going through all logs at the time of the outage, I can't even see the outage being logged at all.

Not necessarily looking for a solution off the bat, as I know that will depend on a number of things - really just trying to figure out how best to try to diagnose the issue. Any suggestions would be most appreciated.



This thread was automatically locked due to age.
  • Hi,

    Show me the settings for Gateway failover in the WAN Link Manager. It could also be an issue related to the IP address the gateway is configured to ping. I would configure the gateway to ping 8.8.8.8 and to check TCP port 443 on x.x.x.x (the google.com IP resolved through nslookup).

    Thanks

    Sachin Gurung
    Team Lead | Sophos Technical Support
    Knowledge Base  |  @SophosSupport  |  Video tutorials
    Remember to like a post.  If a post (on a question thread) solves your question use the 'This helped me' link.
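
For anyone who wants an independent record of when connectivity actually drops, here is a minimal Python sketch of the same kind of TCP port check, run from a machine behind the XG. It is not how the XG performs its own failover checks; the function names, the 10-second interval, and the 3-second timeout are all assumptions made for illustration.

```python
import socket
import time
from datetime import datetime


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def describe_change(previous, current):
    """Return 'UP' or 'DOWN' when the state changed, otherwise None."""
    if previous == current:
        return None
    return "UP" if current else "DOWN"


def monitor(host, port, interval=10.0, probe=tcp_probe, max_checks=None):
    """Probe repeatedly and print a timestamped line on every state change.

    Runs until interrupted unless max_checks is given. Returns the list of
    lines it printed (only reachable when max_checks is set).
    """
    events = []
    previous = None  # unknown at start, so the first result is always reported
    checks = 0
    while max_checks is None or checks < max_checks:
        current = probe(host, port)
        change = describe_change(previous, current)
        if change:
            stamp = datetime.now().isoformat(timespec="seconds")
            line = f"{stamp} {host}:{port} {change}"
            print(line)
            events.append(line)
        previous = current
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval)
    return events
```

Calling `monitor(target_ip, 443)` with the IP address chosen for the failover rule prints one line per transition, which can then be lined up against the XG's "Gateway Down" and "Gateway Came Up" notices. It measures connectivity end to end from the LAN, so a gap here with no matching notice (or the reverse) is itself a useful clue.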

  • Thanks for the suggestion Sachin. I have not enabled failover. Should I do so in order to diagnose? 

    Oops - further edit: My bad - I meant that I had changed the setting on the backup WAN gateway to manual. I've now changed it per your suggestion. It had been set only to ping the external IP address that was assigned to the gateway (which I think was the default).

    Thanks again. Will wait to see if the problem goes away.

  • Alas, this did not seem to work. I received an alert that there was a change in gateway status, with current status up. For some reason the number of notices indicating a change in status to up significantly outnumbers the notices indicating a change in status to down...

    If anyone has any further thoughts on how I might go about diagnosing this issue they would be most appreciated.

  • It could be an issue related to the MTU value. Try setting the MTU to 1350. If that works, check with your ISP to help you find the largest value that works. If it doesn't work, set the MTU back to its original value. Finally, place an unmanaged switch between the XG and the ISP modem.

    Keep me updated.

    Sachin Gurung
    Team Lead | Sophos Technical Support
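
As a rough way to find the largest value that works without guessing, the search can be automated from a Linux machine behind the XG. The sketch below is an assumption-laden illustration, not a Sophos tool: it relies on the Linux iputils `ping` options `-M do` (set don't-fragment), `-s` (payload size), `-c` (count) and `-W` (timeout), it assumes IPv4 (28 bytes of IP plus ICMP headers on top of the payload), and the function names and the 576 to 1500 search range are made up for the example.

```python
import subprocess

IPV4_ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def ping_fits(host, payload_size):
    """Return True if one don't-fragment ping with this payload gets a reply.

    Assumes Linux iputils ping; other platforms use different flags.
    """
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "2", "-s", str(payload_size), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_mtu(fits, low=576, high=1500):
    """Binary-search the largest MTU in [low, high] whose payload passes.

    fits(payload_size) must return True or False and is assumed to be
    monotonic: if a size passes, every smaller size passes too.
    Returns None if even the smallest size fails.
    """
    if not fits(low - IPV4_ICMP_OVERHEAD):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if fits(mid - IPV4_ICMP_OVERHEAD):
            low = mid
        else:
            high = mid - 1
    return low
```

For example, `largest_working_mtu(lambda size: ping_fits(target_ip, size))` searches against whatever host you pick as `target_ip`. A single dropped ping during an outage would skew the result, so it is worth running the search more than once and at a time when the link is stable.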

  • Thank you Sachin. I will try changing the MTU and will consider the use of an intervening switch (though my preference would be to avoid that).

    That being said, my original post was perhaps less oriented to seeking a direct solution at first, and more oriented to attempting to diagnose the issue to identify the actual root cause. For example, when something similar happened with UTM, I had access to very detailed logs relating to the outages, which in turn led to a permanent fix that worked perfectly (it was a known glitch in the specific Intel NIC chip not playing well with the OS used for UTM and a minor change to an obscure setting in UTM through its CLI resolved it). I was wondering if similarly detailed logs are available in XG. I've gone through all the logs accessible through the log viewer link in the top right hand corner of the management page, but it doesn't seem to provide much detail at all. In fact, I wasn't even able to find an event identifying the change in status of the gateway. Is there somewhere else I should be looking for more detailed logs? Or a setting that could be changed to provide more detailed logs, similar to those provided in UTM?

  • I happened to be online right at the time there was an interruption in connectivity. It seems like something odd is going on with XG. When I noticed the loss of connectivity, I tried accessing XG. I was able to log in, and the Control Center page started loading, but before the graphs appeared, it kicked me back out to the login screen. This happened three times in a row. By the time I was finally able to log in, connectivity was back up. I haven't seen that behaviour previously and don't know if the two are necessarily related, but it would seem to be quite an odd coincidence if they were not.

  • Do NOT use Google DNS for ping tests, use a dedicated ping responder.  http://stats.es.net/ServicesDirectory/ has all kinds of dedicated responders with geographic, IPv4 and IPv6 filters.

  • Hi, 

    Generally, gateway issues are related to hardware speed negotiation and the MTU value. In some instances, the ISP gateway doesn't respond to ARP requests from the XG, but that is an issue on the ISP's end. Also, dead gateway detection events are logged in the dgd.log file.

    Thanks

    Sachin Gurung
    Team Lead | Sophos Technical Support

  • Thank you again Sachin. Would hardware speed negotiation and MTU values also be the cause of the rather odd behaviour of XG?

    And thank you for referencing the dgd.log. However, for the life of me I can't figure out where it is or how to view it. I've searched through the forums and reviewed the documentation, but the only thing I'm able to find is the log viewer in the upper right hand corner of the UI. I have already gone through each of them and had also checked my settings but could not find any reference to a dgd.log, or for that matter any other detailed logs along the same lines as the ones found in UTM.

    Would it be accessible through SSH? If so, could I perhaps trouble you for the location of the logs and perhaps the commands that could be used to export, save or send them to another computer for review?

  • Thank you for the suggestion Matt. I'll give it a try.