DNS issue on Sophos? Log message "host unreachable resolving"

Hello Community,

we have sporadic issues with DNS requests within our network.

Sometimes, out of the blue, users complain that they can no longer access multiple websites, or that loading them takes a very long time.

When I check our Sophos UTM logs, I see many messages like the following:

named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
named[17516]: host unreachable resolving './A/IN': xxx.xxx.xxx.xxx#53
 
Or log messages like this:
named[8502]: REFUSED unexpected RCODE resolving 'xx.xx.xxxxxx.xxxx/A/IN': xxx.xxx.xxx.xxx#53
named[8502]: REFUSED unexpected RCODE resolving 'xx.xx.xxxxxx.xxxx/A/IN': xxx.xxx.xxx.xxx#53
named[8502]: REFUSED unexpected RCODE resolving 'xx.xx.xxxxxx.xxxx/A/IN': xxx.xxx.xxx.xxx#53
named[8502]: REFUSED unexpected RCODE resolving 'xx.xx.xxxxxx.xxxx/A/IN': xxx.xxx.xxx.xxx#53
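
To see which upstream server the errors cluster on, the two message shapes above can be tallied per server. This is only a minimal sketch: the regular expression is derived from the lines quoted in this post, the function name `tally_named_errors` is made up for illustration, and the 192.0.2.x addresses in the usage comment are documentation placeholders, not the real forwarders.

```python
import re
from collections import Counter

# Matches the two named message shapes quoted in this post, e.g.
#   named[17516]: host unreachable resolving './A/IN': 192.0.2.1#53
#   named[8502]: REFUSED unexpected RCODE resolving 'a.example/A/IN': 192.0.2.2#53
NAMED_ERROR = re.compile(
    r"named\[\d+\]: (?P<kind>host unreachable|REFUSED unexpected RCODE) "
    r"resolving '(?P<query>[^']*)': (?P<server>[^#\s]+)#(?P<port>\d+)"
)


def tally_named_errors(lines):
    """Count matching named errors per (kind, upstream server) pair."""
    counts = Counter()
    for line in lines:
        match = NAMED_ERROR.search(line)
        if match:
            counts[(match.group("kind"), match.group("server"))] += 1
    return counts


# Usage (hypothetical file name):
#   with open("named.log") as fh:
#       for (kind, server), n in tally_named_errors(fh).most_common():
#           print(n, kind, server)
```

If nearly all "host unreachable" entries name the same server, that server (or the route to it) is the first thing to check.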

We then have to flush the resolver cache on our Sophos UTM; after a few minutes, everything works properly again.

The DNS setup follows the best practice guide (https://community.sophos.com/products/unified-threat-management/f/management-networking-logging-and-reporting/32566/solved-dns-best-practice/109152)

Does anyone have an idea what else we could be facing here?

Thanks in advance & kind regards,

Judith



This thread was automatically locked due to age.
  • Hi There,

    What do you see in selfmon.log at this time?

    Regards

    Jaydeep

  • Hey there,

    I can see the following log messages (for the time range the issue was present):

    selfmonng[6139]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[6139]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[6139]: W check Failed increment winbcheck counter 1 - 1
    selfmonng[6139]: W triggerAction: 'cmd'
    selfmonng[6139]: W actionCmd(+):  '/var/mdw/scripts/ntlm restart'
    selfmonng[6139]: W child returned status: exit='0' signal='0'
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[5894]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[5894]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[6139]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[6139]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[5894]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[5894]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[6139]: W check Failed increment winbcheck counter 1 - 1
    selfmonng[6139]: W triggerAction: 'cmd'
    selfmonng[6139]: W actionCmd(+):  '/var/mdw/scripts/ntlm restart'
    selfmonng[6139]: W child returned status: exit='0' signal='0'
    selfmonng[6139]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[6139]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[5894]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[5894]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[6139]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[6139]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[6139]: W check Failed increment winbcheck counter 1 - 1
    selfmonng[6139]: W triggerAction: 'cmd'
    selfmonng[6139]: W actionCmd(+):  '/var/mdw/scripts/ntlm restart'
    selfmonng[6139]: W child returned status: exit='0' signal='0'
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 2 - 3
    selfmonng[5894]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[5894]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[6139]: W check Failed increment var_log_freeSpace counter 3 - 3
    selfmonng[6139]: W NOTIFYEVENT Name=var_log_freeSpace Level=INFO Id=153 suppressed
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[6139]: I check Failed increment var_log_freeSpace counter 1 - 3
    selfmonng[5894]: I check Failed increment var_log_freeSpace counter 2 - 3

     

    If I check this log right now, I get the same log entries.

    If I'm reading the log entries correctly, they point to the "Log Disk": it has a capacity of 224.4 GB and is 95% full.

     

  • Thanks for the logs.

    Is it possible for you to remove some older logs and free up space on the log disk so that usage drops below 80%?

    Regards

    Jaydeep

  • Currently it's not possible to delete items because we have to follow special guidelines. The automatic log file deletion is set to 1 year.

  • I understand.

    It would be worth opening a service request now. Support will be able to extract the logs for the named service in selfmon and see where it goes wrong.

    Regards

    Jaydeep

  • FYI - I just had a call with Sophos support.

    They couldn't find anything unusual in the DNS logs on our Sophos.
    We added 8.8.8.8 to the DNS forwarder list (we normally use a different server for external DNS).

    The problem can't be reproduced manually, so we have to keep an eye on it and check whether the additional forwarder has helped.

     

    Many thanks for the support so far, it was much appreciated.

  • Hello,

    just wanted to give you an update and let you know that this has been fixed.

    Maybe someone will run into the same mistake/issue in the future, and this might help.

     

    The issue was that we had added a group as a DNS forwarder instead of the individual hosts; the group, however, looked like a single host to us.

    When a group is added, it seems that only the first DNS forwarder IP is used; if that one isn't reachable for a short time, the described issues appear in our network.

    The problem has not reappeared so far, but to prevent it from happening again, I deleted the DNS forwarding group and added the IPs to the DNS forwarder list one by one.
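
For anyone who wants to check each forwarder on its own rather than relying on the first entry, here is a minimal sketch using only the Python standard library. It assumes IPv4 forwarders that answer on UDP port 53; the function names, the test name `example.com`, and the 192.0.2.x address in the usage comment are placeholders for illustration, not part of the UTM configuration.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
        if label
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def probe_forwarder(server, name="example.com", timeout=2.0):
    """Return the RCODE sent back by one forwarder, or None if no usable reply arrives."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return None
    # Ignore replies that are too short or do not echo our query ID
    if len(reply) < 12 or reply[:2] != query[:2]:
        return None
    flags = struct.unpack("!H", reply[2:4])[0]
    return flags & 0x000F  # low four bits of the flags word are the RCODE


# Usage (placeholder addresses):
#   for ip in ["192.0.2.1", "8.8.8.8"]:
#       print(ip, probe_forwarder(ip))
```

A return value of 0 means NOERROR and 5 means REFUSED; None means the forwarder did not send a usable reply within the timeout.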

     

    Thanks for the support.

  • Hallo Judith and welcome to the UTM Community!

    You might want to take a look at DNS best practice.

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hi Bob,

     

    many thanks for the link.

    I already checked out the best practice guide, and our DNS now seems to be configured properly.

     

    Many thanks again & best regards,

    Judith