
Random websites stop loading - DNS ??

I've been fighting an issue for months where random websites stop loading with ERR_CONNECTION_ABORTED or ERR_CONNECTION_RESET.

When this occurs, all other sites function fine.

I've turned off almost every feature on the UTM9 without resolution.

My UTM 9 Version is 9.4.10-6

My Memory utilization is averaging 42.06%

My CPU averages 1.19%

I've turned off Intrusion Prevention and Web Filtering.

I've followed the DNS Best Practices guide here:

https://community.sophos.com/products/unified-threat-management/f/management-networking-logging-and-reporting/32566/solved-dns-best-practice

I can open nslookup and look up the sites just fine.

I have even enabled ECN support

I have the same condition on every computer on my network when it occurs. 

I have pointed my machine directly to Google's DNS and OpenDNS without resolution
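Since nslookup succeeds while the browser reports a reset, it may help to test name resolution and the TCP connection as two separate steps. The following is only a minimal sketch using Python's standard `socket` module; the function name `check_site` and its return strings are made up for illustration, and a reset that happens later (for example during the TLS handshake) would not be caught by it.

```python
import socket


def check_site(host, port=443, timeout=5.0,
               resolve=socket.getaddrinfo,
               connect=socket.create_connection):
    """Report whether a failure happens at the DNS step or the TCP step.

    `resolve` and `connect` are injectable only so the logic can be
    exercised without a network; by default they are the real socket calls.
    """
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # The name could not be resolved at all.
        return f"dns-failure: {exc}"

    address = infos[0][4][:2]  # (ip, port) of the first result
    try:
        conn = connect(address, timeout=timeout)
    except OSError as exc:
        # Covers refused, reset, unreachable and timed-out connections.
        return f"tcp-failure to {address[0]}: {exc}"

    conn.close()
    return f"ok via {address[0]}"
```

Calling `check_site("www.linkedin.com")` from a workstation while a site is failing would show whether the name resolves and whether the TCP handshake completes. A "tcp-failure" result with working resolution would suggest looking beyond the DNS proxy.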

The only evidence I've found in the logs is the set of entries below.

DNS Proxy Log:

/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 8.8.4.4#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 202.12.27.33#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 199.7.83.42#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 199.7.91.13#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 193.0.14.129#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 192.33.4.12#53
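To see at a glance how many upstream servers fail in the same second, the entries can be grouped by timestamp. This is a rough sketch, not a Sophos tool: the regular expression is written only against the six lines quoted above and assumes IPv4 server addresses, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches the "network unreachable" lines quoted in the post (IPv4 only).
LINE_RE = re.compile(
    r"(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) \S+ named\[\d+\]: "
    r"network unreachable resolving '(?P<query>[^']+)': (?P<server>[\d.]+)#\d+"
)

SAMPLE = """\
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 8.8.4.4#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 202.12.27.33#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 199.7.83.42#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 199.7.91.13#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 193.0.14.129#53
/var/log/named.log:2017:02:14-20:13:54 firewall named[4333]: network unreachable resolving 'www.linkedin.com/A/IN': 192.33.4.12#53
"""


def unreachable_by_second(lines):
    """Map each timestamp to the set of servers reported unreachable."""
    result = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            result[match.group("ts")].add(match.group("server"))
    return dict(result)


for ts, servers in unreachable_by_second(SAMPLE.splitlines()).items():
    print(ts, len(servers), "servers unreachable")
# prints: 2017:02:14-20:13:54 6 servers unreachable
```

Six different servers all failing within one second looks more like a brief loss of the outbound path than a problem with any single forwarder, though that is only an inference from this short excerpt.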

The only other error I have found is in the kernel log, which I don't think would affect this:

2017:02:14-20:13:52 firewall kernel: [197707.820310] e1000e 0000:00:19.0 eth2: Reset adapter unexpectedly
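Whether the kernel message is related can be checked by comparing timestamps across the two logs. The sketch below only parses the timestamp format seen in the quoted lines and measures the gap; the helper names are hypothetical.

```python
from datetime import datetime

# Timestamp layout used in the quoted UTM log lines, e.g. 2017:02:14-20:13:52
UTM_TS_FORMAT = "%Y:%m:%d-%H:%M:%S"


def parse_utm_timestamp(text):
    """Turn a UTM log timestamp string into a datetime."""
    return datetime.strptime(text, UTM_TS_FORMAT)


def seconds_between(earlier, later):
    """Seconds elapsed from `earlier` to `later` (both UTM timestamps)."""
    delta = parse_utm_timestamp(later) - parse_utm_timestamp(earlier)
    return delta.total_seconds()


adapter_reset = "2017:02:14-20:13:52"  # kernel: eth2: Reset adapter unexpectedly
dns_errors = "2017:02:14-20:13:54"     # named: network unreachable resolving ...

print(seconds_between(adapter_reset, dns_errors))  # 2.0
```

The adapter reset precedes the DNS errors by two seconds, so the two events may well be connected. If eth2 is the External (WAN) interface, as listed later in the thread, an unexpected reset there could plausibly explain both the unreachable resolvers and the dropped browser connections.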

What else do I try to resolve this problem?

I have attached my DNS Proxy Log. It looks strange.

Please advise.

Ed

 

UTM9DNS.txt



This thread was automatically locked due to age.
  • OK, Ed, a gut-level WAG...

    Delete the OpenDNS/Google Availability Group in Forwarders from the DNS proxy [Apply], re-enter it [Apply] and, finally, flush the DNS cache.  Any luck with that?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • I have tried this, and I also tried running with only the checkbox for my ISP's DNS servers. I have flushed DNS on the UTM9, the workstations, and my AD DNS servers. I've pointed my workstations directly to the UTM9 and/or Google without any difference.

    Ed

  • The line in your log that disturbs me is:

    7:02:14-20:13:53 firewall named[4333]: no longer listening on 173.x.y.222#53

    If this is not a home license, you should get a ticket open with Support.

    Does anything look suspicious in the System messages log just before this?

    Cheers - Bob

     
  • This is a home license.  What does [4333] mean in the line?

    Can a home license open support tickets? I can check the system logs when I get home tonight.

    Thanks,

    Ed

  • 4333 is just a process number - not important for this issue.  No, a home license has no ability to interact with Sophos Support.

    At some point, since this is a home-use situation, if a reboot doesn't solve the problem, you're probably better off just getting some config backups off the UTM and re-imaging from ISO - a 10-minute fix.

    Cheers - Bob

     
  • I'm willing to try that. Are there instructions online that you know of?

    Ed

  • No big deal, Ed.  Load the backup you want to restore into the root directory of a USB memory stick.  Re-image your UTM device.  Reboot the UTM with the memory stick inserted.  Voila, you're back in business!

    Cheers - Bob

     
  • Bob,

    I proceeded to do the restore process you recommended. I've had many problems but finally got UTM running on a vanilla build, and I can't restore my backup. Every time I restore the backup, the system goes unresponsive. I've tried everything I can think of to reconnect, and it doesn't respond. I've even tried plugging my internal network into each of the 3 NICs to see if the internal network had switched to another NIC. This wasn't successful. I've tried booting with the USB backup plugged in and also tried restoring in the GUI setup, where I can choose to restore during initial setup.

    My current status is that I'm up and running on a vanilla build without any of my rulesets or definitions.

    The only thing I see that looks different (I took a screenshot of the interfaces page) is that my NICs have changed.

    Before the wipe, they were like this:

    DMZ on eth1
    External (WAN) on eth2
    Internal on eth0

    Now, after the install:

    DMZ on eth1
    External (WAN) on eth0 << Different
    Internal on eth2

    I haven't removed or added any hardware. All Ethernet jacks and cables were labeled, so they are still plugged in the same. I just reinstalled and reapplied the backup I took today (I also saved some from earlier). I'm thinking the Ethernet adapters are not being detected in the same order during the setup/build, so the restore binds the internal Linux Ethernet adapters incorrectly and UTM applies the "Friendly" names to the wrong adapters, causing it to break somehow.

    I'm thinking that if someone can tell me how to change the eth[x] names in Linux via the CLI back to how they were configured before the wipe, my restore will work.
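The hypothesis above can be modelled in a few lines. This sketch assumes the backup stores each role against an eth[x] name and that the fresh install numbered the physical ports differently. The role-to-interface mapping comes from the post, while the port labels (`port-A` and so on) are hypothetical placeholders for the physical jacks.

```python
# Role -> interface name as stored in the backup (taken from the post).
backup_roles = {"Internal": "eth0", "DMZ": "eth1", "External (WAN)": "eth2"}

# Interface name -> physical port. The port labels are hypothetical.
ports_before = {"eth0": "port-A", "eth1": "port-B", "eth2": "port-C"}
ports_after = {"eth0": "port-C", "eth1": "port-B", "eth2": "port-A"}


def role_to_port(roles, ports):
    """Resolve each role to the physical port its interface name points at."""
    return {role: ports[iface] for role, iface in roles.items()}


def misplaced_roles(roles, old_ports, new_ports):
    """Roles that end up on a different physical port after renumbering."""
    old = role_to_port(roles, old_ports)
    new = role_to_port(roles, new_ports)
    return {role: (old[role], new[role])
            for role in roles if old[role] != new[role]}


print(misplaced_roles(backup_roles, ports_before, ports_after))
# {'Internal': ('port-A', 'port-C'), 'External (WAN)': ('port-C', 'port-A')}
```

Under these assumptions, restoring the backup would put the Internal role on the jack where the WAN cable is plugged in, which would match WebAdmin becoming unreachable from the internal network.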

    Do you have any ideas on what I can attempt to change this so I can restore my backup?

    I'm now on about 4 hours of trying to fix this mess.

    Looking forward to your help :)

    Ed

  • ###UPDATE###

    So, I realized that if I disabled the onboard NIC, I could ensure that eth0 wouldn't bind to that NIC. I did that, booted up, and got the two NICs to have the correct eth[x] order. But I ended up having my "Internal" on the "DMZ" and vice versa. Once I got up, I restored my backup, allowed WebAdmin from both DMZ and Internal, and rebooted. This is when I enabled my internal NIC (on boot). Once the system was up, I connected to the WebAdmin from the DMZ, added the External (WAN) interface, and rebooted. (The order may be slightly different, as I'm asleep now.)

    All seems to be working...

    Whew...

    Going to bed now :|

    Ed

  • To permanently change the NIC order, Ed, do the following:

    # edit /etc/udev/rules.d/70-persistent-net.rules

    Save the file and then restart the ASG so the new order is loaded.
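For illustration, the edit amounts to changing the `NAME="ethN"` value on the rule line that carries each NIC's MAC address. The sketch below assumes the file uses the common persistent-net rule layout (one line per NIC with `ATTR{address}` and `NAME`); the exact contents on a given UTM may differ, the MAC addresses are documentation placeholders, and `rename_interfaces` is a hypothetical helper that only transforms text and does not touch the real file.

```python
import re

NAME_RE = re.compile(r'NAME="eth\d+"')
MAC_RE = re.compile(r'ATTR\{address\}=="([0-9a-fA-F:]{17})"')

# Hypothetical rules content; the MAC addresses are placeholders.
SAMPLE_RULES = """\
# PCI device entries (example only)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:00:5e:00:53:01", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:00:5e:00:53:03", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
"""


def rename_interfaces(rules_text, mac_to_name):
    """Return rules_text with NAME= rewritten for each MAC in mac_to_name."""
    wanted = {mac.lower(): name for mac, name in mac_to_name.items()}
    out = []
    for line in rules_text.splitlines():
        mac = MAC_RE.search(line)
        # Leave comment lines and rules for unlisted MACs untouched.
        if mac and not line.lstrip().startswith("#"):
            new_name = wanted.get(mac.group(1).lower())
            if new_name:
                line = NAME_RE.sub(f'NAME="{new_name}"', line)
        out.append(line)
    return "\n".join(out) + "\n"


# Swap eth0 and eth2 so each name follows the NIC it belonged to before.
print(rename_interfaces(SAMPLE_RULES, {
    "00:00:5e:00:53:01": "eth2",
    "00:00:5e:00:53:03": "eth0",
}))
```

Keying the change on the MAC address rather than on line position keeps each name tied to a specific physical NIC, regardless of the order in which the adapters are detected.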

    Cheers - Bob

     
  • I followed that recommendation from this site: https://networkguy.de/?p=577

    The friendly names and connections followed as well. 

    Anyway, that's fixed.

    The ORIGINAL problem, where sites don't load and show one of the errors below, still occurs:

    ERR_CONNECTION_ABORTED or ERR_CONNECTION_RESET

    Ed
