Slow page loads & performance

There are many posts concerning this same symptom; however, nearly all center on proper DNS settings and on ensuring the Web Policies are configured properly.  The issue I am experiencing does not appear to relate to any of those I have found because my testing uses: (1) the recommended DNS settings; and (2) a bare-bones config that does not filter anything.

This issue makes the XG unusable in my particular environment and, if left unresolved, the XG will be pulled from service.

Setup

  • Sophos XG:  SFVH (SFOS 17.0.0 GA); tested in both bridge and routed modes.
  • CPU:  Intel Core 2 6400 dual-core @ 2.13GHz; based on Diagnostics --> System Graphs, CPU usage never exceeds 50%, and even then only for less than 2s.
  • RAM:  4GB; based on Diagnostics --> System Graphs, Memory Usage never goes above 2.8GB.
  • HDD:  > 140GB; less than 20% occupied.
  • NICs:  Dual Broadcom BCM5721-based GbE
  • Internet connections:  I employ dual Internet connections to different providers as follows:
    • DOCSIS D3.0 4DSx4US - i.e. 140Mbps down; 13-20Mbps up. (Comcast)
    • VDSL - 30Mbps down; 12Mbps up. (AT&T)
  • NOTE:  All of the metrics mentioned above were taken during the period when I had IPS, Web Policies, and several FW rules in service.

Issue

  • Very poor page load times (minimum of 20s; oftentimes 30s or more).
  • Occurs most often on the first page load OR on a reload after some period of elapsed time (likely due to the cached DNS entry expiring).
  • The XG is the only common element in all testing, meaning the issue exists regardless of which machine or Internet connection is used.

Testing

As mentioned above, I ensured that my testing included the fixes I was able to find in the other posts discussing this symptom.  A summary of the pertinent points follows:

  • DNS set to first use Google's name-servers (8.8.8.8, 8.8.4.4) followed by 127.0.0.1.
  • DNS set to first use 127.0.0.1 followed by Google's name-servers.
  • With either of the above arrangements, domain lookup times using the Diagnostics --> Name Lookup tool are always sub-40ms; typically sub-10ms (see the timing sketch after this list).
  • Set the applicable FW rule to allow all traffic with nothing configured for IPS, Traffic Shaping, Application Filters, NAT, etc.
  • The only filtering enabled is a very basic, single-rule Web Policy set to Anyone --> Allow All, thereby allowing all traffic through.
  • I then disabled the Web Policy rule so that the default rule would take over; the default rule is also set to Allow All, thereby allowing all traffic through.
  • Performed all of the relevant test steps in the Cyberoam Troubleshooting Slow Browsing KB article.
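
For reference, here is the kind of client-side lookup check I mean - a minimal Python sketch; the hostnames are arbitrary examples, and getaddrinfo() simply uses whatever resolver the OS is pointed at (the XG or Google's servers in my testing):

```python
# Time plain DNS resolution from a client behind the XG.
# The hostnames are arbitrary examples; getaddrinfo() uses whatever
# resolver the OS is configured with.
import socket
import time

HOSTS = ["www.google.com", "www.sophos.com", "www.speedtest.net"]

for host in HOSTS:
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
        elapsed_ms = (time.perf_counter() - start) * 1000
        addrs = sorted({info[4][0] for info in infos})
        print(f"{host}: {elapsed_ms:.1f} ms -> {addrs}")
    except socket.gaierror as exc:
        print(f"{host}: lookup FAILED ({exc})")
```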

Despite all of the above config and testing, the slow page load times persist whenever a Web Policy is set in the FW rule.  Making the single change of removing the Web Policy from the FW rule immediately restores the page load times to what they would be if the XG were not even in the network - i.e. sub-3-5s.

Interesting Point: The actual throughput performance is NOT affected - only web page load times.  I have performed literally hundreds of throughput tests using Speedtest and DSL Reports - all run great once the page loads, thereby reinforcing the idea that there is some flow inspection issue going on here.
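
To make the cold-versus-warm behavior concrete, here is a minimal Python sketch (the URL is just an example): fetch a page once cold, then immediately refetch it - in my environment the first fetch is the slow one whenever the Web Policy is applied:

```python
# Time a cold page fetch versus an immediate refetch of the same URL.
# The URL is an arbitrary example; the warm refetch benefits from any
# DNS/connection state cached along the path.
import time
import urllib.request

URL = "http://www.example.com/"

def fetch(url):
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=60) as resp:
        body = resp.read()
    return time.perf_counter() - start, len(body)

for label in ("cold fetch", "warm refetch"):
    elapsed, size = fetch(URL)
    print(f"{label}: {elapsed:.2f}s for {size} bytes")
```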

Thanks.

  • Hi,

    while my link speed is not anywhere near as good as yours, I went through a lot of testing to improve my XG performance.

    I ended up using a mix of Google and OpenDNS servers (8.8.8.8, 208.67.222.222, and 8.8.4.4) on both the XG and the users' configuration, and that appears to have fixed the performance issue. That is with IPS, proxy, ATP, etc. in place.

    Looking at the Firefox status line, I see a lot of the time is taken negotiating a TLS connection.
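
    One way to put a number on that (a minimal Python sketch - the host is just an example) is to time the TCP connect and the TLS handshake separately:

    ```python
    # Time the TCP connect and the TLS handshake separately so a slow
    # TLS negotiation stands out. The host is an arbitrary example.
    import socket
    import ssl
    import time

    HOST, PORT = "www.google.com", 443

    t0 = time.perf_counter()
    sock = socket.create_connection((HOST, PORT), timeout=30)
    t1 = time.perf_counter()

    tls = ssl.create_default_context().wrap_socket(sock, server_hostname=HOST)
    t2 = time.perf_counter()

    print(f"TCP connect:   {(t1 - t0) * 1000:.1f} ms")
    print(f"TLS handshake: {(t2 - t1) * 1000:.1f} ms ({tls.version()})")
    tls.close()
    ```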

    I did try using only the ISP DNS on the XG and that didn't work very well; I then tried OpenDNS on the XG with the users using the ISP DNS, and that didn't work either.

    Ian

  • In reply to rfcat_vk:

    Yep - that is the most common solution; however, it doesn't resolve the issue I am seeing.  I've had the same Google DNS servers you mentioned configured both on the XG and on all test machines during the better part of two days of testing...the issue still persists as long as the Web Policy is enabled.

    I suspect there is a defect in how the Web Policy engine is handling changes in DNS servers.  For example, despite changing the DNS setup, the XG may need a reboot due to how those servers are loaded into the policy engine...not sure, but I've seen similar issues where the devs simply forgot to ensure a sub-system checked for a given change and the system therefore kept using the old settings.  Something along that line might be the issue...

  • In reply to cyberzeus:

    Hi,

    yes, the reboot of the XG does seem to be a common theme after some changes are made. I have found in the past that you actually had to power it off and then remove the power lead for a short time, so that when the box restarts the cache is considered stale and has to be flushed.

    I have said this before and undoubtedly will say it many times again, the QA of this product is very poor, just look at the fixes between MR-1 and MR-2.

    Ian

  • In reply to rfcat_vk:

    WOW...ummmm, I mean...WOW...need to actually disconnect power????

    Well, yeah - as much as I find this thing to be an awesome device - at least by intent...that kind of frailty is a bit disturbing.  But I also keep in mind that it's free...so, as the adage goes, you often get what you pay for.

    I did reboot but no joy.  Maybe later I will give the power disconnect a shot...

  • In reply to cyberzeus:

    In your main post you do not state whether the XG is doing DNS/DHCP for your client systems or whether you are using a system behind your XG for these services. Also, a couple of things:

    • OpenDNS = 208.67.222.222, 208.67.220.220
    • Google DNS = 8.8.8.8, 8.8.4.4

    Since you are adding 127.0.0.1 to your DNS servers, I am going to assume that DHCP/DNS for your local network is being handed out by your XG to provide local DNS lookups for your devices/systems. Have you tried the following (a lookup sketch follows this list):

    • Set the DNS server for your clients to be XG
    • Test DNS lookups of your devices from the XG
    • Test DNS lookups from one of your devices/systems
      • Do you have any timeouts before it resolves?
      • Does it resolve at all?
    • For every one that fails, do you have a DNS entry for it on your XG?
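
    If it helps, these lookup tests can be scripted so each resolver is queried directly rather than through the OS resolver - a hedged Python sketch; 172.16.16.16 is only a placeholder for whatever your XG's LAN IP is, and the test name is arbitrary:

    ```python
    # Send a raw A-record query straight to each candidate resolver so
    # per-server timeouts show up individually, bypassing the OS cache.
    # 172.16.16.16 is a placeholder for the XG's LAN address.
    import socket
    import struct
    import time

    RESOLVERS = {
        "XG":      "172.16.16.16",
        "Google":  "8.8.8.8",
        "OpenDNS": "208.67.222.222",
    }
    QNAME = "www.sophos.com"  # arbitrary test name

    def build_query(name):
        header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
        qname = b"".join(bytes([len(p)]) + p.encode() for p in name.split("."))
        return header + qname + b"\x00" + struct.pack(">HH", 1, 1)  # A, IN

    for label, server in RESOLVERS.items():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(3.0)
        start = time.perf_counter()
        try:
            sock.sendto(build_query(QNAME), (server, 53))
            sock.recv(512)
            print(f"{label} ({server}): {(time.perf_counter() - start) * 1000:.1f} ms")
        except socket.timeout:
            print(f"{label} ({server}): TIMEOUT")
        finally:
            sock.close()
    ```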

    This is going on the premise that when you make a request to the internet, the XG does a lookup of your client host to resolve it and see what policy(ies) might need to be applied.

    Also remember, on the client device/system, make sure you flush the DNS cache after making changes to any of the DNS settings mentioned above.

    Also, what are your DoS values set to on the XG?

    I have a configuration for a Home licensed XG that is using the OpenDNS servers and the systems behind the XG are set to use the XG as their DNS server.

    Hope this helps.

    -Ron

  • In reply to rrosson:

    Hi Ron,

    The XG is literally doing one thing only - a single Web Policy applied that has only the default Allow All rule enabled.  All DHCP and DNS is done external to the XG.  All DoS\Spoof protection is disabled\not-applied and is therefore not a factor.  As for the 127.0.0.1 DNS setting - it was a test step from the Cyberoam troubleshooting document and is not the typical setting - just one of the attempted test vectors.

    All DNS queries, be they directly from the XG or from clients behind the XG, complete fine using Google\Comcast\AT&T DNS servers, with typical lookup times sub-1s.

    In observing this a bit more, I am starting to suspect HTTPS - possibly scanning.  What is strange, however, is that all HTTPS scanning has been disabled both on the FW rule and in Web Protection...
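
    One way to test that suspicion directly (a rough Python sketch; the host is an arbitrary example that answers on both ports) is an A/B fetch of the same site over plain HTTP and over HTTPS with the Web Policy applied - if only the HTTPS fetch stalls, that points at TLS handling:

    ```python
    # Fetch the same host over plain HTTP and HTTPS and compare timings.
    # If only the HTTPS fetch stalls while the Web Policy is applied,
    # that points at TLS handling. The host is an arbitrary example.
    import time
    import urllib.request

    HOST = "www.example.com"

    for scheme in ("http", "https"):
        start = time.perf_counter()
        with urllib.request.urlopen(f"{scheme}://{HOST}/", timeout=60) as resp:
            resp.read()
        print(f"{scheme}: {time.perf_counter() - start:.2f}s")
    ```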

  • In reply to cyberzeus:

    OK, since you are doing DNS/DHCP external to the XG, may I recommend you do the following:

    • Setup XG to use your Internal DNS Server for DNS
    • Configure your Internal DNS server to use either OpenDNS or Google DNS ( I prefer OpenDNS ) as forwarders
    • Configure your DHCP server to give all clients your Internal DNS server to use for DNS
    • Create a firewall rule in XG to allow your Internal DNS server to the DNS servers you configured in the previous step.
      • Make sure your firewall policy only allows the internal DNS server to make DNS requests outside of your network.

    Test your response time for web requests and see if that speeds things up. Looking at my test XG VM while writing this and comparing it to my production UTM 9.x makes me wish that the XG would also act as a DNS proxy. With the XG you have to turn on the other features of the rule for inspection.

    Hope this helps

    -Ron

  • In reply to rrosson:

    I've had this issue when setting up my Sophos XG Firewall...

    My personal fix was to put my ISP's DNS servers in the DNS settings of the Sophos XG.

    You may want to check your lookups by going into the Diagnostics menu on your Sophos XG, then down to Name Lookup.

  • In reply to Gavin Ramm:

    Hi Gavin,

    I use Google's DNS servers and the lookup response times are sub-1s both from the XG and from all clients behind the XG.  The only time using the ISP's DNS servers is required is if the ISP happens to block or otherwise intercept DNS requests, which may be the case in your specific situation.  Neither of my providers does that, as evidenced by the very low DNS lookup response times.

  • In reply to cyberzeus:

    Understood; however, this is what allows my website response times to be quicker.

    EDIT: I just changed them back to 8.8.8.8 and 8.8.4.4 and they seem fine - must have been a coincidence.

  • In reply to rrosson:

    I'm not sure you're completely reading all of the information in my posts.  Aside from the "Internal DNS" suggestions, all of the other suggestions you make have already been performed and noted in my posts.

    Also, given the very low DNS query times, I am convinced DNS is not the heart of the issue - especially given that there are no actual filtering rules being employed.  Furthermore, "Internal DNS" can only mean one of two things: (1) setting up actual internal DNS servers - which I know you don't mean; or (2) a DNS proxy - which will only serve to make the issue worse, not better.

    As an aside, OpenDNS is not a good tool to use for testing...you never want to use a filtering DNS system when using DNS lookups as a testing metric...always use something as wide open as possible (such as Google) when testing DNS.

  • In reply to cyberzeus:

    Sorry, I must have missed the Sophos XG DNS test part.

    Maybe take a step back. You say that the throughput is not affected, only the website loading time.

    Maybe do a test with Chrome: press F12 to bring up the network activity window.

    Disable the rule and capture what successful web browsing times look like.

    Enable the rule again and capture what an unsuccessful web browsing time looks like.

    This could highlight where the lag time is, what it's waiting for, or whatever is failing.
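
    Something similar can be done outside the browser (a rough Python stand-in for the F12 waterfall; the host is an arbitrary example) by breaking one request into DNS, TCP connect, TLS, and time-to-first-byte so the slow phase stands out when the rule is toggled:

    ```python
    # Break a single HTTPS request into DNS, TCP connect, TLS handshake,
    # and time-to-first-byte phases. The host is an arbitrary example.
    import http.client
    import socket
    import ssl
    import time

    HOST = "www.example.com"

    t0 = time.perf_counter()
    addr = socket.gethostbyname(HOST)                         # DNS
    t1 = time.perf_counter()
    sock = socket.create_connection((addr, 443), timeout=60)  # TCP connect
    t2 = time.perf_counter()
    tls = ssl.create_default_context().wrap_socket(sock, server_hostname=HOST)
    t3 = time.perf_counter()                                  # TLS done
    conn = http.client.HTTPSConnection(HOST, timeout=60)
    conn.sock = tls        # hand the already-wrapped socket to http.client
    conn.request("GET", "/")
    conn.getresponse()                                        # first byte
    t4 = time.perf_counter()

    for phase, dt in [("DNS", t1 - t0), ("TCP connect", t2 - t1),
                      ("TLS handshake", t3 - t2), ("TTFB", t4 - t3)]:
        print(f"{phase:14s} {dt * 1000:8.1f} ms")
    ```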

  • In reply to Gavin Ramm:

    Well, I think there's more than one issue resulting in the same symptom.  Also, with stuff like DNS, because it is a system far removed and one you can't directly monitor, it's really easy to step on your own tail - so to speak - when testing.  And this is especially true when there may be more than one issue causing the same symptom.

    I believe the need for reliable DNS on the actual XG is, in part at least, so that the XG is able to reliably communicate with the Sophos cloud for things like the Cyberoam data used for the web rules and such.  I have a fairly extensive post elsewhere on this site discussing how the actual data is compiled and managed for all of the web\app filtering subsystems, and from that discussion I have gathered that a lot of communication occurs behind the scenes between the XG and the mother-ship.  And of course, reliable DNS would play an important and central role in ensuring that back-office communication takes place.  I'm not 100% sure as yet, as the discussion is still evolving, but that's what it seems like to me anyway.

    So just to confirm your experience...say, first thing in the morning after your machine has been idle for a while or even turned off, your web page load times are snappy?  And again, I'm not referring to actual throughput - that is fine - I mean only the page load time...

  • In reply to cyberzeus:

    My personal experience is that my web pages are always snappy (apart from when my internet link is being soaked), from the first time in the morning.

  • In reply to Gavin Ramm:

    Good test, but already done - I used Wireshark - when possible, it's always good to use test tools that are not on the DUT.  Nothing in the captures jumped out immediately; however, I haven't yet had the time to do the pkt-by-pkt, timing, and TCP state-machine analysis that it requires - probably this weekend.
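
    For what it's worth, part of that pkt-by-pkt pass can be scripted (a hedged sketch using the third-party scapy package; "slowload.pcap" is a placeholder for the Wireshark capture file) to pull the TCP handshake RTTs out of the capture:

    ```python
    # Pull SYN -> SYN/ACK round-trip times out of a packet capture.
    # Requires the third-party scapy package (pip install scapy);
    # "slowload.pcap" is a placeholder capture file name.
    from scapy.all import IP, TCP, rdpcap

    syn_times = {}
    for pkt in rdpcap("slowload.pcap"):
        if not (IP in pkt and TCP in pkt):
            continue
        if pkt[TCP].flags == "S":         # client SYN
            key = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
            syn_times[key] = pkt.time
        elif pkt[TCP].flags == "SA":      # server SYN/ACK
            key = (pkt[IP].dst, pkt[TCP].dport, pkt[IP].src, pkt[TCP].sport)
            if key in syn_times:
                rtt_ms = float(pkt.time - syn_times.pop(key)) * 1000
                print(f"{key[2]}:{key[3]} handshake RTT {rtt_ms:.1f} ms")
    ```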

    However, your suggestion caused me to come up with 4 or so additional tests\captures.

    Will report back with what I find...

    Thanks for the "muse" action... :=)