Optimizing web proxy – Lessons Learned

We launched our UTM behind the existing firewall, and out-of-band. This allowed us to implement Standard Mode web proxy and WAF without risking disruption to existing network functions. More recently, we moved our UTM in-line, using bridged mode, and still behind the other firewall. This allowed us to begin using all of the UTM's functions, including Transparent Web and Firewall Rules. The process has generated several surprises. In this forum, I have advocated for dual-mode proxy, and I was finally ready to practice what I have been preaching.

Traffic Volume Surprises

I assumed our Standard Proxy configuration was intercepting about 95% of our web traffic, because essentially all of our PCs are joined to a domain and usage of local-PC logins is extremely rare.   Much to my surprise, the actual volume turned out to be about 50%.   I learned that there is a lot of web traffic generated by the operating system and by fat-client applications, all of which were ignoring my GPO proxy settings.   The traffic sources included our Antivirus software, Skype, GoToMyPC, and a business-specific application.   Then I discovered that Chrome was using UDP 443 to establish sessions that bypassed the Standard Proxy as well, and this forum kindly taught me to block UDP 443 to force Chrome to use TCP 443 and the Standard Proxy.

Another surprise was the discovery that many of these fat-client applications use HTTPS with hard-coded IP addresses. An IP-address URL will almost never have a valid certificate chain, so if using HTTPS inspection, you should create an exception to skip certificate checking (trust and date) for IP addresses, using a regular expression such as “^https://\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/”. If this behavior occurred with Standard Proxy, it was rare, but once Transparent Proxy was enabled, it was common.
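
A pattern like this can be tried against a few sample URLs before it goes into an exception. What follows is only a sketch in Python: the sample URLs are invented, the pattern is written without a trailing backslash, and UTM's regex engine may differ in small ways from Python's `re` module.

```python
import re

# Assumed intent: an HTTPS URL whose host is a dotted-quad IP address,
# followed directly by a slash. Anchored at the start of the URL.
IP_URL = re.compile(r"^https://\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/")


def is_ip_url(url: str) -> bool:
    """Return True when the URL starts with https://<dotted-quad>/."""
    return IP_URL.match(url) is not None


# Invented sample URLs (documentation address ranges and example domains).
samples = [
    "https://203.0.113.10/update/check",                 # IP-address host
    "https://server1.example.com/",                      # host name
    "https://example.com/?next=https://203.0.113.10/",   # IP only in the query
    "https://203.0.113.10:8443/status",                  # IP with explicit port
]

for url in samples:
    print(is_ip_url(url), url)
```

Because the pattern expects a slash right after the address, an IP-address URL with an explicit port does not match. Whether that is what you want depends on your traffic.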

Reconfiguring Proxy Bypass Exceptions

With the out-of-band configuration, the proxy script directed web traffic in one of two directions:  either through UTM using the standard proxy, or direct mode which bypassed UTM completely.   Once the UTM was in-line with both modes enabled, the proxy script “DIRECT” destinations merely switched traffic from Standard Mode to Transparent Mode. 
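
The change in meaning of “DIRECT” can be written down as a small decision function. This is a sketch in Python of my understanding, not anything UTM runs: the function name, the return labels, and the proxy address are made up for illustration, and it assumes Transparent Mode intercepts only ports 80 and 443.

```python
def handled_by(pac_result: str, utm_inline: bool, port: int = 443) -> str:
    """Return which component ends up handling a browser's web request.

    pac_result is what the proxy script returned for the URL, either
    "PROXY host:port" or "DIRECT".
    """
    if pac_result.upper().startswith("PROXY"):
        # The browser sends the request to the Standard Mode proxy.
        return "standard proxy"
    if not utm_inline:
        # Out-of-band: direct traffic never touches the UTM.
        return "bypasses UTM"
    if port in (80, 443):
        # In-line: direct web traffic is picked up by Transparent Mode.
        return "transparent proxy"
    # In-line, other ports: left to Firewall Rules.
    return "firewall rules"


# Hypothetical proxy address, for illustration only.
print(handled_by("PROXY utm.example.local:8080", utm_inline=False))
print(handled_by("DIRECT", utm_inline=False))
print(handled_by("DIRECT", utm_inline=True))
print(handled_by("DIRECT", utm_inline=True, port=8443))
```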

At first, I assumed that I should add Transparent Mode Skip List entries for all of the proxy script exceptions, but this encountered multiple complications.   The proxy script exclusions are based on regular expressions, while the Transparent Mode Skip List uses Network Objects, so there was no obvious way to create an equivalent list.   Additionally, traffic that successfully bypasses the Transparent Web proxy will still be processed by Firewall Rules, so I would have to consider new Firewall Rules as well.   Planning how to configure three layers of bypass became overwhelming.

About this same time, we had a close call: a web reference was allowed because of a regular expression that did a “contains” match instead of a “starting with” match.   It became clear that regular expressions were too easy to implement incorrectly, so we needed to move away from regular expressions to something that was less error-prone.   I realized that Web Site tagging could provide a replacement for my proxy script exceptions and a replacement for regular expressions.
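
The difference between a “contains” match and a “starting with” match is easy to show. The host names and URLs below are hypothetical, and the sketch uses Python's `re` module rather than UTM's engine, but the anchoring problem is the same.

```python
import re

# Hypothetical allow-list pattern, written without an anchor.
UNANCHORED = re.compile(r"https?://partner\.example\.com/")
# The same pattern anchored to the start of the URL.
ANCHORED = re.compile(r"^https?://partner\.example\.com/")


def allowed(pattern: re.Pattern, url: str) -> bool:
    """search() scans the whole string, so an unanchored pattern means 'contains'."""
    return pattern.search(url) is not None


good = "https://partner.example.com/portal"
bad = "http://evil.example.net/redirect?u=https://partner.example.com/"

print(allowed(UNANCHORED, good), allowed(UNANCHORED, bad))
print(allowed(ANCHORED, good), allowed(ANCHORED, bad))
```

The unanchored pattern allows both URLs, because the trusted name appears somewhere inside the second one. The anchored pattern allows only the first.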

The Usefulness of WebSite Tagging

Our exceptions are almost always based on FQDN (server1.example.com) or Organization (*.example.com). Web Site exceptions allow these to be specified easily without special syntax: the “Include Subdomains” option makes “Example.com” apply to all possibilities, such as example.com, server1.example.com, and server2.division1.example.com. I created a tag called “Bypass Web Proxy,” and assigned it to all of the organization domains and host names that had been in the proxy script. Then the “Bypass Web Proxy” tag was assigned to an Exception object that disables every feature. The Exception object ensures that there is no bouncing between UTM features: if the web request enters through Standard Web, it exits unmodified from Standard Web. If the web request does not come from Standard Web, it passes through Transparent Web unmodified. In either case, the web request appears in the Web Filtering log rather than the Firewall Log, which is a big advantage.
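
The matching behavior I rely on from “Include Subdomains” can be described in a few lines. This is a sketch in Python of the semantics as I understand them, not UTM's implementation, and the function name is my own.

```python
def matches_site(host: str, site: str, include_subdomains: bool = True) -> bool:
    """Return True when host is the site itself or, optionally, one of its subdomains."""
    host = host.lower().rstrip(".")
    site = site.lower().rstrip(".")
    if host == site:
        return True
    # Require a dot before the site name so that "badexample.com"
    # is not mistaken for a subdomain of "example.com".
    return include_subdomains and host.endswith("." + site)


for name in [
    "example.com",
    "server1.example.com",
    "server2.division1.example.com",
    "badexample.com",
    "example.com.evil.net",
]:
    print(matches_site(name, "Example.com"), name)
```

The first three names match and the last two do not, which is the “starting with” versus “contains” safety that was hard to get right with hand-written regular expressions.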

We have created a few other tags following the same pattern:  

  • “Uncategorized Override” because UTM has a soon-to-be-resolved problem that causes too many sites to be uncategorized.
  • “Country Blocking Override” which disables URL filtering so that the site is exempt from Country Blocking.
  • “Not Allowed” for sites that we distrust but have not been blacklisted yet by Sophos.

Implementation note:  after assigning a tag to a new website, it sometimes seems necessary to turn the Exception object off, then back on, to force the newly-tagged website to be integrated into the Exception processing.

For Improved Visibility, Avoid Firewall Traffic

Before settling on this design, I found myself staring at Firewall Log entries, trying to determine what the log entry represented, and whether it should be allowed or blocked. The Firewall log only has IP addresses, not URLs. Frequently, I would find that a reverse-DNS lookup of the destination IP address would provide no useful information, or no information at all, so I was at a loss to make a decision. In cases with a persistent source port and IP, I had to log onto the source PC and use the “netstat -a -b -n” command to determine which source process was initiating the traffic.

When evaluating Country Blocking problems, the Firewall log indicates that Country Blocking is the reason for a blocked packet, but does not supply the Country name.   My Firewall logs do not show allowed traffic because I have been afraid that the volume would be excessive and create unpredictable performance problems, but my Web Logs include both allowed and blocked entries.   I realized that to understand my traffic as much as possible, I wanted web traffic handled by the web proxies, not by Firewall Rules.

A Special Surprise from Skype

I found PCs with Skype v7 trying to connect through the Transparent Proxy to many sites on many destination ports, in defiance of Microsoft documentation.  One bit of Microsoft documentation suggested upgrading to the latest version of Skype, but the installed software said that there were no updates available.   After a fair amount of travail, the solution was to remove Skype v7 and download Skype v8.  It seems to operate cleanly, using only port 443.

Finished Result:  Visibility to Normal Network Traffic

A while back, I read a cybersecurity book that talked about a nation-state espionage attack that infected a weapons system contractor, then systematically pushed large volumes of weapon system data from the victim's country to the attacking country. The story left me wondering how I would ever detect an infection of that type on my network. I simply did not know how to begin constructing a model of my “normal” traffic flows. The Transparent Proxy rollout has provided something close to an answer to that question:

  • Standard Mode Web proxy handles almost exclusively browser-based traffic, including URLs for FTP sites and web sites with non-standard ports.
  • Transparent Mode Web proxy handles almost exclusively fat-client traffic, using ports 80 and 443.
  • The User Agent field on both proxies helps to identify the browser or application that initiated the traffic.
  • Transparent FTP proxy handles FTP traffic that bypasses the Standard Mode Web, but the logging is not very useful. Because FTP does not provide encryption, our FTP usage is minimal anyway.
  • Firewall Rules handle non-web traffic.

The web filtering logs provide URLs, IP pairs, and Country names, so it becomes conceptually possible to do detailed modelling for what constitutes my “normal” network traffic.
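
As a starting point for that kind of modelling, the log lines can be tallied by field. The sketch below is in Python and assumes the log is a series of key="value" pairs. The sample lines and every field name other than country are invented for illustration, so check them against your own Web Filtering log before relying on them.

```python
import re
from collections import Counter

# Matches key="value" pairs; the value may contain anything except a double quote.
PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_line(line: str) -> dict:
    """Turn one log line into a dict of its key="value" fields."""
    return dict(PAIR.findall(line))


def summarize(lines, field: str) -> Counter:
    """Count how often each value of the given field appears."""
    counts = Counter()
    for line in lines:
        value = parse_line(line).get(field)
        if value:
            counts[value] += 1
    return counts


# Invented sample lines; real entries carry more fields than this.
SAMPLE = [
    'action="pass" srcip="10.2.0.15" dstip="203.0.113.10" '
    'url="https://update.example.com/a?v=1" country="United States" ua="ExampleUpdater/1.0"',
    'action="pass" srcip="10.2.0.16" dstip="203.0.113.20" '
    'url="https://www.example.org/" country="United States" ua="Mozilla/5.0"',
    'action="block" srcip="10.2.0.15" dstip="198.51.100.7" '
    'url="https://files.example.net/x" country="Germany" ua="ExampleUpdater/1.0"',
]

for field in ("country", "ua", "action"):
    print(field, summarize(SAMPLE, field).most_common())
```

Run over a few weeks of real logs, counts like these give a baseline per country and per User Agent, and a new country or an unfamiliar application then stands out.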

Should All Traffic be forced to Standard Mode?

I began looking at how to enforce Standard Proxy at the machine level, but quickly ran into challenges.   The Group Policy mechanism is designed to push proxy settings to users, not machines.   “Loopback mode” can be used to apply the user setting to a machine, but it may do too much – it cannot be used to apply just a single setting to all users.   There is also a machine policy setting to “apply proxy to all users”, but the Microsoft documentation is vague about where it obtains the setting to be applied globally.

For web traffic initiated by the operating system, I have discovered this command:

   netsh winhttp set proxy proxy-server="proxyaddress;DIRECT" bypass-list="<local>;*.domain1;*.domain2"

This forces anything that uses the WinHTTP API to use standard proxy mode.   I have not found an easy way to deploy the setting using group policy; a startup script seems the most viable.   Even if I could find a way to deploy it, would this be wise?

My Standard Proxy uses AD SSO authentication, while my Transparent Proxy is configured for No Authentication.   The decision to omit authentication for Transparent Mode was initially caused by documented problems with AD SSO on a bridged configuration.   After evaluating my actual traffic, I realize that requiring authentication for Transparent Mode traffic would probably cause a lot of problems with the operating system functions and fat-client applications that could not pass NTLM credentials and could not respond usefully to an “authentication required” challenge.  Consequently, I expect to leave Transparent Mode without authentication even if the AD SSO restriction is eventually resolved.

  • Great work, as usual, Doug!  I've prioritized this thread to be at the top of the Web Filtering forum.

    Some suggestions for the above...

    You can add UDP to 'Allowed Target Services' to allow Standard mode Profiles to handle the Chrome traffic.  Blocking UDP 443 with a firewall rule does force Chrome to use TCP if it wasn't "aimed" at the Proxy, so I was glad to see you included that.

    Instead of skipping all SSL scanning for IP addresses, have you tried just skipping trust and date checks?

    Doesn't the netsh command in the last section incorrectly have a quote mark at the beginning?

    Cheers - Bob

  • In reply to BAlfson:

    Just to clarify, you are saying that Chrome behavior follows this hierarchy:

    1. Try UDP 443 using the standard proxy
    2. Try UDP 443 bypassing the standard proxy
    3. Try TCP 443 using the standard proxy
    4. Try TCP 443 bypassing the standard proxy

    So the reason that I saw traffic bypassing the proxy on UDP 443 was that I had not enabled UDP 443 on the non-standard ports list (since I did not know that I had any reason to do so).   

    To my mind, #2 and #3 should be flipped if Google is serious about security.

    I have made the edits to address your other suggestions.  Thanks.

  • In reply to DouglasFoster:

    I doubt that Chrome tries to bypass the Proxy, Doug.  I think it just tries UDP 443 and only "fails over" to TCP if a UDP request isn't answered.  I suspect it does that for every new connection it makes, but I haven't watched it.

    Cheers - Bob

  • In reply to BAlfson:

    Nice post. I'm currently re-evaluating our proxy setups.

    Servers = 10.1.0.0/16 = transparent with no auth (no DHCP on this subnet)
    Clients = 10.2.0.0/16 = standard using AD SSO distributed via DHCP option 252 & DNS (multitude of browsers) and clients
    Guest = 172.31.0.0/16 = transparent no auth (should I change to standard no auth?)

    We have redundancy with a 2nd UTM cluster so I'm thinking of removing all http/https FW rules and only allowing non-web / non-standard traffic through the firewall.

    I think the main thing to remember with the UTM is to do the firewall rules last. Set up all of the proxies first and then do the FW rules for the remainder.

    I'm sure we've all done it the other (normal) way and got confused as to why traffic is bypassing FW rules etc. I've spent many a moon now removing FW rules, minimizing them, using WAF etc.

  • In reply to Louis-M:

    I tend to assume that for a guest network, it is impractical to use standard mode -- either guests will not know my required settings, or will not know how to configure the settings on their device, or will be on a locked-down system where they cannot be configured at all.   So I assume that guests should be in Transparent mode, and that the only people using Standard Mode will be hackers trying to beat my defenses.   

    If you share my paranoia, then you want to be aware that Transparent Mode Filter Profiles also enable Standard Mode.   To prevent it from being used, you will need a Standard Mode profile for the same guest network range, with higher evaluation priority, which is linked to a filter profile to block everything.

    The exception object to skip all checks has worked very well for me, so I agree that you should be able to have a firewall rule to block all traffic on TCP 80, TCP 443, and UDP 443.

    Some of this may be redundant with my longer post; I have not reread it recently. Thanks for the affirmation that it was helpful to you.

  • In reply to DouglasFoster:

    DouglasFoster

    I tend to assume that for a guest network, it is impractical to use standard mode -- either guests will not know my required settings, or will not know how to configure the settings on their device, or will be on a locked-down system where they cannot be configured at all.   So I assume that guests should be in Transparent mode, and that the only people using Standard Mode will be hackers trying to beat my defenses.  

    Most browsers start off with "Auto detect proxy settings" which use WPAD and other techniques to determine the standard mode proxy.  You can run a guest network with standard mode, but I suspect that practically, you'll need transparent mode as well.

  • In reply to Michael Dunn:

    Hmmm, maybe I should comment...  "Auto detect proxy settings" causes more problems with the UTM than it solves.  I don't recommend selecting that in your browser.  Explicitly configure that tab in Internet Settings or run the risk of random problems.

    Cheers - Bob

  • In reply to BAlfson:

    I do not know the real world.  But every company I've worked for has used "Auto detect" and AFAIK it is common with our customers that use explicit proxy.

  • In reply to Michael Dunn:

    I admit that it's been a long time since I saw these problems, Michael - probably over five years.  Perhaps my habit of avoiding "Auto detect" caused me to miss a fix.  I'll see what I can learn.  Thanks!

    Cheers - Bob

  • A small issue to keep in mind when performing log analysis - Web Filter does not collect data that it does not need.  These are two situations where I know that the principle applies:

    • If you want country="name" information in the logs, you must enable country blocking, even if you choose not to block anything.
    • If you want device="n" codes in the logs, you must enable device-specific authentication in your Filter Profiles.