Optimizing web proxy – Lessons Learned

Question

We launched our UTM behind the existing firewall, and out-of-band. This allowed us to implement Standard Mode web proxy and WAF without risking disruption to existing network functions. More recently, we moved our UTM in-line, using bridged mode, and still behind the other firewall. This allowed us to begin using all of the UTM's functions, including Transparent Web and Firewall Rules. The process has generated several surprises. In this forum, I have advocated for dual-mode proxy, and I was finally ready to practice what I have been preaching. 
 Traffic Volume Surprises 
 I assumed our Standard Proxy configuration was intercepting about 95% of our web traffic, because essentially all of our PCs are joined to a domain and usage of local-PC logins is extremely rare. Much to my surprise, the actual volume turned out to be about 50%. I learned that there is a lot of web traffic generated by the operating system and by fat-client applications, all of which were ignoring my GPO proxy settings. The traffic sources included our antivirus software, Skype, GoToMyPC, and a business-specific application. Then I discovered that Chrome was using UDP 443 (the QUIC protocol) to establish sessions that bypassed the Standard Proxy as well, and this forum kindly taught me to block UDP 443 to force Chrome to use TCP 443 and the Standard Proxy. 
 Another surprise was the discovery that many of these fat-client applications use HTTPS with hard-coded IP addresses. An IP-address URL will almost never have a valid certificate chain, so if you use HTTPS inspection, you should create an exception to skip certificate checking (trust and date) for IP addresses, using a regular expression such as "^https://\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\/". If this behavior occurred with Standard Proxy, it was rare, but once Transparent Proxy was enabled, it was common. 
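 For anyone who wants to sanity-check that pattern before pasting it into an exception, here is a minimal Python sketch. It assumes Python's `re` syntax behaves like the UTM's regex engine for this pattern, which I have not verified, and the sample URLs are made up.

```python
import re

# Pattern from the post: an HTTPS URL whose host is a dotted-quad IP address.
# Assumption: the UTM regex engine treats ^, \d, {1,3} and \. like Python's re.
IP_URL = re.compile(r"^https://\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/")


def is_ip_address_url(url: str) -> bool:
    """Return True when the URL starts with https://<dotted-quad>/."""
    return IP_URL.match(url) is not None


if __name__ == "__main__":
    # Made-up sample URLs (documentation address ranges and example domains).
    for url in (
        "https://203.0.113.10/update/check",
        "https://www.example.com/203.0.113.10/",
        "https://203.0.113.10.example.com/",
        "https://203.0.113.10:8443/",
    ):
        print(url, is_ip_address_url(url))
```

 Note that the trailing slash means a URL with an explicit port (the last sample) does not match. Whether that matters depends on how the proxy normalizes URLs before matching, which I have not checked.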
 Reconfiguring Proxy Bypass Exceptions 
 With the out-of-band configuration, the proxy script directed web traffic in one of two directions: either through UTM using the standard proxy, or direct mode which bypassed UTM completely. Once the UTM was in-line with both modes enabled, the proxy script “DIRECT” destinations merely switched traffic from Standard Mode to Transparent Mode. 
 At first, I assumed that I should add Transparent Mode Skip List entries for all of the proxy script exceptions, but this encountered multiple complications. The proxy script exclusions are based on regular expressions, while the Transparent Mode Skip List uses Network Objects, so there was no obvious way to create an equivalent list. Additionally, traffic that successfully bypasses the Transparent Web proxy will still be processed by Firewall Rules, so I would have to consider new Firewall Rules as well. Planning how to configure three layers of bypass became overwhelming. 
 About this same time, we had a close call: a web reference was allowed because of a regular expression that did a “contains” match instead of a “starting with” match. It became clear that regular expressions were too easy to implement incorrectly, so we needed to move away from regular expressions to something that was less error-prone. I realized that Web Site tagging could provide a replacement for my proxy script exceptions and a replacement for regular expressions. 
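 To illustrate the kind of mistake described above, here is a small Python sketch contrasting a “contains” pattern with a “starting with” pattern. Both patterns and all URLs are hypothetical examples, not our actual exceptions, and Python's `re` stands in for the UTM's regex engine.

```python
import re

# Hypothetical exception meant to allow only hosts under example.com.
# "Contains" match: no anchor, so the text may appear anywhere in the URL.
UNANCHORED = re.compile(r"example\.com")

# "Starting with" match: anchored at the scheme, optional subdomain labels,
# optional port, and a slash so the host name cannot continue.
ANCHORED = re.compile(r"^https?://([A-Za-z0-9-]+\.)*example\.com(:\d+)?/")


def allowed(pattern: re.Pattern, url: str) -> bool:
    """search() looks anywhere in the URL; only a ^ in the pattern anchors it."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    urls = (
        "https://server1.example.com/app",            # intended match
        "https://evil.test/redirect?to=example.com",  # query string only
        "https://example.com.evil.test/",             # look-alike host
    )
    for url in urls:
        print(url, allowed(UNANCHORED, url), allowed(ANCHORED, url))
```

 The unanchored pattern accepts all three URLs, while the anchored one accepts only the first. Getting the anchored version right takes care, which is the argument for moving away from hand-written expressions.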
 The Usefulness of WebSite Tagging 
 Our exceptions are almost always based on FQDN (server1.example.com) or Organization (*.example.com). Web Site exceptions allow these to be specified easily without special syntax: the “Include Subdomains” option makes “Example.com” apply to all possibilities: example.com, server1.example.com, and server2.division1.example.com. I created a tag called “Bypass Web Proxy,” and assigned it to all of the organization domains and host names that had been in the proxy script. Then the “Bypass Web Proxy” tag was assigned to an Exception object that disables every feature. The Exception object ensures that there is no bouncing between UTM features: if the web request enters through Standard Web, it exits unmodified from Standard Web. If the web request does not come from Standard Web, it passes through Transparent Web unmodified. In either case, the web request appears in the Web Filtering log rather than the Firewall Log, which is a big advantage. 
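 The “Include Subdomains” behavior as I understand it can be sketched in a few lines of Python. This is only a model of the matching described above, not how Sophos implements it; the function names and the tag table are hypothetical.

```python
from typing import Dict, Optional


def matches_site(host: str, site: str, include_subdomains: bool = True) -> bool:
    """Return True if host equals site, or is a subdomain of it when allowed."""
    host = host.lower().rstrip(".")
    site = site.lower().rstrip(".")
    if host == site:
        return True
    # Requiring the leading dot keeps "notexample.com" from matching "example.com".
    return include_subdomains and host.endswith("." + site)


def tag_for(host: str, tagged_sites: Dict[str, str]) -> Optional[str]:
    """Look up the tag of the first tagged site that covers this host."""
    for site, tag in tagged_sites.items():
        if matches_site(host, site):
            return tag
    return None


# Hypothetical tag table mirroring the "Bypass Web Proxy" tag from the post.
TAGGED_SITES = {"Example.com": "Bypass Web Proxy"}

if __name__ == "__main__":
    for host in ("example.com", "server2.division1.example.com", "notexample.com"):
        print(host, tag_for(host, TAGGED_SITES))
```

 Because the comparison works on whole host-name labels, there is nothing to anchor or escape, which is why this approach is harder to get wrong than a regular expression.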
 We have created a few other tags following the same pattern: 
 
 “Uncategorized Override” because UTM has a soon-to-be-resolved problem that causes too many sites to be uncategorized. 
 “Country Blocking Override” which disables URL filtering to exempt sites from Country Blocking. 
 “Not Allowed” for sites that we distrust but that have not yet been blacklisted by Sophos. 
 
 Implementation note: after assigning a tag to a new website, it sometimes seems necessary to turn the Exception object off, then back on, to force the newly-tagged website to be integrated into the Exception processing. 
 For Improved Visibility, Avoid Firewall Traffic 
 Before settling on this design, I found myself staring at Firewall Log entries, trying to determine what the log entry represented, and whether it should be allowed or blocked. The Firewall log only has IP addresses, not URLs. Frequently, I would find that a reverse-DNS lookup of the destination IP address would provide no useful information, or no information at all, so I was at a loss to make a decision. In cases with a persistent source port and IP, I had to log onto the source PC and use the “netstat -a -b -n” command to determine which source process was initiating the traffic. 
 When evaluating Country Blocking problems, the Firewall log indicates that Country Blocking is the reason for a blocked packet, but does not supply the Country name. My Firewall logs do not show allowed traffic because I have been afraid that the volume would be excessive and create unpredictable performance problems, but my Web Logs include both allowed and blocked entries. I realized that to understand my traffic as much as possible, I wanted web traffic handled by the web proxies, not by Firewall Rules. 
 A Special Surprise from Skype 
 I found PCs with Skype v7 trying to connect through the Transparent Proxy to many sites on many destination ports, in defiance of Microsoft documentation. One bit of Microsoft documentation suggested upgrading to the latest version of Skype, but the installed software said that there were no updates available. After a fair amount of travail, the solution was to remove Skype v7 and download Skype v8. It seems to operate cleanly, using only port 443. 
 Finished Result: Visibility to Normal Network Traffic 
 A while back, I read a cybersecurity book that talked about a nation-state espionage attack that infected a weapons system contractor, then systematically pushed large volumes of weapon system data from the infected contractor to the attacking country. The story left me wondering how I would ever detect an infection of that type on my network. I simply did not know how to begin constructing a model of my “normal” traffic flows. The Transparent Proxy rollout has provided something close to an answer to that question: 
 
 Standard Mode Web proxy handles almost exclusively browser-based traffic, including URLs for FTP sites and web sites with non-standard ports. 
 Transparent Mode Web proxy handles almost exclusively fat-client traffic, using ports 80 and 443. 
 The User Agent field on both proxies helps to identify the browser or application that initiated the traffic. 
 Transparent FTP proxy handles FTP traffic that bypasses the Standard Mode Web proxy, but the logging is not very useful. Because FTP does not provide encryption, our FTP usage is minimal anyway. 
 Firewall Rules handle non-web traffic. 
 
 The web filtering logs provide URLs, IP pairs, and Country names, so it becomes conceptually possible to do detailed modelling for what constitutes my “normal” network traffic. 
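 As a rough illustration of what such modelling could start from, the Python sketch below counts requests per (User Agent, destination host) pair. It assumes each log line carries key="value" pairs with fields named `url` and `ua`; those field names and the sample lines are illustrative assumptions, not copied from a real UTM log, so adjust them to whatever your log actually contains.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: log lines contain key="value" pairs, e.g. url="..." and ua="...".
PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_line(line: str) -> dict:
    """Collect every key="value" pair on the line into a dict."""
    return dict(PAIR.findall(line))


def baseline(lines) -> Counter:
    """Count requests per (user agent, destination host) pair."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if "url" not in fields:
            continue  # not a web request entry
        host = urlsplit(fields["url"]).hostname or "unknown"
        counts[(fields.get("ua", "unknown"), host)] += 1
    return counts


# Made-up sample lines for illustration only.
SAMPLE = [
    'action="pass" srcip="10.0.0.5" url="https://server1.example.com/a" ua="Mozilla/5.0"',
    'action="pass" srcip="10.0.0.5" url="https://server1.example.com/b" ua="Mozilla/5.0"',
    'action="pass" srcip="10.0.0.9" url="https://203.0.113.10/update" ua="UpdaterClient/1.0"',
]

if __name__ == "__main__":
    for (agent, host), hits in baseline(SAMPLE).most_common():
        print(hits, agent, host)
```

 A pair that shows up in today's log but never appeared in the accumulated baseline would then be a candidate for a closer look.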
 Should All Traffic be forced to Standard Mode? 
 I began looking at how to enforce Standard Proxy at the machine level, but quickly ran into challenges. The Group Policy mechanism is designed to push proxy settings to users, not machines. “Loopback mode” can be used to apply the user setting to a machine, but it may do too much: it cannot be used to apply just a single setting to all users. There is also a machine policy setting to “apply proxy to all users”, but the Microsoft documentation is vague about where it obtains the setting to be applied globally. 
 For web traffic initiated by the operating system, I have discovered this command: 
 netsh winhttp set proxy proxy-server="proxyaddress;DIRECT" bypass-list="<local>;*.domain1;*.domain2" 
 This forces anything that uses the WinHTTP API to use standard proxy mode. I have not found an easy way to deploy the setting using group policy; a startup script seems the most viable. Even if I could find a way to deploy it, would this be wise? 
My Standard Proxy uses AD SSO authentication, while my Transparent Proxy is configured for No Authentication. The decision to omit authentication for Transparent Mode was initially caused by documented problems with AD SSO on a bridged configuration. After evaluating my actual traffic, I realize that requiring authentication for Transparent Mode traffic would probably cause a lot of problems with the operating system functions and fat-client applications that could not pass NTLM credentials and could not respond usefully to an “authentication required” challenge. Consequently, I expect to leave Transparent Mode without authentication even if the AD SSO restriction is eventually resolved.