Failover web filtering?

Two core sites (A & B), each with 2x SG330 (hot standby) and a 50 Mb fibre link to the internet. Using multipath to balance between the two.

Using web proxy in transparent mode but would like to change to standard.

Currently all internet traffic goes to core site A (SG330 web proxy) and goes out 50/50 to the internet and to site B via a dedicated uplink (leased line).

Now if we knock the SG330 off at site A, we can reroute internet traffic to the site B SG330 and get it to go out via the UTM web proxy there (transparent).

Is it possible to change both UTMs to standard filtering instead of transparent? If all client browsers (using standard mode) are pointing at the UTM at site A, how can you get them to point to site B? Or am I stuck with transparent for this setup?



  • Standard mode does not fully displace Transparent Mode, because Standard Mode is essentially a suggestion to the client browser.  There will always be traffic that does not notice or does not obey the suggestion.   So use both modes.

    You pose an interesting question.  There are several possible failure modes:

    1. An ISP connection path fails, so UTM detects a failure to respond.
    2. An ISP router fails, so UTM detects a down link.
    3. A UTM fails.

    I think Standard Mode and Transparent Mode will behave identically in the first two cases.   For those failures, you need to reconfigure external routing, either manually or automatically using OSPF or more likely BGP as appropriate.

    In the case of the third failure, Transparent Mode requires that you reconfigure your internal routing, which may be challenging.   Standard Mode requires only that the return statement in your proxy script list more than one option, each proxy entry with its own PROXY keyword.  This is an example:

    return "PROXY utm1:8080; PROXY utm2:8080; DIRECT";
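For context, here is a minimal sketch of a complete proxy script built around that kind of return string. The helper name buildProxyChain is hypothetical (it is not a PAC built-in), and utm1/utm2 on port 8080 are simply the placeholder names from the example above. Browsers generally work through the list in order when an entry does not respond, but how quickly they give up on a dead proxy varies by browser, so treat this as a starting point to test rather than a guaranteed failover time.

```javascript
// Hypothetical helper (not part of the PAC standard): builds the return
// string from an ordered list of "host:port" entries.
function buildProxyChain(proxies, allowDirect) {
    var entries = [];
    for (var i = 0; i < proxies.length; i++) {
        // Every proxy entry needs its own PROXY keyword.
        entries.push("PROXY " + proxies[i]);
    }
    if (allowDirect) {
        entries.push("DIRECT");
    }
    return entries.join("; ");
}

// Entry point the browser calls for every request.
function FindProxyForURL(url, host) {
    // Site A first, then site B, then direct as a last resort.
    return buildProxyChain(["utm1:8080", "utm2:8080"], true);
}
```

Passing false as the second argument leaves DIRECT out, for setups where clients should never bypass the UTMs.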

     

  • Hi Douglas,

    Yes, it's the 3rd option I'm after. The first two are fine and work as planned.

    With the 3rd option, our 50+ sites are connected via MPLS and use BGP for routing. We have an MPLS router at each core site (A & B), and we then re-advertise the BGP routes via EIGRP to our core routers at sites A & B. So the network is fully aware and will fail over should we lose an MPLS router at site A or B.

    Internet breakout is done via the UTMs at both sites, although all traffic is routed to the site A UTM first, which then does the load balancing between UTM A & UTM B via the leased line that connects sites A & B directly.

    Should an MPLS router fail at site A, the traffic simply reverses, enters via site B and breaks out to the internet via the site B UTM, which then tries to load balance again, e.g. 50% out of the site B uplink and 50% out of the site A UTM uplink. The site B UTM is basically a mirror of the site A UTM.

    The part I'm interested in is the proxy command you mentioned. Are there any further details on this? How it works, how it is applied, etc.? As stated, we're currently using transparent mode with HTTPS URL scanning only, but would like to switch over to standard with full scanning.

  • I assume you will be using a proxy script.   The technology was invented by Netscape, which no longer exists, and I had trouble finding the documentation that I needed when I was building my own script.   The function takes parameters of url and host, and I even had trouble working out whether "host" refers to the target host or the sending host.   So I built my own logic to strip the host name out of the URL, although I think eventually I found that the host name parameter is the target host, so my parsing was unnecessary.

    Once I did find documentation, it included functions that are unique to browser implementations of proxy scripts, and I had no idea how to test those functions to see if they worked.  I built my script using syntax that I knew would work with Windows JavaScript, so I could put a wrapper around my code and test it using the cscript engine.
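As a rough illustration of the wrapper idea, the sketch below runs a proxy function against a table of test cases. Everything in it is an assumption made for illustration: samplePac stands in for your own FindProxyForURL, and the runCases and report names are hypothetical. It prints through WScript.Echo when run under cscript and through console.log elsewhere.

```javascript
// Stand-in for the real script; in practice paste your own
// FindProxyForURL here and pass it to runCases instead.
function samplePac(url, host) {
    if (/\.example\.com$/.test(host)) return "DIRECT";
    return "PROXY utm.example.com:8080; DIRECT";
}

// Print one line under cscript (WScript.Echo) or any other engine.
function report(line) {
    if (typeof WScript !== "undefined") {
        WScript.Echo(line);
    } else if (typeof console !== "undefined") {
        console.log(line);
    }
}

// Call the proxy function for each case and count the mismatches.
function runCases(pacFunction, list) {
    var failures = 0;
    for (var i = 0; i < list.length; i++) {
        var actual = pacFunction(list[i].url, list[i].host);
        var ok = actual === list[i].expected;
        if (!ok) failures++;
        report((ok ? "PASS " : "FAIL ") + list[i].url + " -> " + actual);
    }
    return failures;
}

var cases = [
    { url: "http://intranet.example.com/", host: "intranet.example.com",
      expected: "DIRECT" },
    { url: "http://www.example.org/", host: "www.example.org",
      expected: "PROXY utm.example.com:8080; DIRECT" }
];

var failures = runCases(samplePac, cases);
```

Saved as a .js file, this could be run with something like "cscript //nologo test.js"; browser-only helpers such as isInNet would still need stand-ins before a script that uses them can be tested this way.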

    Below is a sanitized version of my current proxy script.   Pasting it into the forum destroyed the formatting, but hopefully it is readable.

    When you have a site that does not work with the proxy, just add another regex, and return DIRECT if you find a match.

    Remember that in regex, the period is a wildcard, not a literal.   Consequently, all periods should be "escaped" using a backslash.
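A quick way to see why the escaping matters is to test both forms against a lookalike name. The host names here are made up purely for the demonstration.

```javascript
// Unescaped: each "." matches any single character.
var unescaped = /www.example.com/;
// Escaped: each "\." matches only a literal period.
var escaped = /www\.example\.com/;

// A made-up lookalike with "X" where the first period should be.
var lookalike = "wwwXexample.com";

var unescapedMatches = unescaped.test(lookalike); // true: "." accepted the "X"
var escapedMatches = escaped.test(lookalike);     // false: a literal "." is required
```

Both patterns match the genuine "www.example.com", so an unescaped pattern tends to look fine in casual testing and only misbehaves on names you did not think to try.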

    The return statement takes a series of options, separated by semicolons.  Just add an additional PROXY hostname:port entry to the list to provide failover.

    Note that you also need to worry about proxy script availability.   If you are worried about UTM failure, you need to ensure that your proxy script is retrieved from an always-available website, not from the UTM itself.   If the script source becomes unavailable, browser performance suffers.

    function FindProxyForURL(url, host) {
        var x;
        var y;
        var z;

        // Strip off the protocol, leaving only the : and slashes plus the host.
        // If no protocol is supplied, add the : and slashes.
        // The result is probably the same as the host parameter.
        x = /:\/\/[^/]*/.exec(url);
        if (x) y = x[0]; else y = "://" + url;

        // Strip off everything after the host name (port and path),
        // then add a trailing slash.
        y = /...[^/:]*/.exec(y)[0] + "/";

        // Go direct if it is an internal numeric IP address (192.168.*.*)
        z = /:\/\/192\.168\.[0-9]{1,3}\.[0-9]{1,3}/.exec(y);
        if (z) return "DIRECT";

        // Go direct if it is a loopback address (127.0.0.*)
        z = /:\/\/127\.0\.0\./.exec(y);
        if (z) return "DIRECT";

        // Go direct if it is an internal FQDN (*.example.com)
        z = /\.example\.com/.exec(y);
        if (z) return "DIRECT";

        // Go direct if it is an unqualified name (no period), proxy otherwise
        z = /\./.exec(y);
        if (z)
            return "PROXY utm.example.com:8080; DIRECT";
        else
            return "DIRECT";
    }
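Since the host parameter turns out to be the target host, the same decisions can be sketched without parsing the URL at all. This is an illustrative variant, not the author's script: it assumes host arrives without a port (as the PAC format describes), and it anchors the patterns to the whole host name, which is slightly stricter than the original matching. The example.com and utm.example.com names are the same placeholders used above.

```javascript
function FindProxyForURL(url, host) {
    // Host names are case-insensitive, so normalise before matching.
    var h = host.toLowerCase();

    // Internal numeric IP address (192.168.*.*)
    if (/^192\.168\.[0-9]{1,3}\.[0-9]{1,3}$/.test(h)) return "DIRECT";

    // Loopback address (127.0.0.*)
    if (/^127\.0\.0\.[0-9]{1,3}$/.test(h)) return "DIRECT";

    // Internal FQDN (*.example.com); the $ anchor stops a name such as
    // www.example.com.evil.org from being treated as internal.
    if (/\.example\.com$/.test(h)) return "DIRECT";

    // Unqualified name (no period at all)
    if (h.indexOf(".") === -1) return "DIRECT";

    // Everything else goes through the proxy, with DIRECT as a fallback.
    return "PROXY utm.example.com:8080; DIRECT";
}
```

Adding a second "PROXY host:port" entry before DIRECT in the last line would give the failover behaviour discussed earlier in the thread.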

  • Ah, good old Netscape, I remember them well. Just at the time I was getting into computing. NN4 was the one to have. I digress.

    Found a nice little site that describes it well, with basic code below which I think will do the job. I'll probably remove the "DIRECT", as the chance of 2x SG330 (hot standby) at 2 different locations failing is relatively low (although you can never say never).

    I'm going to publish the wpad.dat with DHCP, as it gives me a lot more scope and granularity to test.

    The site with the info is in the link below. It goes through a nice little routine for setting it up with IIS etc.
    Auto Configuring Proxy Settings (Proxy Autodiscovery wpad.dat)

    function FindProxyForURL(url, host) {
        // Our local URLs from the domains below mydomain.com don't need a proxy:
        if (shExpMatch(url, "*.mydomain.com/*")) { return "DIRECT"; }
        if (shExpMatch(url, "*.mydomain.com:*/*")) { return "DIRECT"; }
        // Client computers within this network go out through
        // port 8080 on proxy1.mydomain.local:
        if (isInNet(myIpAddress(), "192.168.0.0", "255.255.255.0")) {
            return "PROXY proxy1.mydomain.local:8080";
        }
        // All other requests go through port 8080 of proxy2.mydomain.local.
        // Should that fail to respond, go directly to the WWW:
        return "PROXY proxy2.mydomain.local:8080; DIRECT";
    }
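The script above leans on helpers that only exist inside a browser's PAC engine (shExpMatch, isInNet, myIpAddress). To try such a script outside a browser, rough stand-ins along the following lines could be defined first. These are simplified assumptions, not the browsers' implementations: the real isInNet also resolves host names through DNS, whereas this one only accepts dotted IPv4 strings, and myIpAddress here just returns a made-up fixed address.

```javascript
// Stand-in for shExpMatch: turn a shell glob (* and ?) into an anchored regex.
function shExpMatch(str, shexp) {
    var pattern = shexp
        .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
        .replace(/\*/g, ".*")                 // * matches any run of characters
        .replace(/\?/g, ".");                 // ? matches one character
    return new RegExp("^" + pattern + "$").test(str);
}

// Convert "a.b.c.d" to a number, or null if it is not a dotted IPv4 string.
function ipToNumber(ip) {
    var parts = ip.split(".");
    if (parts.length !== 4) return null;
    var n = 0;
    for (var i = 0; i < 4; i++) {
        if (!/^[0-9]{1,3}$/.test(parts[i])) return null;
        var octet = parseInt(parts[i], 10);
        if (octet > 255) return null;
        n = n * 256 + octet;
    }
    return n;
}

// Stand-in for isInNet: IPv4 literals only, no DNS lookup.
function isInNet(host, pattern, mask) {
    var h = ipToNumber(host);
    var p = ipToNumber(pattern);
    var m = ipToNumber(mask);
    if (h === null || p === null || m === null) return false;
    // >>> 0 keeps both sides as unsigned 32-bit values before comparing.
    return ((h & m) >>> 0) === ((p & m) >>> 0);
}

// Stand-in for myIpAddress: a fixed, made-up client address for testing.
function myIpAddress() {
    return "192.168.0.25";
}
```

Changing the address returned by myIpAddress lets you check which proxy a client in each subnet would be handed.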

     

  • Keep in mind that if you bypass the Standard Mode proxy using a DIRECT statement, you may still hit the Transparent Proxy.  Bypassing both proxies is difficult because the match criteria in your proxy script are different than the skip list criteria.   Then if you bypass both proxies, you have to be sure that it is allowed at the firewall level.   System managers who are ordinary mortals cannot keep track of all these layers, so something else is needed.

    After fighting this problem for a while, I found that nearly all of my proxy exceptions were best handled within the Standard Proxy, using an Exception object that disabled all or nearly all of the proxy checks.  So I have very few DIRECT exceptions in my proxy script any more.   (I don't have your redundancy, so I still have the fallback DIRECT statement.)

    The odd exception was an application which does an external DNS lookup to the vendor's site, which returns the loopback address.    When the standard proxy is involved, the DNS lookup occurs at the proxy and the connection fails.   When the standard proxy is bypassed, the DNS lookup occurs on the client and the application works as expected.