
Partial content (Range) rejected by proxy after upgrading to 9.6

After upgrading from 9.510-5 to 9.600-5, the proxy started responding with status "416 Requested range not satisfiable" to any request that has to be AV-scanned and includes a Range: header. In /var/log/http.log, the denied request has reason="range". A packet capture on the outside interface shows that the request from the proxy to the web server is processed correctly.

 

Workaround: after adding the problematic URL patterns to a web filtering exception rule with AV scanning turned off, the requests are processed as they were before the upgrade to 9.6.

 

Is this an intended behavioral change in 9.6? I know that partial content serving can be abused to bypass AV scanning, so in general it is a good idea to handle partial-content requests differently and not pass their responses unscanned. But IMHO the proxy should use a more sophisticated algorithm to deal with this situation:

1) send HEAD to determine the content size, whether it is within size limit for scanning;
2) if it is over the size limit, process request without scanning;
3) if it is within the size limit, request whole content (and possibly cache it for subsequent range requests), scan it and serve partial content to the client.
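
The three steps above could be sketched roughly like this. It only illustrates the suggestion and is not how the UTM proxy works; `SCAN_LIMIT`, `plan_range_request` and `serve_range` are hypothetical names, and the content length is assumed to come from the HEAD response.

```python
from enum import Enum
from typing import Optional

# Hypothetical AV scan size limit in bytes (an assumption, not a UTM default).
SCAN_LIMIT = 50 * 1024 * 1024


class Plan(Enum):
    PASS_UNSCANNED = "pass the range request through without scanning"
    FETCH_SCAN_SLICE = "fetch the whole object, scan it, serve the requested slice"


def plan_range_request(content_length: Optional[int],
                       scan_limit: int = SCAN_LIMIT) -> Plan:
    """Steps 1 and 2: decide from the HEAD Content-Length how to treat a range request."""
    # An unknown size is treated like "over the limit" here; that choice is an assumption.
    if content_length is None or content_length > scan_limit:
        return Plan.PASS_UNSCANNED
    return Plan.FETCH_SCAN_SLICE


def serve_range(scanned_body: bytes, first: int, last: int) -> bytes:
    """Step 3: cut an inclusive byte range out of the already scanned full body."""
    return scanned_body[first:last + 1]
```

Caching the scanned body, as suggested in step 3, would let later range requests for the same object be answered by `serve_range` without another download.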

 

OH

 



This thread was automatically locked due to age.
  • Yes this is intended.
     
    The way 9.5 behaved was to strip the Range header and pass the request along.  So if you requested the bytes from 1MB to 2MB of a 20MB file, 9.5 served back the whole 20MB file.  This caused some things to work (albeit poorly) and other things not to work (and often to saturate the connection by trying to download again and again).
     
    The two most common uses of range requests are:
    1) Streaming media.  In this case, you can use the option 'Bypass content scanning for streaming content'.
    2) Background downloaders.  This includes Microsoft updates using BITS as well as other things that try to download files in small chunks.  This is best resolved by, if you trust the source company, creating an exception for the destination to turn off AV scanning.
     
  • "This is best resolved by, if you trust the source company, creating an exception for the destination to turn off AV scanning."

    We are getting this problem and users are having problems browsing to PDF files. We don't want to turn off AV scanning.

  • In some scenarios (pdf.js or Chrome or some browser extensions) the browser may try to make multiple requests, which can include range requests.  They are trying to display the first page quickly and then download the rest of the PDF in the background.

    With UTM 9.6, AFAIK, we correctly fail the request with 416 and set Accept-Ranges: none when we do not allow it.  The browser/plugin/page should then fall back to a full download.  The end user experience should be success.  There may be a logged failure, which is normal and expected.
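
To make the expected fallback concrete, here is a minimal sketch of a client that retries with a full download when a range request is rejected with 416. It is not taken from any browser or plugin; `download_with_fallback` and `fake_proxy` are hypothetical names, and `fetch` is a stand-in for whatever HTTP call the client really makes.

```python
from typing import Callable, Dict, Tuple

# fetch(url, headers) -> (status, response_headers, body); stand-in for a real HTTP call.
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


def download_with_fallback(fetch: Fetch, url: str, first: int, last: int) -> bytes:
    """Try a byte-range request; on 416 retry without Range and slice locally."""
    status, _headers, body = fetch(url, {"Range": f"bytes={first}-{last}"})
    if status == 206:
        return body  # the range was honoured
    if status == 200:
        return body[first:last + 1]  # Range was ignored and the full body came back
    if status == 416:
        status, _headers, body = fetch(url, {})  # retry as a plain full download
        if status == 200:
            return body[first:last + 1]
    raise RuntimeError(f"download failed with HTTP status {status}")


def fake_proxy(url: str, headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Toy proxy that rejects every range request, as described above."""
    if "Range" in headers:
        return 416, {"Accept-Ranges": "none"}, b""
    return 200, {"Accept-Ranges": "none"}, b"%PDF-1.4 example body"


print(download_with_fallback(fake_proxy, "http://example.invalid/file.pdf", 0, 3))
```

A client written this way still produces one blocked request in the log per download, which matches the "logged failure" mentioned above.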

    What UTM version are you running, what browser are you using, what website are you visiting, and what is the full end user experience?

     

    References:

    https://stackoverflow.com/questions/32725608/chrome-sends-two-requests-when-downloading-a-pdf-and-cancels-one-of-them

    https://stackoverflow.com/questions/1817750/do-most-browsers-make-multiple-http-requests-when-displaying-a-pdf-from-within-t

  • Hello Michael.

    We are using 9.601-5. I can't find any entries in the Web Filtering logs with reason="range" from before the upgrade to that version.

    We use Microsoft Edge 44.17763.1.0 and Internet Explorer 11.316.17763.0.

    Various websites produce the problem. If I put the website in Web Protection - Web Filter Profiles - Filter Actions - Allowed Sites, the user gets the PDF more often but not always.

    When clicking on the PDF link, either nothing happens, a blank page appears or the PDF loads. For most people, most of the time, the PDF will load.

    The quick answer is that we tested PDFs when we implemented this in UTM; however, there are many combinations, so it is possible that some combination is not right.  But 9.6 has been out for a while and the Dev team has not heard about this from anywhere else.

    If there is a product issue, this needs to be raised through a support ticket.

  • Hi Nigel,

    How about showing us a line from the Web Filtering log where a PDF did not load?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • 2019:03:20-16:11:17 proxy-1 httpproxy[5548]: id="0002" severity="info" sys="SecureWeb" sub="http" name="web request blocked" action="block" method="GET" srcip="xxx.xxx.xxx.xxx" dstip="185.217.40.162" user="xxxxxxxx" group="Active Directory Users" ad_domain="xxxxxxxx" statuscode="416" cached="0" profile="REF_HttProContaInterScc (Corporate)" filteraction="REF_HttCffActivDirecUsers (Standard Active Directory Users)" size="0" request="0xccd16a00" url="www.ellenmacarthurfoundation.org/.../GC-Spring-Report-Summary.pdf" referer="www.ellenmacarthurfoundation.org/.../GC-Spring-Report-Summary.pdf" error="" authtime="966" dnstime="9" aptptime="0" cattime="86" avscantime="0" fullreqtime="80895" device="0" auth="2" ua="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36 Edge/15.15063" exceptions="" category="105" reputation="neutral" categoryname="Business" content-type="application/pdf" reason="range"

    These happen hundreds of times per day for PDF files since the UTM upgrade. It is thousands of times per day including other files. It will retry and usually get the PDF, but not always.
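
For anyone who wants to count these, a rough sketch for tallying the blocked range requests per content type is below. The field names (`reason`, `statuscode`, `content-type`) are taken from the log line above; `parse_fields` and `count_range_blocks` are hypothetical helper names, and the sample is a shortened copy of that line.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# key="value" pairs as they appear in the httpproxy log line above.
FIELD_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_fields(line: str) -> Dict[str, str]:
    """Turn one log line into a dict of its key="value" fields."""
    return dict(FIELD_RE.findall(line))


def count_range_blocks(lines: Iterable[str]) -> Counter:
    """Count requests blocked with reason="range" and status 416, by content-type."""
    counts: Counter = Counter()
    for line in lines:
        fields = parse_fields(line)
        if fields.get("reason") == "range" and fields.get("statuscode") == "416":
            counts[fields.get("content-type", "unknown")] += 1
    return counts


sample = ('2019:03:20-16:11:17 proxy-1 httpproxy[5548]: id="0002" action="block" '
          'statuscode="416" content-type="application/pdf" reason="range"')
print(count_range_blocks([sample]))
```

Since `count_range_blocks` accepts any iterable of lines, an open file handle on the Web Filtering log can be passed to it directly.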

  • My lab is on 9.601, yet I had no trouble loading that PDF, Nigel, neither through the Transparent nor the Standard Proxy.

    2019:03:22-13:33:53 post httpproxy[6621]: id="0001" severity="info" sys="SecureWeb" sub="http" name="http access" action="pass" method="GET" srcip="10.x.y.65" dstip="185.217.40.162" user="myusername" group="Open Web Access" ad_domain="MEDIASOFT" statuscode="200" cached="0" profile="REF_RMxbSZXQTi (Office)" filteraction="REF_IiqUeSGrWr (Open Web Access)" size="3332969" request="0x9d34300" url="https://www.ellenmacarthurfoundation.org/assets/downloads/GC-Spring-Report-Summary.pdf" referer="" error="" authtime="235" dnstime="47401" aptptime="118" cattime="84009" avscantime="110043" fullreqtime="3806549" device="1" auth="2" ua="Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0" exceptions="" category="105" reputation="neutral" categoryname="Business" country="United Kingdom" sandbox="-" content-type="application/pdf"

    I'm beginning to suspect your browser.  Have you tried a different one and from a different machine type with a different OS?

    By the way, since you mentioned this, I grepped 'range' in http.log and found that Windows updates were being blocked, so I added an Exception in Internet Options and added the following DNS Group object:

    Thanks for stimulating me to check that!

    Cheers - Bob

     
  • A packet sniffer can help figure out what is going on.  One of the things to pay attention to is the response header "Accept-Ranges".
     
    In this case, the far server sets the Accept-Ranges to be "bytes".
    In XG, it will modify this header to "none".
    In UTM, it does not modify this header, unless it has to.
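
When looking at captured headers, the check comes down to something like the sketch below. `advertises_byte_ranges` is a hypothetical helper, not part of any Sophos tooling; it only interprets the Accept-Ranges header of a response.

```python
from typing import Mapping


def advertises_byte_ranges(headers: Mapping[str, str]) -> bool:
    """True if the response headers advertise Accept-Ranges: bytes."""
    for name, value in headers.items():
        # Header names are case-insensitive; the value may be a comma-separated list.
        if name.lower() == "accept-ranges":
            units = [unit.strip().lower() for unit in value.split(",")]
            return "bytes" in units
    return False  # no Accept-Ranges header: nothing is advertised


print(advertises_byte_ranges({"Accept-Ranges": "bytes"}))  # header from the far server
print(advertises_byte_ranges({"Accept-Ranges": "none"}))   # header after a rewrite to none
```

Comparing the result for the headers seen on the server side of the proxy with those seen on the client side shows whether the header was modified in between.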
     
    How clients behave is...  dependent on the client.  However here is one scenario:
     
    Client makes a request to the far server for the whole file.
    Header comes back with Accept-Ranges: bytes (range requests are allowed)
    Client makes a second concurrent request for a range.
    The second request is blocked with 416 and Accept-Ranges: none
    The first request should still proceed and download the file.
     
    The client should be smart enough not to drop the original full file download.  Or, if it does, when the range request fails it should retry with a full file download.
     
    A quick test in my environment worked.
  • Hi.

    I have tried different browsers with the same problem. One of the remote sites seems to get the problem more than most, but anyone can get it intermittently.

    Sophos have remoted on, extracted logs of it happening and escalated to third line.

  • Thanks for keeping us in the loop on this, Nigel.

    Cheers - Bob

     
  • No fix yet. We didn't want to bypass AV scanning and so it has been escalated.

  • Hi,

    This issue is also breaking Spotify! It won't play anything, just fails miserably. The web client is fine, but the fat client breaks.

    13:25:14.823 I [cdn_chunk_downloader.cpp:132 ] Requesting data (0 -> 524288) from CDN url: http://audio-sp-lhr.pscdn.co/audio/5c436e96826814569a686e46b848a7eba18f0b65?1556720714_nCh6iMW996ZXMr2snulh8Y9k_wtRH8-lRZqooPtEg_o= 
    13:25:14.874 I [dns.cpp:60 ] Resolved audio-sp-lhr.pscdn.co to 193.182.10.113
    13:25:14.883 E [cdn_chunk_downloader.cpp:367 ] Unknown resource size
    13:25:14.883 E [cdn_chunk_downloader.cpp:317 ] CDN failure 0->524288. Error: download_no_size (5). Http: 416.

    You can see the 416 being returned by the UTM there. I think the blame could be placed on Spotify not handling this error properly perhaps?

  • Hi David,

    Please show us an interesting line related to this from the Web Filtering log.

    Cheers - Bob

     
  • We have identified the issue.
    The issue is not so much how we support range requests, but how we support the advertising of it in headers.

    In UTM 9.5, if the web server says that it supports range requests (Accept-Ranges: bytes), the UTM never changes the header.  Therefore the client thinks that range requests are supported.  The client tries to do a range request, we remove the Range header, and we return the entire file to the client.  Some clients are OK with the fact that we return something different from what they asked for; some clients are not.

    In UTM 9.6, if the web server says that it supports range requests (Accept-Ranges: bytes), the UTM never changes the header.  Therefore the client thinks that range requests are supported.  The client tries to do a range request and fails with 416.  Some clients are OK with the range request failure and try again with a full file download; some clients are not.
     
    In both UTM 9.5 and 9.6 some clients had problems.  But they were different clients with different impacts, so 9.6 ended up both better and worse.
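
As a toy model of the difference (not actual proxy code), the two behaviors can be put side by side for a 1MB slice of a 20MB file. `handle_range_request` and `describe` are hypothetical names, and the model only covers requests that would be AV scanned.

```python
from typing import Dict, Tuple

MB = 1024 * 1024
Response = Tuple[int, Dict[str, str], int]  # (status, headers, body length in bytes)


def handle_range_request(version: str, file_size: int) -> Response:
    """Toy model of what the client gets back for a scanned range request."""
    if version == "9.5":
        # Range header stripped upstream: the entire file comes back with 200.
        return 200, {"Accept-Ranges": "bytes"}, file_size
    if version == "9.6":
        # Range request rejected outright, nothing is delivered.
        return 416, {"Accept-Ranges": "none"}, 0
    raise ValueError(f"unknown version: {version}")


def describe(version: str, file_size: int, first: int, last: int) -> str:
    """Summarise what was asked for versus what came back."""
    requested = last - first + 1
    status, _headers, served = handle_range_request(version, file_size)
    return f"UTM {version}: asked for {requested} bytes, got HTTP {status} with {served} bytes"


for version in ("9.5", "9.6"):
    print(describe(version, 20 * MB, 1 * MB, 2 * MB - 1))
```

Neither outcome is the 206 response with exactly the requested bytes that the client was led to expect, which is why both versions upset some clients.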
     
    The solution is to copy some behavior from the XG.
     
    In XG, if the web server says that it supports range requests (Accept-Ranges: bytes) and the XG is going to AV scan the response, the XG changes the header (Accept-Ranges: none).  Therefore the client (should) think that range requests are not supported and (should) not make any range requests.  Although we know that some clients still ignore that and try to do a range request anyway, it is a lot rarer.
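
In rough terms, the header rewrite described above amounts to the sketch below. It is an illustration only, not XG source code; `rewrite_accept_ranges` and the `will_av_scan` flag are hypothetical names.

```python
from typing import Dict


def rewrite_accept_ranges(headers: Dict[str, str], will_av_scan: bool) -> Dict[str, str]:
    """Return a copy of the response headers, with Accept-Ranges forced to
    "none" when the response is going to be AV scanned."""
    return {
        # Header names are case-insensitive, so compare in lower case.
        name: ("none" if will_av_scan and name.lower() == "accept-ranges" else value)
        for name, value in headers.items()
    }


server_headers = {"Content-Type": "application/pdf", "Accept-Ranges": "bytes"}
print(rewrite_accept_ranges(server_headers, will_av_scan=True))
print(rewrite_accept_ranges(server_headers, will_av_scan=False))
```

Responses that are not scanned keep the server's original header, so range requests to excepted destinations continue to work.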
     
     
    For PDFs specifically:
    Browser starts to download the file (full download).  Browser sees that mime type is pdf.  Browser drops the request.
    Browser then starts pdf plugin and tells plugin about the file and gives it all the headers. 
    PDF plugin starts to download the file, and will either do range request or not depending on what the original headers were.
    If the original headers (on the full file download) say Accept-Ranges: bytes, then when the plugin tries to download the file it will try a range request.
     
    We have had no complaints about downloading PDFs over the XG.  Some background downloaders and applications (I'm looking at you, Netflix) will still have problems, and the solution is to use an exception.