We've had this same issue twice in under a week. The external WAN interface maxes out our pipe (~100 Mbit/s), but we don't see anything close to a corresponding amount of traffic on any of the internal interfaces. It usually lasts about 30 minutes and makes any internet-related activity basically unusable.
The previous ideas about proxy/cache make sense, but has anybody figured out what specific application or process is causing it?
SG310
Firmware 9.408-4
I did, but there is no corresponding traffic on the internal interfaces, which leads me to believe that the client (A/V) isn't receiving the data. If the proxy doesn't have anywhere to send the traffic, why does it keep downloading, and what is it downloading? The flow monitor doesn't give you a specific IP for where it's connecting to/from, just the IP for our WAN interface and AWS.
If this was just one connection that gets "stuck" downloading something, it shouldn't be noticeable to users. It would just use up whatever pipe is available, but when this happens everything else slows to a crawl, similar to if somebody was using BitTorrent and making thousands of connections that are all using as much available bandwidth as possible.
Steve
"I did, but there is no corresponding traffic on the internal interfaces, which leads me to believe that the client (A/V) isn't receiving the data." Exactly. That's the phenomenon I was describing. The problem is that the sending server times out so the Proxy restarts the download.
Cheers - Bob
Do you have several hits for 'deferred download status refresh timeout, removing' when you search your Websecurity logs?
I had one customer whose proxy kept looping a download last week until the data partition was nearly full of broken downloads. After getting the hint to search for that phrase in the Websecurity logs, I was able to identify what it kept trying to download without ever succeeding, and I could create an exception to stop that behaviour.
I could only get rid of the nearly 15 GB of 'trash downloads' by clearing the proxy cache (and caching is not even active on that UTM).
Gruß / Regards,
Kevin
Sophos CE/CA (XG+UTM), Gold Partner
Thanks, Kevin - youdaman!
Here's a command that searches all of the http logs in 2017 on your UTM to get the info you want. It gives you every candidate for such an Exception:
zgrep 'deferred download status refresh timeout, removing' /var/log/http/2017/*/* | grep -oP 'url="https?://.*?/' | sort | uniq -c | sort -n
The result at one client's box was:
1 url="http://www.xxxxxxxxxxx.net/
1 url="http://www.xxxxxxxx.org/
2 url="http://xxxxxxxxxxx.yyyyyy.com/
22 url="http://xxxxxxxxxx.zzzzzz.com/
Cheers - Bob
EDIT 2017-05-05: modified grep to look only at the FQDN
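For anyone who would rather run the same search against log lines piped in from elsewhere (for example a copy of the logs on another machine), here is a minimal sketch of the same idea as a shell function. The name `count_timeout_hosts` is made up for illustration and is not anything that ships with the UTM. It assumes each log line carries a `url="http://host/..."` or `url="https://host/..."` field, and it uses `sed` instead of `grep -P` so it also works where PCRE support is missing.

```shell
# Hypothetical helper: reads http proxy log lines on stdin, keeps only the
# entries with the deferred-download timeout error, and prints a count per
# scheme+host, least frequent first.
count_timeout_hosts() {
    grep -F 'deferred download status refresh timeout, removing' |
        sed -n 's|.* url="\(https\{0,1\}://[^/"]*\).*|\1|p' |
        sort | uniq -c | sort -n
}
```

One possible way to call it on the UTM would be `zgrep -h 'deferred download' /var/log/http/2017/*/* | count_timeout_hosts`. Lines whose `url` field has no `http://` or `https://` prefix are skipped by this sketch.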
We do have several entries containing that message. They appear around the time the incident was happening, and one entry is for an AWS IP. I tried going to the same link again and it didn't reproduce the issue, so I'll have to wait until it happens again to confirm. The log entry says blocked, but it let me go to the link without issue.
THANK YOU for giving us something to look for and hopefully this ends up being the cause.
Possible offending log entry:
2017:03:22-10:02:13 asg-1 httpproxy[4783]: id="0002" severity="info" sys="SecureWeb" sub="http" name="web request blocked" action="block" method="GET" srcip="x.x.x.x" dstip="54.243.187.x" user="" group="" ad_domain="" statuscode="500" cached="0" profile="REF_DefaultHTTPProfile (Default Web Filter Profile)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="0" request="0xb82ff800" url="X.com/.../FINAL_Corporate Responsibility Program Overview_2017 Refresh 02.pdf" referer="X.com/corporate-responsibility" error="deferred download status refresh timeout, removing" authtime="0" dnstime="570" cattime="1201" avscantime="0" fullreqtime="171762345" device="0" auth="0" ua="Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; Touch; rv:11.0) like Gecko" exceptions="" category="181" reputation="neutral" categoryname="Marketing/Merchandising" country="United States" content-type="application/pdf"
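To see at a glance which client and which file are involved in entries like the one above, the relevant fields can be pulled out of each matching line. The following is only a sketch: the function name `show_timeout_details` is hypothetical, and it assumes the fields appear in the same order as in the entry above (`srcip`, `dstip`, `url`, then `fullreqtime`). The URL is printed last because it may contain spaces.

```shell
# Hypothetical helper: reads http proxy log lines on stdin and, for each
# deferred-download timeout entry, prints "srcip dstip fullreqtime url".
# Assumes the field order srcip, dstip, url, fullreqtime within a line.
show_timeout_details() {
    grep -F 'deferred download status refresh timeout, removing' |
        sed -n 's|.* srcip="\([^"]*\)".* dstip="\([^"]*\)".* url="\([^"]*\)".* fullreqtime="\([^"]*\)".*|\1 \2 \4 \3|p'
}
```

Piping the output through `sort | uniq -c` would then show whether a single client and URL account for most of the repeats.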