Sophos Firewall WAF Policy Crashing System

Hello Sophos Community

I'm running the latest firmware as of today (SFOS 19.5.0 GA-Build197) on Sophos Firewall, installed as a virtual appliance in Proxmox 7.3-4.  It's a home license, with 4 virtual CPUs (host) and 6GB memory.  I'm using the official qcow2 images.

I am hosting a Wordpress blog behind it, using WAF, which I use mainly for sharing family pictures and videos.  Below is a screenshot of the policy.

When this policy is applied to the firewall rule, everything works fine until I attempt to upload a very large file to the blog.  Recently, I attempted to upload our family's Christmas morning video to my site, which was 1.3GB.  Not only did it fail, but the entire firewall crashed... and crashed hard.  The house went offline, DNS resolution went down, I couldn't connect to the firewall via the IP address, nothing.  It just died.  I had to go into Proxmox, kill the VM, then restart it.  Upon restart everything was fine.

If I remove the WAF policy, the file uploads.  If I enable the WAF policy and transfer the file, everything comes crashing down.  It's very easily reproduced.  What could be causing this?



  • Hi John,

    To me it sounds like Wordpress is using some file transfer method that is not part of the core HTTP standard, e.g. something like WebDAV which WAF doesn't support. If that is the case, then that is known to cause increased memory usage in WAF, which can starve other processes on the system, leading to the crash you see.

    You can try to check if you can configure Wordpress to not use any non-standard protocol. If you can do this, that should fix the problem.

    Alternatively you can try to define an exception for the site path route the file is uploaded to and skip all CTF checks for that site path route. This is not guaranteed to work, but might reduce the memory footprint of WAF just enough to avoid a crash.

    Best regards,

    Attila Kovacs

  • I'll look into this as well, thank you for the suggestion.  Even if this ends up being the issue, I still wouldn't expect the system to flat out crash.  A more graceful failure would probably be better.  I'll report back.

  • I have not been able to determine if Wordpress is using anything non-standard for these transfers... it doesn't seem like they are, but I don't know for sure.  I'm using the native media upload feature, no plugins or anything.

    Even so, from the tests I ran earlier today, nothing jumps out at me with regard to memory footprint or CPU cycles.  Everything appears to be well within acceptable ranges during the crash.  I'll give it another shot tomorrow while tailing the reverse proxy and sys logs.

  • Ok, I was able to capture this from the reverse proxy log the moment before the system crashed:

    [Tue Jan 31 05:23:03.054004 2023] [security2:error] [pid 22813:tid 139889534220032] [client 192.168.0.71:47856] [client 192.168.0.71] ModSecurity: Request body (Content-Length) is larger than the configured limit (1073741824).

    As this came from the Sophos logs, am I correct in understanding this configuration is in the firewall?  If so, is it adjustable?

  • Yes, this is a logline from WAF. It basically means it received a request that is bigger than the 1GB limit allowed by ModSecurity. It is configurable, but 1GB is the maximum. It sounds like Wordpress is trying to transfer the whole file in a single HTTP request. However, this request should have simply been rejected by WAF with a 413 response and shouldn't cause a crash.

    Just guessing, but it might be that Wordpress tried a standard file upload in a single request first, but got rejected by WAF, which in turn triggered some non-standard chunked file upload from Wordpress causing the crash.

    It would be good to see a HAR capture from the client side to see the actual message flow between the browser and WAF, and also the full reverseproxy.log capturing all logs when the issue happens. If you are willing to do further testing on this and send the logs to me, I can check them.
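
As a quick sanity check on the numbers in that log line, here is a minimal Python sketch. It is not taken from Sophos or ModSecurity; the function name `exceeds_limit` is made up for illustration, and the 1.3GB figure is assumed to mean decimal gigabytes. It only confirms that 1073741824 bytes is exactly 1 GiB and that a 1.3GB upload sent as a single request body would be over it.

```python
# Limit copied from the reverse proxy log line in the thread.
LIMIT_BYTES = 1073741824


def exceeds_limit(size_bytes, limit=LIMIT_BYTES):
    """Return True if a single request body of size_bytes is over the limit."""
    return size_bytes > limit


# 1073741824 is exactly 2**30 bytes, i.e. 1 GiB.
print(LIMIT_BYTES == 2**30)

# The Christmas video was about 1.3GB (assumed decimal: 1.3 * 10**9 bytes).
print(exceeds_limit(1_300_000_000))
```

This says nothing about why the firewall froze; it only shows that the rejection in the log is consistent with the file size.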

  • Thanks for your reply.  I ran a few more tests last night and am seeing the same behavior with both 600MB and 300MB files though.  While still large, they seem to be within this limit... and I agree, either way it shouldn't cause the firewall to seize up.

    I should be able to capture some of this today, and will report back.

  • Ok, just ran another test... this time with a 600MB MP4.  Here's the setup now:

    • 600MB file
    • Exception applied for the /wp-admin path for all policy rules.
    • Within 5 seconds the server became unresponsive.  I let it go for about 5 minutes before finally killing it.
    • During this time my SSH connection failed and I could not even log into the SOPHOS console directly via the Proxmox UI.  Completely unresponsive every way I normally connect to it.
    • Before the test I deleted the /log/reverseproxy.log.
    • I restarted the daemon by making a rule change.
    • The reverseproxy log logged a couple entries before it looks like logging stopped... presumably due to the failure.
    • I noticed in Proxmox the CPU of the VM going to about 40% and just staying there.
    • I killed the machine around 11:39, and normal startup numbers are shown.

    I'll PM you the reverse proxy logs.

  • Well, Sophos flagged my account as SPAM/malicious when I tried to PM you the logs... so I had to deal with that Rage.  I think I'm unlocked now and the PM went through.  Let me know what else I can provide.

  • Sorry for the delay in response... SOPHOS somehow flagged me as SPAM/abusive and I had to go through the appeal process.  Rage.  You should have received the PM now.  Thanks for taking the time.

  • Can you do some atop logging?

    Have atop write to a log file (not in /tmp), and then we can review it with atop afterwards.

    https://unix.stackexchange.com/questions/276069/how-to-configure-atop-sampling-interval-log-path

    atop -w /var/atop.log should be fine 

    Then you can read it with atop -r /var/atop.log

  • I'll work on this and get back to you.  In the meantime, I just found this in reverseproxy.log:

    statuscode="403" reason="WAF Anomaly" extra="Multipart parser detected a possible unmatched

    [Wed Feb 01 01:44:01.774797 2023] [security2:error] [pid 3121:tid 140183714203392] [client 192.168.0.71:47162] [client 192.168.0.71] ModSecurity: Access denied with code 403 (phase 2). Match of "eq 0" against "MULTIPART_UNMATCHED_BOUNDARY" required. [file "/usr/apache/conf/waf/modsecurity.conf"] [line "14"] [id "200004"] [msg "Multipart parser detected a possible unmatched boundary."] [hostname "<removed>"] [uri "/wp-admin/async-upload.php"] [unique_id "Y9nD3V6YG94qmui-cq4hggAAAJU"], referer: <removed>/.../upload.php

    Currently looking through this:  community.sophos.com/.../false-positive-which-can-t-be-skipped

    From what I've researched, this appears to be a mod_security message, though I'm not sure what causes it or why it would freeze the system.
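
For anyone else digging through reverseproxy.log, here is a rough Python sketch for pulling the bracketed `[key "value"]` fields (rule id, message, URI) out of a ModSecurity error line like the one above. The regex and the helper name `parse_modsec_fields` are my own assumptions based only on this one sample line, not an official log format parser, so treat it as a starting point.

```python
import re

# Sample line copied from the thread, with the wrapped fragments rejoined.
LOG_LINE = (
    '[Wed Feb 01 01:44:01.774797 2023] [security2:error] '
    '[pid 3121:tid 140183714203392] [client 192.168.0.71:47162] '
    '[client 192.168.0.71] ModSecurity: Access denied with code 403 (phase 2). '
    'Match of "eq 0" against "MULTIPART_UNMATCHED_BOUNDARY" required. '
    '[file "/usr/apache/conf/waf/modsecurity.conf"] [line "14"] [id "200004"] '
    '[msg "Multipart parser detected a possible unmatched boundary."] '
    '[hostname "<removed>"] [uri "/wp-admin/async-upload.php"] '
    '[unique_id "Y9nD3V6YG94qmui-cq4hggAAAJU"], referer: <removed>/.../upload.php'
)

# Matches groups of the form [key "value"]; groups without a quoted value
# (timestamp, pid, client) are deliberately skipped.
FIELD_RE = re.compile(r'\[(\w+) "([^"]*)"\]')


def parse_modsec_fields(line):
    """Return the [key "value"] pairs of one log line as a dict."""
    return dict(FIELD_RE.findall(line))


fields = parse_modsec_fields(LOG_LINE)
print(fields["id"], fields["uri"])
```

The `id` field is the value that matters when deciding which filter rule to skip, and `uri` shows which path triggered it.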

  • I have ignored filter id 200004, and so far every video I've tried (up to 650MB) is working fine.  It completes and the server does not freeze or crash.  I want to keep testing this through tomorrow, but so far this is a suitable workaround for me.

    I'd still suggest a defect for the Sophos engineers to look at though.  I'm fine ignoring a filter, but I still worry that I may do something else which brings down the firewall (that shouldn't).  It seems like this has the potential to be critically impactful for others.  Fortunately for me it was just my home impacted.  Happy to provide whatever information necessary to help.

    Thank you all for your help and time in troubleshooting the issue.

  • Interesting, rule 200004 is not part of the OWASP CRS, but comes with ModSecurity itself. It is supposed to catch attacks that try to prevent a WAF from parsing the request correctly and thus allow a bypass. I found several issues reported regarding multipart handling in ModSecurity and they still have bugs open regarding it, so it seems likely that you have run into this.

    Unfortunately I don't see a fix for it that we could pick up with our next ModSecurity update, so you should keep the rule disabled for now. However, I suggest skipping the rule only on /wp-admin and any other site path where you upload files, so as not to open up your site to a potential real attack.

  • I saw all those bugs too.  Lots of suggestions out there for disabling it too, but I didn't want to go down that path. 

    I've added /wp-admin to the path specific routing exceptions list... but that entry does not let me specify a filter to exclude.  How is this done?

  • Yes, unfortunately the UI doesn't allow you to skip a rule on site path level, you can only do that in the protection profile, which is applied to the whole WAF rule.

    What you can do is block access to /wp-admin from any network in the original WAF rule, and create a second rule that only provides access to /wp-admin with another protection policy that has the problematic rule skipped.

    This also allows you to have better control over who can access wp-admin. I guess the best practice would be to totally lock down this path from the Internet and only allow access to it from your internal network or from specific IPs, that way no attacker can access it. But this is just tweaking you can think about, not related to the original problem.