Hey Community,
We have a Nextcloud file-sharing server (preconfigured image from hanssonIT.se) running behind our Sophos UTM (current software version). The webserver is published through the WAF, which works in most cases.
Since last week, we have seen that visiting certain pages in the system makes the site crash and become unavailable. After 2-3 minutes the site is back again, and I can even load that specific page. But if I click the link a second time, the site crashes again.
When it crashes, it is still available over the internal IP address/hostname, so the server itself is working, but the Sophos is not forwarding the request (503 - Service Unavailable). In some cases, the green check sign in the "Virtual Webserver" area turns yellow and says "In Error".
What I have found so far:
- the pages that crash seem to load items from the internet - one page checks for version updates, the other one shows the "App Store", where you can install additional apps from the webstore.
- external IP is still responding to ping when it crashes.
- I cannot find anything corresponding in the logs on the webserver - not in the Nextcloud logs, not in the PHP logs, not in the Apache logs.
- Sophos WAF log shows:
2020:01:29-15:55:43 gate-2 httpd[16582]: [cookie:error] [pid 16582:tid 3833527152] [client (MY_CLIENT_IP):54874] No signature found, cookie: _ga
2020:01:29-15:55:43 gate-2 httpd[16582]: [cookie:warn] [pid 16582:tid 3833527152] [client (MY_CLIENT_IP):54874] Dropping cookie '_ga' from request due to missing/invalid signature
2020:01:29-15:56:26 gate-2 httpd[16582]: [proxy_http:error] [pid 16582:tid 3833527152] (104)Connection reset by peer: [client (MY_CLIENT_IP):54874] AH01110: error reading response
2020:01:29-15:58:00 gate-2 httpd[16582]: [proxy:error] [pid 16582:tid 3816741744] (110)Connection timed out: AH00957: HTTPS: attempt to connect to (WEBSERVER_IP):443 (WEBSERVER_INTERNAL_HOSTNAME) failed
2020:01:29-15:58:00 gate-2 httpd[16582]: [proxy:error] [pid 16582:tid 3816741744] AH00959: ap_proxy_connect_backend disabling worker for (datashare.ziegler.local) for 60s
2020:01:29-15:58:00 gate-2 httpd[16582]: [proxy_http:error] [pid 16582:tid 3816741744] [client (MY_CLIENT_IP):54877] AH01114: HTTP: failed to make connection to backend: (WEBSERVER_INTERNAL_HOSTNAME)
- since only pages that access the internet seem affected, I set a proxy exception (we use a transparent proxy) for the webserver and added an "Allow Any to Internet" rule for the internal IP.
- we updated the Nextcloud software a few weeks ago, so to rule that out I reverted the system to a backup from before the update (it is a VM, so I reset the whole machine to its earlier state), without success.
- it is also possible that the behaviour started with the update of the UTM to 9.700-5, which we did last week.
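To line up the timing of the proxy errors in the WAF log, a small parser can help. This is a minimal sketch that assumes the log format is exactly as in the excerpt above; the names `parse_waf_line` and `backend_events` are made up for illustration, and the sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

# Timestamp, host, httpd[pid], then the Apache "[module:level]" tag.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) \S+ httpd\[\d+\]: "
    r"\[(?P<module>\w+):(?P<level>\w+)\]"
)
# Apache error identifiers such as AH01110.
CODE_RE = re.compile(r"\b(AH\d{5})\b")

def parse_waf_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    code = CODE_RE.search(line)
    return {
        "time": datetime.strptime(m.group("ts"), "%Y:%m:%d-%H:%M:%S"),
        "module": m.group("module"),
        "level": m.group("level"),
        "code": code.group(1) if code else None,
    }

def backend_events(lines):
    """Keep only entries logged by the proxy modules (proxy, proxy_http)."""
    parsed = (parse_waf_line(line) for line in lines)
    return [e for e in parsed if e and e["module"].startswith("proxy")]

SAMPLE = [
    "2020:01:29-15:55:43 gate-2 httpd[16582]: [cookie:warn] [pid 16582:tid 3833527152] [client (MY_CLIENT_IP):54874] Dropping cookie '_ga' from request due to missing/invalid signature",
    "2020:01:29-15:56:26 gate-2 httpd[16582]: [proxy_http:error] [pid 16582:tid 3833527152] (104)Connection reset by peer: [client (MY_CLIENT_IP):54874] AH01110: error reading response",
    "2020:01:29-15:58:00 gate-2 httpd[16582]: [proxy:error] [pid 16582:tid 3816741744] (110)Connection timed out: AH00957: HTTPS: attempt to connect to (WEBSERVER_IP):443 (WEBSERVER_INTERNAL_HOSTNAME) failed",
    "2020:01:29-15:58:00 gate-2 httpd[16582]: [proxy:error] [pid 16582:tid 3816741744] AH00959: ap_proxy_connect_backend disabling worker for (datashare.ziegler.local) for 60s",
]

events = backend_events(SAMPLE)
gap = (events[1]["time"] - events[0]["time"]).total_seconds()
print([e["code"] for e in events], gap)
```

On the excerpt, this shows the connection reset (AH01110) first, then 94 seconds later the connect timeout (AH00957) and the worker being disabled for 60s (AH00959). That 60s disable would explain at least part of the time the site stays unavailable.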
Screenshots of the WAF are attached. I also tried to play with the settings of the firewall profile: used different profiles with different settings (especially the cookie signing), tried different exceptions for the connection - without success.
I do not know where to look next, and I really don't understand what is happening. Maybe someone has an idea how to fix this.
Thanks for any help or advice!
Regards,
Tobias