This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

Sophos WAF - Disable/Re-enable virtual web server reproducibly fails with "Restarting failed"

Hi,

I've got a problem since approximately 1,5 years ago.  Since then, my WAF isn't reliably starting/restarting. It wasn't the case back in 2018 with the past Sophos UTM firmware versions. I'm now waiting since many firmware upgrade that Sophos fixes this and suspect they are unaware of my problem. So I'm reporting it here.

Scenario a) I disable a virtual web server and then - without config changes - I re-enable it. This often leads to "SIGUSR1 received. Doing graceful restart, Reloading timed out, Restarting failed". 

Scenario b) I reboot the whole UTM via Web UI. Sometimes, all virtual web servers are reachable and sometimes I see the above errors in the logs and they are unreachable.

It does not matter if a) or b) occurs, when I wait about 10 minutes until the Sophos watchdog kicks in and sends me an email like "Sophos WAF not running - restarted" all virtual webservers work fine again after getting this email. The log then says "2020:10:27-14:05:55 firewall httpd[26706]: Started", so that indicates success.

My desperate search for a root cause: I've tried to run with less virtual web servers, putting them off one by one and back on one by one again to find what is causing this problem but even after changing the virtual web server configurations (for testing purposes) I could not work out what's the root cause here. After waiting for the watch dog kicking in, everything works fine no matter which config changes I did. If then just disabling one VWEB server and re-enabling it, the 10 minute delay occurs after the above "Restarting failed" log line. I've also tried to disable NAT/Firewall/etc. rules temporarily. No clue what's causing the behaviour as it's unchanged still there then.

My question: Where can I get detailed apache logs from the box and can Sophos take the logs to analyze this? We are a licensed customer btw.

Failed attempt:

2020:10:27-13:58:03 firewall httpd[25557]: [Tue Oct 27 13:58:03.928961 2020] [core:warn] [pid 25559:tid 4152075968] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-13:58:03 firewall httpd[25557]: [Tue Oct 27 13:58:03.928997 2020] [core:warn] [pid 25559:tid 4152075968] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-13:58:03 firewall httpd[25557]: [Tue Oct 27 13:58:03.932164 2020] [core:warn] [pid 25559:tid 4152075968] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-13:58:03 firewall httpd[25557]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb1] does not exist
2020:10:27-13:58:03 firewall httpd[25557]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb2] does not exist
2020:10:27-13:58:03 firewall httpd[25557]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb3] does not exist
2020:10:27-13:58:03 firewall httpd[25557]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb4] does not exist
2020:10:27-13:58:04 firewall httpd[25557]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb5] does not exist
2020:10:27-13:58:04 firewall httpd[25557]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb6] does not exist
2020:10:27-13:58:04 firewall httpd[6605]: [mpm_worker:notice] [pid 6605:tid 4151686848] AH00297: SIGUSR1 received. Doing graceful restart
2020:10:27-13:59:12 firewall httpd[25805]: Reloading timed out
2020:10:27-13:59:12 firewall httpd[25808]: Restarting failed

Successfull attempt:

2020:10:27-14:05:53 firewall httpd[26509]: Restarting gracefully
2020:10:27-14:05:53 firewall httpd[26513]: Not running
2020:10:27-14:05:53 firewall httpd[26517]: Starting
2020:10:27-14:05:53 firewall httpd[26521]: [Tue Oct 27 14:05:53.874428 2020] [core:warn] [pid 26523:tid 4151817920] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:53 firewall httpd[26521]: [Tue Oct 27 14:05:53.874570 2020] [core:warn] [pid 26523:tid 4151817920] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:53 firewall httpd[26521]: [Tue Oct 27 14:05:53.877497 2020] [core:warn] [pid 26523:tid 4151817920] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:53 firewall httpd[26521]: [Tue Oct 27 14:05:53.877533 2020] [core:warn] [pid 26523:tid 4151817920] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:53 firewall httpd[26521]: [Tue Oct 27 14:05:53.882052 2020] [core:warn] [pid 26523:tid 4151817920] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:53 firewall httpd[26521]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb1] does not exist
2020:10:27-14:05:53 firewall httpd[26521]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb2] does not exist
2020:10:27-14:05:53 firewall httpd[26521]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb3] does not exist
2020:10:27-14:05:53 firewall httpd[26521]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb4] does not exist
2020:10:27-14:05:54 firewall httpd[26521]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb5] does not exist
2020:10:27-14:05:54 firewall httpd[26521]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb6] does not exist
2020:10:27-14:05:54 firewall httpd[26521]: Syntax OK
2020:10:27-14:05:54 firewall httpd[26534]: [Tue Oct 27 14:05:54.422578 2020] [core:warn] [pid 26536:tid 4151994048] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:54 firewall httpd[26534]: [Tue Oct 27 14:05:54.422821 2020] [core:warn] [pid 26536:tid 4151994048] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:54 firewall httpd[26534]: [Tue Oct 27 14:05:54.426238 2020] [core:warn] [pid 26536:tid 4151994048] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:54 firewall httpd[26534]: [Tue Oct 27 14:05:54.426284 2020] [core:warn] [pid 26536:tid 4151994048] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:54 firewall httpd[26534]: [Tue Oct 27 14:05:54.430461 2020] [core:warn] [pid 26536:tid 4151994048] AH00111: Config variable ${URLHardening_HTTP_Hostname} is not defined
2020:10:27-14:05:54 firewall httpd[26534]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb1] does not exist
2020:10:27-14:05:54 firewall httpd[26534]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb2] does not exist
2020:10:27-14:05:54 firewall httpd[26534]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb3] does not exist
2020:10:27-14:05:54 firewall httpd[26534]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb4] does not exist
2020:10:27-14:05:54 firewall httpd[26534]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb5] does not exist
2020:10:27-14:05:54 firewall httpd[26534]: AH00112: Warning: DocumentRoot [/var/www/REF_RevFroVweb6] does not exist
2020:10:27-14:05:54 firewall httpd[26536]: [remoteip:notice] [pid 26536:tid 4151994048] AH03494: RemoteIPProxyProtocol: disabled on 127.0.0.1:4080
2020:10:27-14:05:54 firewall httpd[26536]: [security2:notice] [pid 26536:tid 4151994048] ModSecurity for Apache/2.9.3 (http://www.modsecurity.org/) configured.
2020:10:27-14:05:54 firewall httpd[26540]: [remoteip:notice] [pid 26540:tid 4151994048] AH03494: RemoteIPProxyProtocol: disabled on 127.0.0.1:4080
2020:10:27-14:05:54 firewall httpd[26540]: [mpm_worker:notice] [pid 26540:tid 4151994048] AH00292: Apache/2.4.39 (Unix) OpenSSL/1.0.2j-fips configured -- resuming normal operations
2020:10:27-14:05:54 firewall httpd[26540]: [core:notice] [pid 26540:tid 4151994048] AH00094: Command line: '/usr/apache/bin/httpd'
2020:10:27-14:05:55 firewall httpd[26706]: Started



This thread was automatically locked due to age.
Parents
  • Hello Polygon-tre,

    Thank you for contacting the Sophos Community!

    To open a case with Support you would need to provide basically the reverseproxy.log 

    Also since it seems you are able to reproduce this issue make sure to include the information of the following commands:

    ps aux | grep /usr/apache/bin/httpd | grep -v 'grep /usr/apache/bin/httpd' (Run this probably after you make a change)
    netstat -naltp | grep -E ':80|:443' (Check the port WAF is listening to)
    /var/mdw/scripts/reverseproxy start (This will manually restart WAF)
    Verify if the network socket is created for the WAF (netstat -natup |grep httpd)
    Verify that no other service is running on Port 80 or 443 at the UTM

    Also make sure you follow the KB below when opening the case with Support!

    https://community.sophos.com/support-portal/f/recommended-reads/123488/best-practices-when-opening-a-case-with-sophos-support

    I guess you refer that in previous versions the WAF won't take that long to restart?

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
    Sophos Support VideosProduct Documentation  |  @SophosSupport  | Sign up for SMS Alerts
    If a post solves your question use the 'Verify Answer' link.
Reply
  • Hello Polygon-tre,

    Thank you for contacting the Sophos Community!

    To open a case with Support you would need to provide basically the reverseproxy.log 

    Also since it seems you are able to reproduce this issue make sure to include the information of the following commands:

    ps aux | grep /usr/apache/bin/httpd | grep -v 'grep /usr/apache/bin/httpd' (Run this probably after you make a change)
    netstat -naltp | grep -E ':80|:443' (Check the port WAF is listening to)
    /var/mdw/scripts/reverseproxy start (This will manually restart WAF)
    Verify if the network socket is created for the WAF (netstat -natup |grep httpd)
    Verify that no other service is running on Port 80 or 443 at the UTM

    Also make sure you follow the KB below when opening the case with Support!

    https://community.sophos.com/support-portal/f/recommended-reads/123488/best-practices-when-opening-a-case-with-sophos-support

    I guess you refer that in previous versions the WAF won't take that long to restart?

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
    Sophos Support VideosProduct Documentation  |  @SophosSupport  | Sign up for SMS Alerts
    If a post solves your question use the 'Verify Answer' link.
Children
  • When it's running fine I see:

    ps aux | grep /usr/apache/bin/httpd | grep -v 'grep /usr/apache/bin/httpd'
    root 11161 0.0 0.4 52608 40596 ? Ss 09:08 0:00 /usr/apache/bin/httpd -k start
    nobody 11184 0.0 0.5 160444 41300 ? Sl 09:08 0:00 /usr/apache/bin/httpd -k start

    After I reproduced the failure, e.g. by changing a setting to trigger the reload:

    ## It's in failed state here - I cannot reach my virtual web servers.
    ps aux | grep /usr/apache/bin/httpd | grep -v 'grep /usr/apache/bin/httpd'
    root 15677 0.1 0.6 61156 49212 ? Ss 09:19 0:00 /usr/apache/bin/httpd -k start
    nobody 15715 0.0 0.5 488756 49084 ? Sl 09:19 0:00 /usr/apache/bin/httpd -k start
    firewall:/root # /var/mdw/scripts/reverseproxy restart
    httpd[16286]: Stopping
    (...)
    httpd[16291]: Syntax OK
    (...)
    httpd[16306]: Stopped
    httpd[16310]: Starting
    (...)
    httpd[16314]: Syntax OK
    (...)
    httpd[16493]: Started
    ## It's running fine (always) after I manually entered the "reverseproxy restart" command.
    firewall:/root # ps aux | grep /usr/apache/bin/httpd | grep -v 'grep /usr/apache/bin/httpd'
    root 16326 4.4 0.6 61156 49204 ? Ss 09:20 0:00 /usr/apache/bin/httpd -k start
    nobody 16331 0.0 0.5 169140 47324 ? S 09:20 0:00 /usr/apache/bin/httpd -k start

  • When it's running fine, I see:

    # netstat -naltp | grep -E ':80|:443'
    ## only submitting LISTEN rows
    tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 31773/httpproxy
    tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 4610/openvpn
    tcp 0 0 :::8080 :::* LISTEN 31773/httpproxy

    # netstat -natup |grep httpd
    tcp 0 0 WAN_IP:449 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 127.0.0.1:4080 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 0.0.0.0:4501 0.0.0.0:* LISTEN 4626/httpd
    tcp 0 0 0.0.0.0:4502 0.0.0.0:* LISTEN 4626/httpd
    tcp 0 0 WAN_IP:440 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 WAN_IP:441 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 WAN_IP:442 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 WAN_IP:20443 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 0.0.0.0:4443 0.0.0.0:* LISTEN 4626/httpd
    tcp 0 0 WAN_IP:444 0.0.0.0:* LISTEN 16326/httpd
    tcp 0 0 0.0.0.0:4444 0.0.0.0:* LISTEN 4626/httpd
    tcp 0 0 :::4501 :::* LISTEN 4626/httpd
    tcp 0 0 :::4502 :::* LISTEN 4626/httpd
    tcp 0 0 :::4444 :::* LISTEN 4626/httpd

  • Hello Polygon,

    Thank you for the output of the commands.

    Please open a case with this info, to get it investigated.

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
    Sophos Support VideosProduct Documentation  |  @SophosSupport  | Sign up for SMS Alerts
    If a post solves your question use the 'Verify Answer' link.
  • Thanks, case ID is # 03343099

  • Hello Polygon,

    Thank you for the Case ID.

    Regards,


     
    Emmanuel (EmmoSophos)
    Community Support Engineer | Sophos Technical Support
    Sophos Support VideosProduct Documentation  |  @SophosSupport  | Sign up for SMS Alerts
    If a post solves your question use the 'Verify Answer' link.