Hi
I have setup Sophos UTM in HA mode running on ESXi servers to my home network. This environment was working fine for several months but couple weeks ago I noticed that I am not getting any reporting data from Firewall module, all other modules were reporting data fine. Also CPU usage was increased to near to max and UTM's performance was decreased.
I investigated the issue and I found out that process /usr/bin/perl /usr/local/bin/reporter/pfilter-reporter.pl was generating high CPU. I tried to test some instructions what I found (delete files /var/log/reporting/rrd/pfilter.rrd* and /var/log/reporting/accu/pfilter.accu) but that didn't help. Rebooting both nodes didn't help either. If I switched another node to active, the same problem started with that active node.
I was using version 9.355 when this problem started. When 9.400 came out I tested update to it using manual update package. That didn't solve the issue.
I decided now to reinstall the whole system, without the HA feature at first until I will get this issue fixed. I took configuration backup, downloaded 9.400 installation disc and reinstalled the system. But as soon as I imported old configuration back, this same issue started again.
Here is what I have found out so far.
Top CPU usage processes from Support -> Advanced -> Process list:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 7874 54.8 0.0 0 0 ? Z 09:44 0:00 \_ [confd.plx] <defunct>
root 7875 77.7 0.2 15404 11432 ? R 09:44 0:00 \_ /usr/bin/perl /usr/local/bin/reporter/pfilter-reporter.pl
root 6916 56.1 0.0 4660 1484 ? RN 09:42 0:56 \_ /usr/bin/openssl dhparam -out /var/sec/chroot-openvpn/etc/openv
This openssl process CPU usage is new problem after reinstall, probably something related to importing old backup back with all RSA keys etc.
/var/log/confd-debug.log is increasing almost 10 MB per minute
# ls -l /var/log/confd-debug.log && sleep 60 && ls -l /var/log/confd-debug.log -rw-r--r-- 1 root log 1649459265 Apr 9 12:07 /var/log/confd-debug.log -rw-r--r-- 1 root log 1657833829 Apr 9 12:08 /var/log/confd-debug.log
# tail /var/log/confd-debug.log 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="detach" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::_detach:1667() => id="3100" severity="debug" sys="System" sub="confd" name="detach" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" call="detach" 2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::DESTROY:230() => id="3100" severity="debug" sys="System" sub="confd" name="worker process exiting" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl"
It seems that pfilter reporter is not starting at all and it's files are not updating. Log /var/log/debug/log.pfilter-reporter is full of only these lines.
# mkdir /var/log/debug # tail /var/log/debug/log.pfilter-reporter *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting *** pfilter-reporter starting
# date Sat Apr 9 12:11:13 EEST 2016 # ls -lh /var/log/reporting/rrd/pfilter.* -rw-r--r-- 1 rrdcache rrdcache 344 Apr 8 17:29 /var/log/reporting/rrd/pfilter.pst -rw-r--r-- 1 root root 0 Apr 8 17:29 /var/log/reporting/rrd/pfilter.pst.lock -rw-r--r-- 1 rrdcache rrdcache 230K Apr 9 08:46 /var/log/reporting/rrd/pfilter.rrd -rw-r--r-- 1 root root 0 Apr 8 17:29 /var/log/reporting/rrd/pfilter.rrd.lock # ls -lh /var/log/reporting/accu/ total 28K -rw-r--r-- 1 root root 5.6K Apr 9 12:07 admin.accu -rw-r--r-- 1 root root 1.1K Apr 9 12:07 ipsevent.accu -rw-r--r-- 1 root root 2.2K Apr 9 12:07 mailsec.accu -rw-r--r-- 1 root root 1.8K Apr 9 08:39 pfilter.accu -rw-r--r-- 1 root root 1.3K Apr 9 12:07 waf.accu -rw-r--r-- 1 root root 3.4K Apr 9 12:08 websec-json.accu
I have also tried to disable reporting from Logging & Reporting -> Reporting settings -> Removing checkbox from "Firewall" section. Rebooted but still this same issue continues and reporter tries to start.
Does anybody have any suggestions what to try next?
This thread was automatically locked due to age.