This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

Pfilter reporting process not starting, causes high CPU usage

Hi

I have setup Sophos UTM in HA mode running on ESXi servers to my home network. This environment was working fine for several months but couple weeks ago I noticed that I am not getting any reporting data from Firewall module, all other modules were reporting data fine. Also CPU usage was increased to near to max and UTM's performance was decreased.

I investigated the issue and I found out that process /usr/bin/perl /usr/local/bin/reporter/pfilter-reporter.pl was generating high CPU. I tried to test some instructions what I found (delete files /var/log/reporting/rrd/pfilter.rrd* and /var/log/reporting/accu/pfilter.accu) but that didn't help. Rebooting both nodes didn't help either. If I switched another node to active, the same problem started with that active node.

I was using version 9.355 when this problem started. When 9.400 came out I tested update to it using manual update package. That didn't solve the issue.

I decided now to reinstall the whole system, without the HA feature at first until I will get this issue fixed. I took configuration backup, downloaded 9.400 installation disc and reinstalled the system. But as soon as I imported old configuration back, this same issue started again.

Here is what I have found out so far.

Top CPU usage processes from Support -> Advanced -> Process list:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      7874 54.8  0.0      0     0 ?        Z    09:44   0:00      \_ [confd.plx] <defunct>
root      7875 77.7  0.2  15404 11432 ?        R    09:44   0:00      \_ /usr/bin/perl /usr/local/bin/reporter/pfilter-reporter.pl
root      6916 56.1  0.0   4660  1484 ?        RN   09:42   0:56  \_ /usr/bin/openssl dhparam -out /var/sec/chroot-openvpn/etc/openv

This openssl process CPU usage is new problem after reinstall, probably something related to importing old backup back with all RSA keys etc.

/var/log/confd-debug.log is increasing almost 10 MB per minute

# ls -l /var/log/confd-debug.log && sleep 60 && ls -l /var/log/confd-debug.log
-rw-r--r-- 1 root log 1649459265 Apr  9 12:07 /var/log/confd-debug.log
-rw-r--r-- 1 root log 1657833829 Apr  9 12:08 /var/log/confd-debug.log

# tail /var/log/confd-debug.log
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="get_object"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::AUTOLOAD:302() => id="3100" severity="debug" sys="System" sub="confd" name="external call" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" lock="none" method="detach"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::_detach:1667() => id="3100" severity="debug" sys="System" sub="confd" name="detach" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl" call="detach"
2016:04:09-12:09:38 sophos-utm confd[32207]: D sys::DESTROY:230() => id="3100" severity="debug" sys="System" sub="confd" name="worker process exiting" user="system" srcip="127.0.0.1" facility="system" client="pfilter-reporter.pl"

It seems that pfilter reporter is not starting at all and it's files are not updating. Log /var/log/debug/log.pfilter-reporter is full of only these lines.

# mkdir /var/log/debug
# tail /var/log/debug/log.pfilter-reporter
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting
*** pfilter-reporter starting

# date
Sat Apr  9 12:11:13 EEST 2016
# ls -lh /var/log/reporting/rrd/pfilter.*
-rw-r--r-- 1 rrdcache rrdcache  344 Apr  8 17:29 /var/log/reporting/rrd/pfilter.pst
-rw-r--r-- 1 root     root        0 Apr  8 17:29 /var/log/reporting/rrd/pfilter.pst.lock
-rw-r--r-- 1 rrdcache rrdcache 230K Apr  9 08:46 /var/log/reporting/rrd/pfilter.rrd
-rw-r--r-- 1 root     root        0 Apr  8 17:29 /var/log/reporting/rrd/pfilter.rrd.lock

# ls -lh /var/log/reporting/accu/
total 28K
-rw-r--r-- 1 root root 5.6K Apr  9 12:07 admin.accu
-rw-r--r-- 1 root root 1.1K Apr  9 12:07 ipsevent.accu
-rw-r--r-- 1 root root 2.2K Apr  9 12:07 mailsec.accu
-rw-r--r-- 1 root root 1.8K Apr  9 08:39 pfilter.accu
-rw-r--r-- 1 root root 1.3K Apr  9 12:07 waf.accu
-rw-r--r-- 1 root root 3.4K Apr  9 12:08 websec-json.accu

I have also tried to disable reporting from Logging & Reporting -> Reporting settings -> Removing checkbox from "Firewall" section. Rebooted but still this same issue continues and reporter tries to start.

Does anybody have any suggestions what to try next?



This thread was automatically locked due to age.
  • For the openssl CPU usage I found this using ps:

    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    root      6914  0.0  0.0   4808  1544 ?        SNs  09:42   0:00 /usr/bin/perl -we ??    use Fcnt
    l qw(:flock);??    use POSIX qw(setsid nice);??    my $bits = shift || 2048;??    my $outdir = sh
    ift || "/tmp";??    sub error { print shift() . "\n"; exit 1; }??    error "Unsupported DH parame
    ter size, must be 1024|2048|3072|4096"???unless $bits =~ /^(1024|2048|3072|4096)$/;??    fork and
     exit;??    setsid;??    umask 0;??    chdir "/";??    my $lockfile = "/var/lock/ovpn_dhgen_$bits
    .lock";??    open my $lock_fh, ">$lockfile"???or error "Could not open lockfile $lockfile: $!";??
        error "Generation of local $bits bit DH parameters is already in progress"???unless flock $lo
    ck_fh, LOCK_EX|LOCK_NB;??    print "Starting generation of local DH parameters ($bits bit) in bac
    kground\n";??    nice 19;??    open STDIN, "</dev/null";??    open STDOUT, ">/dev/null";??    ope
    n STDERR, ">/dev/null";??    system qq(/usr/bin/openssl dhparam -out "${outdir}/dh${bits}.local.p
    em" $bits)?? 4096 /var/sec/chroot-openvpn/etc/openvpn
    root      6916 63.5  0.0   4660  1484 ?        RN   09:42 100:31 /usr/bin/openssl dhparam -out /v
    ar/sec/chroot-openvpn/etc/openvpn/dh4096.local.pem 4096

    I should probably try to regenerate some keys?

  • Hi, Jan, and welcome to the UTM Community!

    Since you recently installed, you have nothing to lose by rebuilding the PostgreSQL databases:

    /etc/init.d/postgresql92 rebuild

    Do you still have problems after that?

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Thanks for suggestion but unfortunately that didn't help. I also rebooted UTM after that but that didn't made any difference. Pfilter-reporter is still just trying to start and looping infinitely.

  • i've seen some kind of same behaviors while swapping hardware (new installation) and restoring a backup from the former platform. openssl was taking 39% of cpu. I've got around this by "simply"tunring off the openssl server and back on again...

    Cheers,
    m.