On Sophos UTM 9.2, running on a quemu/libvirt Virtual Machine.
Question:
I'd like to know if restarting confd will resolve my issue or not, or if restarting confd will cause more issues.
/etc/rc.d/confd restart
The high-level error I am getting is in webadmin for perl, after kernel.log records a segfault in libperl.so:
hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at /</var/webadmin/webadmin.plx>core/modules/core_tools.pm line 711.
I would like to resolve this without rebooting the Sophos UTM. I cannot get a maintenance window scheduled soon enough, and need to add configuration to this Sophos UTM.
Summary:
The Sophos UTM has been running without issue for a long time. Then on Sept 25 there was a segfault event recorded in kernel.log. On Sept 26, confd [listener] appears to have started new (restarted). The parent confd [master] process has been running since system boot. Since Sept 28, backups fail to get created. Since Oct 31, logins to webadmin WebUI hang after successful authentication.
These are two issues are the problems I am experiencing in result of the confd.plx error.
(1)
A cron job makes nightly backups at 9:00pm. Sept 27 9:00pm was the last backup I received.
Since Sept 28, I cannot create a new backups. I get the following error:
# backup.plx -o "cfg_20181101_085100.abf"
$VAR1 = [];
$VAR1 = [];
$VAR1 = [];
$VAR1 = [];
$VAR1 = [];
Could not get new backup from storage.
(2)
Since Oct 31, I cannot login to webadmin. I get the following error:
==> /var/log/webadmin.log <==
2018:11:01-08:26:34 sophos-01 webadmin[30244]: |=========================================================================
2018:11:01-08:26:34 sophos-01 webadmin[30244]: I JSON request processing started
2018:11:01-08:26:34 sophos-01 webadmin[30244]:
2018:11:01-08:26:34 sophos-01 webadmin[30244]: 1. main::top-level:184() asg.plx
==> /var/log/confd.log <==
2018:11:01-08:26:36 sophos-01 confd[30310]: I Role::authenticate:146() => id="3106" severity="info" sys="System" sub="confd" name="authentication successful" user="admin" srcip="10.10.10.22" sid="Wb000000000000000018gE" facility="webadmin" client="webadmin.plx" call="new"
==> /var/log/kernel.log <==
2018:11:01-08:26:36 sophos-01 kernel: [92726063.202567] confd.plx[30310]: segfault at 4367bcb4 ip b735ec80 sp bfd75780 error 4 in libperl.so[b72b4000+14d000]
2018:11:01-08:26:36 sophos-01 kernel: [92726063.378456] confd.plx[30318]: segfault at 4367bcb4 ip b735ec80 sp bfd75780 error 4 in libperl.so[b72b4000+14d000]
2018:11:01-08:26:38 sophos-01 kernel: [92726065.443752] confd.plx[30326]: segfault at 4367bcb4 ip b735ec80 sp bfd755f0 error 4 in libperl.so[b72b4000+14d000]
2018:11:01-08:26:38 sophos-01 kernel: [92726065.611573] confd.plx[30334]: segfault at 4367bcb4 ip b735ec80 sp bfd755f0 error 4 in libperl.so[b72b4000+14d000]
==> /var/log/webadmin.log <==
2018:11:01-08:26:38 sophos-01 webadmin[30244]: |=========================================================================
2018:11:01-08:26:38 sophos-01 webadmin[30244]: E [30244] DIED: hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at /</var/webadmin/webadmin.plx>core/modules/core_tools.pm line 711.
2018:11:01-08:26:38 sophos-01 webadmin[30244]:
2018:11:01-08:26:38 sophos-01 webadmin[30244]: 1. main::__ANON__:73() asg.plx
2018:11:01-08:26:38 sophos-01 webadmin[30244]: 2. core::modules::core_tools::obj2json:711() /</var/webadmin/webadmin.plx>core/modules/core_tools.pm
2018:11:01-08:26:38 sophos-01 webadmin[30244]: 3. wfe::asg::modules::asg_cache::cache_objects:114() /</var/webadmin/webadmin.plx>wfe/asg/modules/asg_cache.pm
2018:11:01-08:26:38 sophos-01 webadmin[30244]: 4. main::top-level:265() asg.plx
This is the system information I collected
The "confd [master]" process has been running since system boot.
On Sept 26, the listener child process was started (restarted), though I cannot find any logs showing this.
root 7315 3110 0 Sep26 ? 00:13:21 confd [listener]
The sophos system started receiving segfault errors on Sept 25. Only confd started experiencing segfault. It was quite (no segfaults or other errors) before this.
==> /var/log/kernel <==
2018:09:25-00:13:09 sophos-01 kernel: [89499656.395398] confd.plx[28958]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
... snip - 8 events removed ...
2018:09:25-00:13:50 sophos-01 kernel: [89499697.465577] confd.plx[29030]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:25-00:13:53 sophos-01 kernel: [89499700.734961] show_signal_msg: 8 callbacks suppressed
The segfault errors continued through the time the system reports "confd [listener]" started (restarted).
2018:09:26-00:12:26 sophos-01 kernel: [89586013.288675] confd.plx[27639]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:12:26 sophos-01 kernel: [89586013.394724] confd.plx[27698]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:12:26 sophos-01 kernel: [89586013.407172] confd.plx[27696]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:12:26 sophos-01 kernel: [89586013.452024] confd.plx[27702]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:12:26 sophos-01 kernel: [89586013.452679] confd.plx[27703]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:12:30 sophos-01 kernel: [89586017.723244] show_signal_msg: 3 callbacks suppressed
2018:09:26-00:12:30 sophos-01 kernel: [89586017.723249] confd.plx[27788]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:13:26 sophos-01 kernel: [89586073.660045] confd.plx[27804]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
2018:09:26-00:13:26 sophos-01 kernel: [89586074.014605] confd.plx[27803]: segfault at 64ec0aec ip b734d944 sp bfd75a20 error 4 in li
bperl.so[b72b4000+14d000]
After these segfaults began occurring, starting on Sept 28 (when backups started failing to be created), repeating once a day on Sept 29, 30, Oct 8, Nov 11, the logs recorded segfaults for mdw (the middleware)
2018:09:28-23:33:43 sophos-01 kernel: [89842890.235323] mdw.plx[29991]: segfault at 4367bcb4 ip 4367bcb4 sp bf89ea4c error 4
2018:09:28-23:34:00 sophos-01 kernel: [89842907.309718] 8021q: adding VLAN 0 to HW filter on device eth0
2018:09:28-23:34:00 sophos-01 kernel: [89842907.396024] 8021q: adding VLAN 0 to HW filter on device eth8
2018:09:28-23:34:00 sophos-01 kernel: [89842907.456909] 8021q: adding VLAN 0 to HW filter on device eth1
There were few other segfaults during Sept 01 - Nov 01:
2018:09:23-12:08:35 sophos-01 kernel: [89369782.747679] ulogd[21364]: segfault at 0 ip b76cb103 sp b7329fb0 error 4 in libtcmalloc.so.4.1.0[b76a4000+48000]
2018:09:25-00:38:05 sophos-01 kernel: [89501152.603706] netselector.plx[32729]: segfault at fffffffe ip b7623711 sp bfbe8448 error 5 in libc-2.11.3.so[b75af000+167000]
2018:09:25-03:40:32 sophos-01 kernel: [89512099.104025] smtpd.bin[31429]: segfault at 793e338f ip b6732c27 sp bff73db0 error 4 in libperl.so[b667c000+150000]
2018:10:11-23:14:24 sophos-01 kernel: [90964931.335532] exim[30048]: segfault at 10c67b5 ip b774bbbd sp bfe99990 error 4 in ld-2.11.3.so[b7742000+1f000]
This thread was automatically locked due to age.