Logging stops, 100% cpu

Version 9.314-13 on Hyper-V 2012 R2

After a period of 1-3 days, all logging stops and WebAdmin indicates 100% CPU usage (top only indicates ~37% to start with, but it seems to accumulate over a few days and slow the system). The only solution is to reboot.

I have tried: restarting syslog-ng and ulogd; repairing postgres (gives a bunch of "cannot remove" read-only errors when trying to rm the old stuff); and installing from a fresh ISO and restoring the backup config (ran great for 2 days, then same deal).

Stopping syslog-ng gets rid of the 100% CPU use in WebAdmin.

The system otherwise continues functioning without logging. I have gotten a couple of application control daemon restart mail notifications well after logging has failed.

The kernel log spat this out 4 times before the last log cutoff. It's the only suspicious thing.

2015:08:23-23:34:45 sophos kernel: [183447.776236] storvsc: Sense Key : 0x6 [current] 
2015:08:23-23:34:45 sophos kernel: [183447.776240] storvsc: ASC=0x3f ASCQ=0x2
2015:08:23-23:34:45 sophos kernel: [183447.776603] sd 2:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automatically adjust these parameters.
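
For reference, the sense data in those kernel lines can be decoded by hand. Below is a minimal Python sketch, assuming the log format shown above. The function name `decode_sense` is made up for illustration, and the lookup tables hold only a few entries from the standard SCSI sense code lists, so anything else is reported as unknown.

```python
import re

# Partial lookup tables based on the standard SCSI sense code lists.
# Only a handful of entries are included here (illustrative, not complete).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x3F, 0x00): "TARGET OPERATING CONDITIONS HAVE CHANGED",
    (0x3F, 0x01): "MICROCODE HAS BEEN CHANGED",
    (0x3F, 0x02): "CHANGED OPERATING DEFINITION",
    (0x3F, 0x03): "INQUIRY DATA HAS CHANGED",
}

KEY_RE = re.compile(r"Sense Key\s*:\s*(0x[0-9a-fA-F]+)")
ASC_RE = re.compile(r"ASC=(0x[0-9a-fA-F]+)\s+ASCQ=(0x[0-9a-fA-F]+)")


def decode_sense(lines):
    """Return (sense key text, ASC/ASCQ text) for storvsc kernel log lines."""
    key = asc = ascq = None
    for line in lines:
        m = KEY_RE.search(line)
        if m:
            key = int(m.group(1), 16)
        m = ASC_RE.search(line)
        if m:
            asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
    return (
        SENSE_KEYS.get(key, "unknown"),
        ASC_ASCQ.get((asc, ascq), "unknown"),
    )


sample = [
    "2015:08:23-23:34:45 sophos kernel: [183447.776236] storvsc: Sense Key : 0x6 [current] ",
    "2015:08:23-23:34:45 sophos kernel: [183447.776240] storvsc: ASC=0x3f ASCQ=0x2",
]
print(decode_sense(sample))
```

For the two lines above this gives UNIT ATTENTION with CHANGED OPERATING DEFINITION, which lines up with the kernel warning about changed operating parameters rather than pointing at a media error.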

BAlfson over 9 years ago

This just feels like a "hardware" issue or a problem with Hyper-V.

Cheers - Bob

Sophos UTM Community Moderator
Sophos Certified Architect - UTM
Sophos Certified Engineer - XG
Gold Solution Partner since 2005

MediaSoft, Inc. USA

Karstedt over 9 years ago in reply to BAlfson

Is there any way it could be caused by the Ethernet hardware (it looks like a disk error)? I previously had an issue with Broadcom + Hyper-V, but after some updates and tweaks those problems went away.

It does seem to happen in the middle of the night... I wonder if my backup is screwing it up.

William Warren over 9 years ago

First off, which Hyper-V are you running?

Give full details, please; then I should be able to help.

I run Hyper-V with UTM without problems. There is a burp at midnight when my VMs are being backed up because of limitations of UTM, but nothing like what you are seeing.

Owner: Emmanuel Technology Consulting

http://etc-md.com

Former Sophos SG(Astaro) advocate/researcher/Silver Partner

PfSense w/Suricata, ntopng,

Other addons to follow

Karstedt over 9 years ago in reply to William Warren

I'm using Hyper-V 6.3.6900 on a 2012 R2 full install. The guest is Gen 1 and says it's v5.0. The VM is configured pretty similarly to one I have at another location, just with different hardware and a few different services on the host.

William Warren over 9 years ago

What exactly are the differences between the two hosts?

Karstedt over 9 years ago in reply to William Warren

The system running perfectly is a Gen8 ProLiant ML310; the one having trouble is a PowerEdge R620.

Host: HP - Dell
NICs: Intel - Broadcom
RAID: RAID 1 on B120i - RAID 5 on PERC 710 Mini

I updated the PERC firmware a few days ago and it went down last night, but differently. Logging didn't fail, which is nice.

The boot log suggests postgres failed the previous afternoon, and it can't connect to its reporting DB after reboot. I had to repair it to get it running again.

2015:09:01-08:57:06 sophos [local0:info] postgres[3606]:  [1-1] LOG:  loaded library "pg_stat_statements"
2015:09:01-08:57:06 sophos [local0:info] postgres[3607]:  [2-1] LOG:  database system was interrupted; last known up at 2015-08-31 13:30:06 GMT
2015:09:01-08:57:06 sophos [local0:info] postgres[3607]:  [3-1] LOG:  database system was not properly shut down; automatic recovery in progress
2015:09:01-08:57:07 sophos [local0:info] postgres[3607]:  [4-1] LOG:  redo starts at 0/203D2C4
2015:09:01-08:57:07 sophos [local0:info] postgres[3607]:  [5-1] LOG:  record with zero length at 0/203D348
2015:09:01-08:57:07 sophos [local0:info] postgres[3607]:  [6-1] LOG:  redo done at 0/203D2F4
2015:09:01-08:57:07 sophos [local0:info] postgres[3606]:  [2-1] LOG:  database system is ready to accept connections
2015:09:01-08:57:07 sophos [local0:info] postgres[3611]:  [2-1] LOG:  autovacuum launcher started
2015:09:01-08:57:07 sophos [authpriv:info] su:  pam_unix(su:session): session closed for user postgres
2015:09:01-08:57:07 sophos [local0:err] postgres[3637]:  [3-1] FATAL:  database "reporting" does not exist
2015:09:01-08:57:07 sophos [daemon:notice] ulogd[3640]:  ulogd running
2015:09:01-08:57:07 sophos [local0:err] postgres[3643]:  [3-1] FATAL:  database "reporting" does not exist
2015:09:01-08:57:07 sophos [daemon:err] ulogd[3640]:  pg1: connect: FATAL:  database "reporting" does not exist
2015:09:01-08:57:08 sophos [local0:err] postgres[3701]:  [3-1] FATAL:  database "reporting" does not exist
2015:09:01-08:57:08 sophos [daemon:err] ulogd[3640]:  pg1: connect: FATAL:  database "reporting" does not exist
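
When the log gets long, it can help to pull out just the "last known up" time and the repeated error messages. Below is a rough Python sketch, assuming the line layout seen in the boot log above (timestamp, host, `[facility:level]`, process, message). The names `LINE_RE` and `summarize` are made up for illustration and are not part of any Sophos tooling.

```python
import re
from collections import Counter

# Assumed layout: "<timestamp> <host> [<facility>:<level>] <proc>[<pid>]: <message>"
# The "[<pid>]" part is optional (the su line above has none).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"\[(?P<facility>\w+):(?P<level>\w+)\]\s+"
    r"(?P<proc>[\w-]+)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Return (last known up time or None, Counter of (process, error text))."""
    last_up = None
    errors = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything in another format
        msg = m.group("msg")
        up = re.search(r"last known up at (.+)$", msg)
        if up:
            last_up = up.group(1)
        if m.group("level") == "err":
            # Keep only the text after "FATAL:" when present, so the same
            # error logged by different processes is easy to compare.
            fatal = re.search(r"FATAL:\s+(.*)$", msg)
            errors[(m.group("proc"), fatal.group(1) if fatal else msg)] += 1
    return last_up, errors


sample = [
    '2015:09:01-08:57:06 sophos [local0:info] postgres[3607]:  [2-1] LOG:  database system was interrupted; last known up at 2015-08-31 13:30:06 GMT',
    '2015:09:01-08:57:07 sophos [authpriv:info] su:  pam_unix(su:session): session closed for user postgres',
    '2015:09:01-08:57:07 sophos [local0:err] postgres[3637]:  [3-1] FATAL:  database "reporting" does not exist',
    '2015:09:01-08:57:07 sophos [daemon:err] ulogd[3640]:  pg1: connect: FATAL:  database "reporting" does not exist',
    '2015:09:01-08:57:08 sophos [local0:err] postgres[3701]:  [3-1] FATAL:  database "reporting" does not exist',
]
last_up, errors = summarize(sample)
print(last_up)
for (proc, text), count in errors.items():
    print(proc, count, text)
```

On the sample lines this reports the last known up time from the recovery message and groups the missing "reporting" database errors by process, which makes it easier to see that postgres and ulogd are complaining about the same thing.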

Then the interface failed at 0:30, accompanied by a constant 100% CPU spike (not unlike my previous problem with Broadcom + Hyper-V, which appeared to be fixed after updating and disabling VMQ about a month back).