Logging stops, 100% cpu

ver 9.314-13 on Hyper-V 2012 R2

After 1-3 days, all logging stops and WebAdmin shows 100% CPU usage (top shows only ~37% at first, but the load seems to build up over a few days and slow the system). The only fix is a reboot.
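One way to narrow down a mismatch like this between WebAdmin and top is to look at how the CPU time splits across user, system, iowait and so on. The following is a minimal sketch, not something from the original post: it assumes shell access to the box and a standard Linux `/proc/stat`, and the function name `cpu_breakdown` is hypothetical. It takes two samples of the aggregate `cpu` line and reports each field's share of the elapsed time.

```python
# Field order of the aggregate "cpu" line in /proc/stat (first 8 columns).
# Guest time is already included in "user", so later columns are ignored.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]

def parse_cpu_line(line):
    """Turn a 'cpu  ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))

def cpu_breakdown(before, after):
    """Percentage of elapsed ticks spent in each state between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    deltas = {name: b[name] - a[name] for name in a}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * d / total for name, d in deltas.items()}

def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate counters)."""
    with open(path) as f:
        return f.readline()
```

To use it, call `read_cpu_line()` twice a few seconds apart and pass both results to `cpu_breakdown`. A large `iowait` or `steal` share next to a modest `user` share would be one possible explanation for the two tools disagreeing, but that is a guess to be checked, not a diagnosis.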

I have tried: restarting syslog-ng and ulogd; repairing postgres (which gives a bunch of "cannot remove, read only" errors when trying to rm the old data); and installing from a fresh ISO and restoring the backup config (ran great for 2 days, then the same problem).

Stopping syslog-ng gets rid of the 100% CPU usage shown in WebAdmin.

The system otherwise continues functioning, just without logging. I have received a couple of application control daemon restart mail notifications well after logging failed.

The kernel log spat this out 4 times before the last log cutoff. It's the only suspicious thing I found.

2015:08:23-23:34:45 sophos kernel: [183447.776236] storvsc: Sense Key : 0x6 [current] 
2015:08:23-23:34:45 sophos kernel: [183447.776240] storvsc: ASC=0x3f ASCQ=0x2
2015:08:23-23:34:45 sophos kernel: [183447.776603] sd 2:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automatically adjust these parameters.
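For reference, the sense data in those kernel lines can be decoded by hand. The sketch below is a minimal Python example and not part of the original post; the lookup tables are deliberately partial and the name `decode_storvsc` is hypothetical. It uses the standard SCSI code assignments, under which sense key 0x6 is UNIT ATTENTION and ASC/ASCQ 0x3f/0x02 is CHANGED OPERATING DEFINITION. That is consistent with the "operating parameters on this target have changed" warning on the third line.

```python
import re

# Partial lookup tables: only the codes relevant to the log lines above.
SENSE_KEYS = {0x6: "UNIT ATTENTION"}
ASC_ASCQ = {
    (0x3F, 0x00): "TARGET OPERATING CONDITIONS HAVE CHANGED",
    (0x3F, 0x01): "MICROCODE HAS BEEN CHANGED",
    (0x3F, 0x02): "CHANGED OPERATING DEFINITION",
    (0x3F, 0x03): "INQUIRY DATA HAS CHANGED",
}

SENSE_RE = re.compile(r"Sense Key : (0x[0-9a-fA-F]+)")
ASC_RE = re.compile(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)")

def decode_storvsc(lines):
    """Return (sense key name, ASC/ASCQ description) found in kernel log lines."""
    key = asc = ascq = None
    for line in lines:
        m = SENSE_RE.search(line)
        if m:
            key = int(m.group(1), 16)
        m = ASC_RE.search(line)
        if m:
            asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
    return (
        SENSE_KEYS.get(key, "unknown"),
        ASC_ASCQ.get((asc, ascq), "unknown"),
    )

# The two storvsc lines quoted in the post.
SAMPLE = [
    "2015:08:23-23:34:45 sophos kernel: [183447.776236] storvsc: Sense Key : 0x6 [current]",
    "2015:08:23-23:34:45 sophos kernel: [183447.776240] storvsc: ASC=0x3f ASCQ=0x2",
]
```

Calling `decode_storvsc(SAMPLE)` gives the two names for the values in the post. A UNIT ATTENTION of this kind is a notification from the (virtual) disk rather than an I/O error in itself, so it may or may not be related to the logging failure.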

This thread was automatically locked due to age.

William Warren over 9 years ago

What exactly are the differences between the two hosts?

Owner: Emmanuel Technology Consulting

http://etc-md.com

Former Sophos SG(Astaro) advocate/researcher/Silver Partner

PfSense w/Suricata, ntopng,

Other addons to follow

Karstedt over 9 years ago in reply to William Warren

The system running perfectly is a Gen8 ProLiant ML310; the one having trouble is a PowerEdge 620r.

Vendor: HP - Dell
NICs: Intel - Broadcom
RAID: 1 on B120i - 5 on PERC 710 mini

I updated the PERC firmware a few days ago and the system went down last night, but differently. Logging didn't fail this time, which is nice.

The boot log suggests postgres failed the previous afternoon, and after the reboot it couldn't connect to its reporting DB. I had to run a repair to get it going again.

2015:09:01-08:57:06 sophos [local0:info] postgres[3606]:  [1-1] LOG:  loaded library "pg_stat_statements"
2015:09:01-08:57:06 sophos [local0:info] postgres[3607]:  [2-1] LOG:  database system was interrupted; last known up at 2015-08-31 13:30:06 GMT
2015:09:01-08:57:06 sophos [local0:info] postgres[3607]:  [3-1] LOG:  database system was not properly shut down; automatic recovery in progress
2015:09:01-08:57:07 sophos [local0:info] postgres[3607]:  [4-1] LOG:  redo starts at 0/203D2C4
2015:09:01-08:57:07 sophos [local0:info] postgres[3607]:  [5-1] LOG:  record with zero length at 0/203D348
2015:09:01-08:57:07 sophos [local0:info] postgres[3607]:  [6-1] LOG:  redo done at 0/203D2F4
2015:09:01-08:57:07 sophos [local0:info] postgres[3606]:  [2-1] LOG:  database system is ready to accept connections
2015:09:01-08:57:07 sophos [local0:info] postgres[3611]:  [2-1] LOG:  autovacuum launcher started
2015:09:01-08:57:07 sophos [authpriv:info] su:  pam_unix(su:session): session closed for user postgres
2015:09:01-08:57:07 sophos [local0:err] postgres[3637]:  [3-1] FATAL:  database "reporting" does not exist
2015:09:01-08:57:07 sophos [daemon:notice] ulogd[3640]:  ulogd running
2015:09:01-08:57:07 sophos [local0:err] postgres[3643]:  [3-1] FATAL:  database "reporting" does not exist
2015:09:01-08:57:07 sophos [daemon:err] ulogd[3640]:  pg1: connect: FATAL:  database "reporting" does not exist
2015:09:01-08:57:08 sophos [local0:err] postgres[3701]:  [3-1] FATAL:  database "reporting" does not exist
2015:09:01-08:57:08 sophos [daemon:err] ulogd[3640]:  pg1: connect: FATAL:  database "reporting" does not exist
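When the log is longer than this excerpt, it can help to summarize which daemons are logging errors. The sketch below is a minimal Python example, not part of the original post. It assumes every line follows the layout seen above (`timestamp host [facility:severity] process[pid]: message`), and the names `parse_line` and `count_errors` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Layout inferred from the lines quoted above; the [pid] part is optional
# because entries such as "su:" carry no PID.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"\[(?P<facility>\w+):(?P<severity>\w+)\] "
    r"(?P<proc>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?:\s+"
    r"(?P<msg>.*)$"
)

def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    entry = m.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y:%m:%d-%H:%M:%S")
    return entry

def count_errors(lines):
    """Count lines with severity 'err', grouped by process name."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["severity"] == "err":
            counts[entry["proc"]] += 1
    return counts

# Three lines taken from the excerpt above.
SAMPLE = [
    '2015:09:01-08:57:07 sophos [local0:err] postgres[3637]:  [3-1] FATAL:  database "reporting" does not exist',
    "2015:09:01-08:57:07 sophos [daemon:notice] ulogd[3640]:  ulogd running",
    '2015:09:01-08:57:07 sophos [daemon:err] ulogd[3640]:  pg1: connect: FATAL:  database "reporting" does not exist',
]
```

Running `count_errors` over the full boot log would show at a glance whether anything other than postgres and ulogd is reporting errors after the reboot.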

Then the interface failed at 0:30, accompanied by a constant 100% CPU spike (not unlike my previous problem with Broadcom + Hyper-V, which appeared to be fixed after updating and disabling VMQ about a month back).