This discussion has been locked.

multiple rrdtool high (100%) cpu usage

Since 02:20 this morning, three separate systems that I look after for friends have shown this problem.

Each is running at 100% CPU load and the firewall is slow.

After SSH'ing in, I found that there are many (20+) instances of rrdtool running.
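
To confirm the pile-up before killing anything, a rough count of running instances can help. The sketch below is a minimal example, assuming a `ps` that accepts `-e -o comm=` (as procps does); the helper names `count_named` and `check_rrdtool` and the default threshold of 5 are illustrative, not anything shipped with the UTM.

```shell
# Count lines on stdin whose first field is exactly the given command name.
count_named() {
    awk -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}

# Print a warning when more than a threshold of rrdtool instances is running.
# "ps -e -o comm=" lists the command name of every process, without a header.
check_rrdtool() {
    limit=${1:-5}   # assumed threshold, not a value from this thread
    n=$(ps -e -o comm= | count_named rrdtool)
    if [ "$n" -gt "$limit" ]; then
        echo "rrdtool instances: $n (limit $limit)"
    fi
}
```

Running `check_rrdtool 10` over SSH prints a line only when more than 10 instances are present.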

This problem looks identical to the one in the thread "rrdtool high cpu usage".

I have tried what is suggested there, which kills the rrdtool task, but after a while the instances start again, so I have commented out the lines in /etc/crontab.rrd for now.

What is the permanent solution to this?

Jeff



  • Looks like commenting out the lines in /etc/crontab.rrd does not prevent it from running!

    So I expect it will get worse with time.

    Jeff

  • Hi Jeff,

    We are experiencing the same issue with a pair of SG450 hardware appliances, same firmware version, pattern version 222572.

    Were you able to resolve your problem? If so, how did you go about it?

    I'm currently on the phone with Sophos Support, not really getting anywhere though.

    Many thanks,

    John P

    2 x SG450 (Version 9.714-4)

    HA = Active-Passive

  • Hi John,

    The problem SEEMS to me to be precisely the same as the one I linked to, so I followed the advice given there (modified a little).

    I SSH'd in, then sudo'd to root and ran:

    killall /usr/local/bin/create_rrd_graphs.plx
    killall rrdtool

    I then edited

    /etc/crontab
    /etc/crontab.rrd

    to comment out the cron entry that restarts the process.

    If rrdtool has respawned, you might have to rerun the killall commands.

    This prevents rrdtool from running; however, graphing is then stopped.

    Remove the comment markers when you have a proper solution (and tell me!!)

    I'm surprised there aren't more people having this problem, TBH.

    Jeff
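
    The commenting-out step above can be sketched as a small shell helper that keeps a backup, so graphing is easy to restore later. This is only a sketch: `comment_out_cron` is a hypothetical name, and the pattern you pass (for example `create_rrd_graphs`) is an assumption, since the exact cron entry is not shown in this thread.

    ```shell
    # Hypothetical helper: comment out every active line in a cron file that
    # contains a literal pattern, keeping a backup copy next to the original.
    comment_out_cron() {
        file=$1      # e.g. /etc/crontab.rrd
        pattern=$2   # assumed pattern, e.g. create_rrd_graphs
        # Keep an untouched copy so the change can be reverted.
        cp -p "$file" "$file.bak" || return 1
        # Prefix '#' to lines that contain the pattern and are not already
        # comments; every other line is copied through unchanged.
        awk -v pat="$pattern" '
            $0 !~ /^[ \t]*#/ && index($0, pat) { print "#" $0; next }
            { print }
        ' "$file.bak" > "$file"
    }
    ```

    For example, as root: `comment_out_cron /etc/crontab.rrd create_rrd_graphs`. Copying the `.bak` file back over the original undoes the change.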

  • Hi Jeff,

    Many thanks for the recommendations.

    Just off the phone with a Sophos engineer who now has remote access to the appliances and is currently troubleshooting the issue.

    Will post any further developments.

    Curiously enough, our issue began at around the same time as yours, i.e. 2:20am this morning. Our appliance's CPU usually runs at around 11-15%. It gradually started climbing from around 2:20am until it reached 100% at approx. 3:35am.

    Surely, we can't be the only ones experiencing this issue.

    Cheers,

    John P


  • Not sure what's happening, but I'm getting this on my Sophos Home UTM also. CPU went through the roof starting Saturday morning.

  • And so it begins!

    I wondered whether it was a smoulderer; it's not very obvious until it's a problem!

    Jeff

  • Struggling to understand why this is an issue, especially with no recent firmware updates.

    Long shot, but any chance this is related to daylight saving time? It started right at 2 AM from the logs I can see...

  • Hmm, that's 5 UTMs so far then!

    I didn't notice until about 09:00, when I basically lost my internet connection, due to the load I assume.

    Mine was like yours: load increased to 100% at 02:20, and by 09:00 I had many rrdtool instances running.

    The 3 UTMs I have the problem with are VMs, so your problem proves it's not only a VM problem.

    I'm guessing it's a DST problem, but only by correlation!

    Hope you get a fix soon!

    Jeff

  • Funny, I just said that in my reply to isooffice!

    Jeff

  • Same issue here on my Home UTM. It hit 100% CPU at 02:10 for me. A friend's UTM also has the same issue, with theirs going to 100% at 03:35. I also didn't get any daily reports via email...

    I have just managed to reboot mine and CPU usage is back down to normal levels...for now...