This discussion has been locked.

multiple rrdtool high (100%) cpu usage

Since 02:20 this morning, 3 separate systems I look after for friends have shown this problem.

Each is running at 100% CPU load and the FW is slow

After SSH'ing in I found that there are many (20+) instances of rrdtool running.
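For anyone who wants to confirm the same symptom before killing anything, here is a rough sketch of counting the instances from the shell. The helper name `count_procs` is made up for illustration, and it assumes `ps` and `grep` behave as they do on a typical Linux box; I have not checked this on every UTM firmware.

```shell
#!/bin/sh
# Count running processes whose command name exactly matches $1.
# Prints 0 when none are found.
# NOTE: count_procs is a hypothetical helper, not something shipped on the UTM.
count_procs() {
    # "ps -eo comm=" lists one command name per process, without a header.
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e".
    ps -eo comm= | grep -c -x -F -- "$1" || true
}

# Example: report how many rrdtool instances are running right now.
echo "rrdtool instances: $(count_procs rrdtool)"
```

A count that keeps climbing between runs would match what is described above.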

This problem looks identical to the one in the thread "rrdtool high cpu usage".

I have tried what is suggested there, which kills the rrdtool task, but after a while the instances start again, so I have commented out the lines in /etc/crontab.rrd for now.

What is the permanent solution to this?

Jeff



  • Hi Jeff,

    We are experiencing the same issue with a pair of SG450 Hardware Appliances, same Firmware Version, Pattern Version 222572.

    Were you able to resolve your problem? If so, how did you go about it?

    I'm currently on the phone with Sophos Support, not really getting anywhere though.

    Many thanks,

    John P

    2 x SG450 (Version 9.714-4)

    HA = Active-Passive

  • Hi John,

    The problem seems to me to be precisely the same as the one I linked to, so I followed the advice given there (modified a little).

    I SSH'd in, then sudo'd to root:

    killall /usr/local/bin/create_rrd_graphs.plx
    killall rrdtool

    I then edited

    /etc/crontab
    /etc/crontab.rrd

    to comment out the cron entry that restarts the process.

    If rrdtool has respawned you might have to re-run the killall commands.

    This prevents rrdtool from running; however, the graphing is then stopped.

    Remove the commenting when you have a proper solution (and tell me!!)
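The comment-out and later restore steps can be sketched roughly as below. This is only an illustration under assumptions: the function names are made up, the pattern `create_rrd_graphs` is a guess at what the cron entry contains, and the schedule in the demo line is invented. Check the real contents of /etc/crontab and /etc/crontab.rrd before trying anything like it.

```shell
#!/bin/sh
# Comment out every active line in FILE that matches PATTERN, keeping a
# backup copy as FILE.bak. PATTERN is a plain regex without slashes.
# NOTE: both function names are hypothetical helpers for this sketch.
disable_cron_entries() {
    file=$1
    pattern=$2
    # Lines that already start with '#' are left alone.
    sed -i.bak "/$pattern/ s/^[^#]/#&/" "$file"
}

# Undo disable_cron_entries by putting the backup copy back.
restore_cron_entries() {
    file=$1
    mv "$file.bak" "$file"
}

# Demo on a scratch file, NOT on the live crontab.
# The schedule fields below are invented for the example.
demo=$(mktemp)
printf '%s\n' \
    '*/5 * * * * root /usr/local/bin/create_rrd_graphs.plx' \
    '0 3 * * * root /bin/true' > "$demo"
disable_cron_entries "$demo" create_rrd_graphs
cat "$demo"
rm -f "$demo" "$demo.bak"
```

Keeping the .bak copy means the original entries can be put back in one step once there is a proper fix.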

    I'm surprised there aren't more people having this problem, TBH.

    Jeff

  • Hi Jeff,

    Many thanks for the recommendations.

    Just off the phone with a Sophos engineer who now has remote access to the appliances and is currently troubleshooting the issue.

    Will post any further developments.

    Curiously enough, our issue began at around the same time as yours, i.e. 2:20am this morning. Our appliance's CPU usually runs around 11-15%. Gradually started going up from around 2:20am until it reached 100% at approx. 3:35am.

    Surely, we can't be the only ones experiencing this issue.

    Cheers,

    John P

    2 x SG450 (Version 9.714-4)

    HA = Active-Passive

  • Hmm, that's 5 UTMs so far then!

    I didn't notice until about 09:00, when I basically lost my internet connection, due to the load I assume.

    Mine, like yours, saw the load increase to 100% from 02:20, and by 09:00 I had many rrdtool instances running.

    The 3 UTMs I have the problem with are VMs, so your problem proves it's not only a VM problem.

    I'm guessing a DST problem, but only by correlation!

    Hope you get a fix soon!

    Jeff

  • Thanks for this. I've just followed these steps and CPU utilisation has returned back to normal.