Yesterday afternoon, around 1000 GMT, we noticed that a lot of our monitored Astaros were recording 100% utilisation. The problem continued and appeared to be related to the ACC agent. Restarting the agent provided a temporary fix; however, the problem recurred within a few minutes.
We have disabled the ACC management script on 35+ machines and now have no ACC management. Whilst it was nice having the ACC provide monitoring, it was not good getting complaints from our clients about the degradation of their services caused by our value-added service.
I am comfortable that our troubleshooting correctly identified the cause of the problem. It would be good to find out what triggered it and whether it was the result of an auto update. If it was, was sufficient testing carried out? How can we be sure we will not be caught again? I seem to recollect that this is the second problem we have had with ACC management.