Clients get auto-Isolated after boot - randomized

Hello Everyone,

some of my clinets - have a clean SOPHOS Endpoint installed and everything works well.
But sometimes (once every week) a client - different clients - gets auto-isolated because one or more SOPHOS services could not be started. Because of this it gets isolated and I have to run the diagnostigs trough central and some minutes later the client works well again.

One week - or some days later - another client thinks he has to do the same procedure, too. All of the clients run every day. So I donÄt think it's an update issue because of installing a new version or updates...

All Computers have at least Intel i3 / 4 GB RAM / Win 7. Other clients in the same site do not act like this.

 

Does anybody has ideas?

  • Hi  

    Computers auto isolate when they recieve red health status if the feature is enabled under threat protection policy. This policy is not enabled by default. As you see, one of the services could not be started, could you please provide more details about which services are stopped and gets restarted again. Do you see any errors under event logs on the machine? 

  • In reply to Shweta:

    Unfortunately neither in the eventlog of any of the machines something appears which concerns to this error in my eyes - nor is it a software / windows error. I'm trying to get a eventlog history of this machine which caused the error yesterday.

    Auto isolation was activated deliberately - we do not have any problems with this policy since month - except on this site.

    Additional to this it is not everytime the same service which won't start.

    This week two clients got isolated. One had not started the " Sophos Network Threat Protection" service, the other one was "Sophos Web Intelligence Service". If there were other services which caused the isolation in the weeks before could be possible - but I do not remember all of them excelt "Sophos Device Control Service" which hang once, too.

    all clients are basically the same: one manufaccturer, same model, same hardware. all were made with the same OS image - as the other computers in this site, running SOPHOS without problems.

  • In reply to Kevin Waberski:

    Are these services just timing out starting, i.e you don't take any action to fix them other than manually starting them?

    Or are they broken/missing and you need to take action?


    The Health service is responsible for reporting on the state of the client.  It has a trail of messages under here:
    C:\ProgramData\Sophos\Health\Event Store\Trail\
    The log under:
    C:\ProgramData\Sophos\Health\Logs\health.log
    can also be used.

    Regards,

    Jak

  • In reply to jak:

    Because the mentioned clients are loacted on a physically different location I cannot access them immediately. And because of the isolation I am not able to connect using Teamviewer or something else.
    I run a diagnostic scan on the computer using the Diag button in central, 5mins later the client is back online. The services seem to be there but not startet or they are started lately.

     Tho Log Files look like this:

     JSON Files in the Health Event Store Folder:

     {"id":"931cfdc5-b794-4973-b823-23856fe3bdf4","familyId":"b5bcfbed-e593-40f5-8556-4633a3d7344b","timeStamp":"2019-09-26T05:05:06Z","app":"SHS","sequence":"0","severity":3,"serviceName":"Sophos Device Control Service","resourceId":"health.service.stopped","showNotification":true,"updateSummary":true}

    ---

    {"app" : "MTD", "familyId" : "f1da68ff-b0d2-4e49-ac56-7af56de860c9", "id" : "f8d83158-bbf6-4470-980e-bda9da22306a", "resourceId" : "endpoint.isolation.started", "sequence" : "0", "severity" : 2, "showNotification" : true, "timeStamp" : "2019-09-26T05:05:06Z", "updateSummary" : true}

    ---

    And the Health.log in the other directory (tailed for the mentioned day)

     

    2019-09-26T05:00:32.314Z [ 2744] INFO WinMain ----------------------------------------------------------------------------------------------------
    2019-09-26T05:00:32.345Z [ 2744] INFO WinMain Starting version 2.1.0.33 of the Sophos Health Service.
    2019-09-26T05:00:32.345Z [ 2744] INFO WinMain ----------------------------------------------------------------------------------------------------
    2019-09-26T05:00:32.564Z [ 2900] INFO ServiceCheck::Run SAU Policy Features have changed: APPCNTRL AV CLEAN CORE DLP DVCCNTRL EFW HBT NTP SAV SDU WEBCNTRL XPD
    2019-09-26T05:00:32.564Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:00:47.696Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:01:02.859Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:01:18.209Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:01:34.121Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:01:48.650Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:01:49.165Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:01:49.289Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:01:52.253Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:01:52.253Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:01:52.768Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:01:53.267Z [ 5052] INFO NamedPipeServer::ClientConnectionWorker Disconnecting client from pipe, as client has exited
    2019-09-26T05:01:53.267Z [ 5048] INFO NamedPipeServer::ClientConnectionWorker Disconnecting client from pipe, as client has exited
    2019-09-26T05:01:53.782Z [ 5056] INFO NamedPipeServer::ClientConnectionWorker Disconnecting client from pipe, as client has exited
    2019-09-26T05:02:04.328Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:02:19.600Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:02:34.779Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:02:49.802Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:03:04.903Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:03:20.047Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:03:36.181Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:03:51.242Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:04:06.403Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:04:21.441Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:04:36.511Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:04:51.693Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:05:06.806Z [ 2900] INFO EventPublisher::PostServiceEvent Posting service stopped event: b5bcfbed-e593-40f5-8556-4633a3d7344b Sophos Device Control Service
    2019-09-26T05:05:07.055Z [ 2892] INFO DatabaseAccessor::AddEvent Processing event id: 931cfdc5-b794-4973-b823-23856fe3bdf4
    2019-09-26T05:05:07.079Z [ 2892] INFO DatabaseAccessor::UpdateHealthCategories Health state has changed to - Overall: 3, Service: 3, Threat: 0
    2019-09-26T05:05:07.090Z [ 2892] INFO DatabaseAccessor::AddEvent Processing event id: f8d83158-bbf6-4470-980e-bda9da22306a
    2019-09-26T05:05:07.092Z [ 2892] INFO DatabaseAccessor::UpdateHealthCategories Health state has changed to - Overall: 3, Service: 3, Threat: 0
    2019-09-26T05:05:52.417Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:06:07.493Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:06:22.652Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:06:38.526Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:06:53.629Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:07:09.579Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:07:24.700Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:07:40.002Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:07:55.071Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:08:10.335Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:08:25.491Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:08:40.698Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:08:55.895Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:09:11.067Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:09:26.160Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:09:28.026Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:09:28.526Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:09:28.529Z [ 5480] INFO NamedPipeServer::ClientConnectionWorker Disconnecting client from pipe, as client has exited
    2019-09-26T05:09:28.535Z [ 2896] INFO NamedPipeServer::AcceptWorker Client has connected to pipe
    2019-09-26T05:09:28.544Z [ 5188] INFO NamedPipeServer::ClientConnectionWorker Disconnecting client from pipe, as client has exited
    2019-09-26T05:09:29.027Z [ 7008] INFO NamedPipeServer::ClientConnectionWorker Disconnecting client from pipe, as client has exited
    2019-09-26T05:09:41.534Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:09:56.801Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:10:12.332Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:10:28.136Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:10:43.235Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:10:58.742Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVService' returned: 1060
    2019-09-26T05:10:58.742Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVAdminService' returned: 1060
    2019-09-26T05:10:58.743Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'swi_service' returned: 1060
    2019-09-26T05:10:58.745Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:11:16.258Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVService' returned: 1060
    2019-09-26T05:11:16.258Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVAdminService' returned: 1060
    2019-09-26T05:11:16.258Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'swi_service' returned: 1060
    2019-09-26T05:11:16.260Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:11:31.454Z [ 2900] INFO ServiceHelper::GetServiceStatus No longer failing to open service: SAVService
    2019-09-26T05:11:31.454Z [ 2900] INFO ServiceHelper::GetServiceStatus No longer failing to open service: SAVAdminService
    2019-09-26T05:11:31.455Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'swi_service' returned: 1060
    2019-09-26T05:11:31.456Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:11:46.628Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'swi_service' returned: 1060
    2019-09-26T05:11:46.630Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period

  • In reply to Kevin Waberski:

    From the Health log:

    2019-09-26T05:00:32.314Z [ 2744] INFO WinMain ----------------------------------------------------------------------------------------------------
    2019-09-26T05:00:32.345Z [ 2744] INFO WinMain Starting version 2.1.0.33 of the Sophos Health Service.
    2019-09-26T05:00:32.345Z [ 2744] INFO WinMain ----------------------------------------------------------------------------------------------------

    So at this time, I assume this is the computer starting up and the Sophos Health Service starting.  You would also get this if Sophos Health updated I guess or any reason what would cause this service to be restated but given the subsequent messages about 'during startup grace period', it seems safe to say the computer started up at this time.

    2019-09-26T05:04:51.693Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during startup grace period
    2019-09-26T05:05:06.806Z [ 2900] INFO EventPublisher::PostServiceEvent Posting service stopped event: b5bcfbed-e593-40f5-8556-4633a3d7344b Sophos Device Control Service
    2019-09-26T05:05:07.055Z [ 2892] INFO DatabaseAccessor::AddEvent Processing event id: 931cfdc5-b794-4973-b823-23856fe3bdf4
    2019-09-26T05:05:07.079Z [ 2892] INFO DatabaseAccessor::UpdateHealthCategories Health state has changed to - Overall: 3, Service: 3, Threat: 0

    so after a 5 minute startup update grace period, the health service notices that the "Sophos Device Control Service" is stopped and posts the event, which switches it to red health.

    Then we switch to this:

    2019-09-26T05:10:43.235Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:10:58.742Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVService' returned: 1060
    2019-09-26T05:10:58.742Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVAdminService' returned: 1060
    2019-09-26T05:10:58.743Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'swi_service' returned: 1060
    2019-09-26T05:10:58.745Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period
    2019-09-26T05:11:16.258Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVService' returned: 1060
    2019-09-26T05:11:16.258Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'SAVAdminService' returned: 1060
    2019-09-26T05:11:16.258Z [ 2900] WARN ServiceHelper::GetServiceStatus OpenService() for 'swi_service' returned: 1060
    2019-09-26T05:11:16.260Z [ 2900] INFO ServiceCheckLogic::CalculateResult Ignored service check results: during update grace period

    This suggests that at least some of the Sophos Anti-Virus components services are now missing.

    If you look under \windows\temp\ at the installation logs of SAV, is it somehow uninstalling/reinstalling.  Seems odd these services would disappear.

    Regards,

    Jak

     

     

  • In reply to jak:

    Hello jak,

     

    thanks for your effort helping me solfe my strange case.

    Ten Minutes ago another client from this site auto isolted itself. This time it was one which never did this before. Actually another service (Sopho Auto Updater) could not be started. The user has some technical skills and was able to list me all of the services at the phone - and the service which caused the error is in the list. But not started.

    So I ran a diagnostics in central for this client, 2 minutes later it was back online. So I checked the OS-Temp folder. There were only 2 empty log files from today, no SOPHOS Log from today (just from yesterday). Centrals events for this client do not show an update or something yesterday afternoon/evening.

    I am happy about all suggestions and helpful advices...

    Thans & Regards

    Kevin

     

      

     

  • In reply to Kevin Waberski:

    Are we saying here that the computer started up and the AutoUpdate service failed to start but could be manually started without issue?

    I think we need to understand if these services are simply timing out at startup?  In the case above, had the computer just started again?

    Is there anything in the System event log to suggest that the service timed out starting?

    Also, under:  "C:\ProgramData\Sophos\AutoUpdate\Logs\", there should be both the log from the service (susvc.log) and the log from the SophosUpdate.exe process the service launches to perform the updates. Was there an error in the service log or anything at all for this service when it failed to start?

    If on these computers services are timing out starting, and 30 seconds is the default service timeout, would it help to set the ServicesPipeTimeout registry key and make the timeout 60 seconds?  Would this solve the problem?  I guess you'd need to check the Windows System event log to see if you see references to the services timing out.

    Regards,
    Jak