High CPU Usage - SEDService.exe offline

Hi, I have an annoying problem with the Sophos Endpoint Agent. When I am connected to the internet everything is fine. However, when I unplug the cable and go offline, the CPU load of SEDService.exe goes way up. I have now noticed that under C:\ProgramData\Sophos\Endpoint Defense\Data\Event Journals\SophosED\Dns, .bin files of 100 MB each are created continuously and then compressed into .xz files. This costs a lot of performance and is certainly not the way it should be. Does anyone know this problem or have an idea which setting causes it? As soon as the internet is available again, the CPU usage of the process goes down and no more files are created in that path.

There are various blocking entries in the sed log. Do they have anything to do with this?
What could it be?



This thread was automatically locked due to age.
  • SEDService.exe performs 2 operations here that are related to processing the events.

    1. It creates a trace session and subscribes to the "Microsoft-Windows-DNS-Client" trace provider, so this is the source of most of the DNS events. I say most because it also gets some DNS events from the "Microsoft-Windows-WinINet" trace provider, but I assume most come from the DNS-Client provider.

    These are stored in the journal files you found and written to by the SOPHOSED.SYS driver. 

    The .bin file is the current one. It has a maximum size of 100 MB; a new .bin file is created when that size is reached, or every hour.

    2. Every minute, SEDService.exe checks whether it needs to compress the current .bin file(s) to .xz files for each subject, in this case DNS. There are others, e.g. Registry, Process, etc. Typically there is only one active .bin file per subject unless you have a lot of events for that subject.

    SedService.exe typically doesn't perform work every minute as sophosed.sys typically only flushes new journal data from memory to the .bin file every 5 mins.  However, if there are a lot of events, then sophosed.sys will flush more often to avoid using too much memory.

    So typically you see a spike of work by SEDService.exe every 5 mins, which takes place at most 1 minute after sophosed.sys flushes the latest data.

    So you could do the following:

    1. Create your own trace session and add the DNS client provider to it. Just log to a file, and after a minute of collection see what you have.

    2. Turn on debug logging of SEDService.exe to get it to log the DNS addresses being recorded.

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Sophos Endpoint Defense Service]
    "DebugFacilities"=dword:00200000
    "DebugLevel"=dword:00000001

    Make a note of the current values to restore.  Tamper needs to be disabled. SEDService.exe picks up the reg changes automatically. 

    You can then tail the logs for the DNS messages, DNS1 and DNS2. One is from the WinINet source, the other is from the DNS-Client provider.

    gc 'C:\ProgramData\Sophos\Endpoint Defense\Logs\seds.log' -wait -tail 1 | Select-String "Debug DNS"

    I hope this helps.  For DNS to be creating so many events, it seems like something isn't functioning correctly.

    HTH..
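To put a number on how quickly the Dns journal folder fills up while offline (the 100 MB .bin rotation and .xz compression described in this reply), a rough sketch like the following could be run a few minutes apart and the results compared. The folder path is the one from the original post; the function name `summarize_journals` is made up for illustration, and reading the folder may need an elevated prompt.

```python
from collections import defaultdict
from pathlib import Path

# Folder quoted in the original post; adjust if your installation differs.
DNS_JOURNAL_DIR = Path(
    r"C:\ProgramData\Sophos\Endpoint Defense\Data\Event Journals\SophosED\Dns"
)


def summarize_journals(directory):
    """Return {suffix: (file_count, total_bytes)} for .bin and .xz files."""
    totals = defaultdict(lambda: [0, 0])
    for entry in Path(directory).iterdir():
        suffix = entry.suffix.lower()
        if entry.is_file() and suffix in (".bin", ".xz"):
            totals[suffix][0] += 1
            totals[suffix][1] += entry.stat().st_size
    return {suffix: tuple(pair) for suffix, pair in totals.items()}


if __name__ == "__main__":
    if DNS_JOURNAL_DIR.is_dir():
        summary = summarize_journals(DNS_JOURNAL_DIR)
        for suffix, (count, size) in sorted(summary.items()):
            print(f"{suffix}: {count} files, {size / 2**20:.1f} MiB")
    else:
        print(f"Directory not found: {DNS_JOURNAL_DIR}")
```

If the .bin count or the total size climbs noticeably between two runs taken a few minutes apart, the DNS subject is receiving far more events than usual.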

  •   Thanks for the instructions.

    Here is my log. It is the same every time. I translated it from German. My DNS servers on the LAN/WLAN are the DCs dc03.mydomain.local and dc04.mydomain.local.
    Within one minute this log grows to an extremely large number of entries.

    A DNS query is called up for the name "dc04.mydomain.local". Type: 28, query options: 1073897472, server list: , network query: 0, network index: 0, interface index: 0, asynchronous query: 0
    A network query is initiated for the name "dc04.mydomain.local". Parallel query: 0, network index: 0, number of interfaces: 0, first interface name: NULL, local addresses: , DNS server: 
    A cache lookup was called for the name "dc04.mydomain.local" (type: 1, options: 2251800887582720, interface index: 0).
    The cache lookup for the name "dc04.mydomain.local" (type: 1, option: 2251800887582720) returned "9701". Results: 
    A cache lookup was called for the name "dc04.mydomain.local" (type: 28, options: 2251800887582720, interface index: 0).
    The cache lookup for the name "dc04.mydomain.local" (type: 28, option: 2251800887582720) returned "9701". Results: 
    The DNS query for the name "dc04.mydomain.local" is complete. Type: 28, query options: 2251800887582720, status: 9852. Results: 
    A DNS query is called for the name "dc04.mydomain.local". Type: 1, Query options: 1073766400, Server list: , network query: 0, network index: 0, interface index: 0, asynchronous query: 0
    The DNS query for the name "dc04.mydomain.local" is complete. Type: 1, Query options: 1073766400, Status: 87, Results: 

    "For DNS to be creating so many events, it seems like something isn't functioning correctly."
    Yes, absolutely. But what exactly? It cannot resolve the name dc04.mydomain.local because it is offline. But then why does it keep trying? And where does the DNS call come from anyway? Can I set the DNS servers somewhere in Sophos?
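The numeric codes in the log above can be decoded, which already hints at the cause. Type 1 is an A record and type 28 is an AAAA record; status 9852 is DNS_ERROR_NO_DNS_SERVERS, 9701 is DNS_ERROR_RECORD_DOES_NOT_EXIST and 87 is ERROR_INVALID_PARAMETER. A minimal sketch for tallying the "query complete" lines follows. Note the assumption: the regular expression matches the translated wording exactly as posted here, not the original event text, so it would need adjusting for a real export. The name `tally_completed_queries` is hypothetical.

```python
import re
from collections import Counter

# Standard Windows error codes that appear in the posted log.
STATUS_NAMES = {
    87: "ERROR_INVALID_PARAMETER",
    9701: "DNS_ERROR_RECORD_DOES_NOT_EXIST",
    9852: "DNS_ERROR_NO_DNS_SERVERS",
}
# DNS record type numbers that appear in the posted log.
RECORD_TYPES = {1: "A", 28: "AAAA"}

# Assumption: matches the poster's English translation of the event text.
COMPLETED = re.compile(
    r'The DNS query for the name "(?P<name>[^"]+)" is complete\. '
    r"Type: (?P<type>\d+), query options: \d+, status: (?P<status>\d+)",
    re.IGNORECASE,
)


def tally_completed_queries(lines):
    """Count completed queries per (name, record type, status)."""
    counts = Counter()
    for line in lines:
        match = COMPLETED.search(line)
        if match is None:
            continue  # not a "query complete" line
        rtype = int(match["type"])
        status = int(match["status"])
        counts[
            (
                match["name"],
                RECORD_TYPES.get(rtype, str(rtype)),
                STATUS_NAMES.get(status, str(status)),
            )
        ] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'The DNS query for the name "dc04.mydomain.local" is complete. '
        "Type: 28, query options: 2251800887582720, status: 9852. Results: ",
        'The DNS query for the name "dc04.mydomain.local" is complete. '
        "Type: 1, Query options: 1073766400, Status: 87, Results: ",
    ]
    for key, count in tally_completed_queries(sample).items():
        print(count, *key)
```

A tally dominated by one name with DNS_ERROR_NO_DNS_SERVERS points at a single caller retrying a lookup that cannot succeed while offline.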

  • Let's take Event ID 3006 from the Event Viewer (Microsoft-Windows-DNS-Client/Operational) as a starting point. In the details of the event log entry, switch to the XML view and look for the Execution element and its ProcessID attribute, e.g.:

    <Execution ProcessID="..." ThreadID="..." />

    If you check Task Manager, is that ProcessID running? What is it? If it's not running, it may be transient; run Process Monitor for a while to see the history of processes, or use another source.

    The Sophos Event journals will have the info but this is easier to extract I suspect.
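If there are many events to go through, the ProcessID can also be pulled out of the copied event XML with a few lines of script. This is only a sketch: the sample XML is a trimmed, illustrative event whose ProcessID and ThreadID values are made up, and `execution_ids` is a hypothetical helper name. The namespace URI is the standard one used by Windows event XML.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (as shown in Event Viewer's XML view).
EVENT_NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Illustrative, trimmed event; the ProcessID/ThreadID values are invented.
SAMPLE_EVENT_XML = """\
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-DNS-Client" />
    <EventID>3006</EventID>
    <Execution ProcessID="1234" ThreadID="5678" />
  </System>
</Event>"""


def execution_ids(event_xml):
    """Return (process_id, thread_id) from an event's XML, or None."""
    root = ET.fromstring(event_xml)
    execution = root.find("e:System/e:Execution", EVENT_NS)
    if execution is None:
        return None
    return int(execution.get("ProcessID")), int(execution.get("ThreadID"))


if __name__ == "__main__":
    print(execution_ids(SAMPLE_EVENT_XML))
```

The resulting process ID can then be looked up in Task Manager or Process Monitor as described above.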

  • Okay, so it's ProcessID 1524, Microsoft-Windows-DNS-Client.

    Buffer Overflow?

  • Okay wow. The whole problem is really exciting, but we have now found the solution.
    Many thanks to you for your help.
    Of course you also want to know what the problem was.
    It was the Windows time service W32Time which, in conjunction with DNS queries and Sophos, led to this load.
    The time server and its settings are pushed to the clients via a GPO, under the registry key Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\W32time\TimeProviders\NtpClient.
    The ResolvePeerBackoffMinutes value there was unfortunately set to 0.
    ResolvePeerBackoffMinutes: This value, specified in minutes, controls how long W32Time waits before it tries to resolve a DNS name again after a failed attempt. The default value is 15 minutes.
    With a value of 0, DNS queries were restarted over and over without any pause, which caused the problem.
    We set the value back to the default of 15 minutes and everything is fine.
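Why a value of 0 is so damaging can be illustrated with a small model. This is a simplified sketch, not W32Time's actual code: it assumes the wait starts at ResolvePeerBackoffMinutes and doubles after each failure up to a limited number of doublings (the kind of limit the companion setting ResolvePeerBackoffMaxTimes describes). The function name `retry_waits` and the value 7 used for the doubling limit are illustrative assumptions.

```python
def retry_waits(initial_minutes, max_doublings, attempts):
    """Model the wait (in minutes) before each of `attempts` retries.

    Simplified assumption: the wait starts at `initial_minutes` and
    doubles after each failed attempt, at most `max_doublings` times.
    """
    waits = []
    wait = initial_minutes
    for attempt in range(attempts):
        waits.append(wait)
        if attempt < max_doublings:
            wait *= 2  # doubling 0 still gives 0
    return waits


if __name__ == "__main__":
    print("backoff 15:", retry_waits(15, 7, 5))
    print("backoff 0: ", retry_waits(0, 7, 5))
```

Under this model a 15-minute start spaces the retries further and further apart, while a start of 0 never grows, so every failed lookup is followed immediately by the next one.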

  • thanks for posting your root cause!

    so in this case the load seen from Sophos was just by it logging a ton of DNS requests by w32time to the event journals.

  • Good to know. Thanks for taking the time to provide the details. Hopefully the computers are running smoother now 🙂

  • Yes, they are running smoother ☺