New Account Health Check feature

11 Sep 2022

We're working on a new feature to help identify when Central accounts are configured in ways that reduce security, so that admins can take action to improve their protection.

Feedback is appreciated at any stage; you can comment on this blog post.

Update January 2022: the first stage of this project (checking for protection installed) is now live, and you can see it in your Central account.

From now until the middle of 2022, this will be focussed on endpoint and server protection, and specifically:

Software installs: whether all of the right software is installed
Threat protection policy settings: whether this is configured for maximum protection
Exclusions: check for common exclusions known to create significant security risks

The options assessed are all present in the product for valid purposes, such as addressing performance issues or false positives. However, in many cases the trade-off made may not have been fully understood, or the original reason may no longer be valid. Reverting them could be a significant change to your environment and so needs admin involvement. This is why we aren't simply removing the options that can cause issues, or automatically changing settings.

In some cases, the changes suggested by the checks will not be appropriate, but in most cases we expect this will be a useful way to identify issues admins aren't aware of.

We will be gradually building this feature, releasing checks and enhancements as they are ready. It will start off as a small notification check and evolve into a wider range of assessments with associated abilities, such as a one-click "fix automatically" action should the admin agree the issue needs addressing straight away.

The first stage, labelled "beta" to clarify it is still under development, will have a check for whether all protection software is installed. It will be visible for all accounts with endpoint and/or server protection licenses: no opt in required. There will be an entry in the left navigation menu to provide access to the dedicated page.

A common scenario where this is useful is when a customer has upgraded their license to include better protection, but has not realised, or has forgotten, that additional software needs to be deployed to put the enhanced license fully into effect. For example, they were originally licensed for "Endpoint Protection" and so had installed "Endpoint Protection" as the only available component. Later, they upgraded their license to "Intercept X Advanced", but did not add the Intercept software component to all their endpoints.

Although correcting this is a simple process involving just a few clicks in the admin UI, if missed it can remain a security gap indefinitely. Enabling, for example, the anti-ransomware protection in policy will not have an effect if endpoints are not running the "Intercept" software component the setting applies to. Flagging this issue up will let admins know they need to take action. Admins can decide whether to roll the extra software out to all devices straight away, or whether to stage it, such as testing on some devices first and then rolling out in batches of machines afterwards.

After releasing the software check, we will add further checks and capabilities gradually over the next several months. The next check will be the threat protection policy settings, which we estimate will be visible by the end of February 2022. Later releases will follow as soon as they are ready, and we will remove the "beta" label once the key capabilities are all present, likely midway through 2022.

Below is more information on the functionality we expect to deliver.

Protection software check

This will check whether all protection related software is installed on all devices. "Protection related" means it doesn't check whether the encryption management software is installed, because that is often not licensed for all devices. For example, a customer might have the protection software installed on all 1,000 desktops and laptops, but have the encryption management software licensed and installed on only 200 laptops. We clearly don't want to warn that 800 computers don't have the encryption management software installed on them if they aren't licensed for it.

Historically protection licensing and installing was more complex, but there are only 2 states accounts can now be in, depending on whether MTR is licensed or not. All devices should have the "endpoint" and "intercept" components installed, and so there will be a warning if either or both are missing. Note: there are some operating system specific compatibility limitations; we won't warn of issues where software can't be installed, e.g. a Linux server unable to run the Intercept component. The intention is to only warn where there is an action to take! Accounts that are licensed for MTR have an additional software component to install; this will be checked for only when MTR is licensed.

As an example:

A customer is licensed for "Intercept X Advanced" and "Device Encryption"
They have only installed the legacy "Endpoint Protection" software (and no encryption)
They will initially see a software warning due to the missing protection software
If they assign the "Intercept X Advanced" software (this is the same as when it was previously shown as having the "Endpoint" and "Intercept" components), they pass the software check: the absence of the encryption software does not cause the check to fail.
If they then upgrade the license to MTR (e.g. "Intercept X Advanced with XDR and MTR Advanced"), they will fail the software check again, until they assign the new software (MTR). The software install process will show that the software being installed now is different (matching the new license), and will add the MTR components to the already present components (Endpoint and Intercept).

After this, the software check will return to a green pass again. The absence of encryption will still not cause the check to fail.
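
As an illustration only (this is not the product's implementation), the rule and the example above could be sketched in Python as follows. The component names follow this post; the Device class, its "supported" field and the function names are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field

# Component names follow the post; the data model itself is hypothetical.
ENDPOINT, INTERCEPT, MTR, ENCRYPTION = "endpoint", "intercept", "mtr", "encryption"


@dataclass
class Device:
    name: str
    installed: set = field(default_factory=set)
    # Components this device's operating system is able to run at all.
    supported: set = field(
        default_factory=lambda: {ENDPOINT, INTERCEPT, MTR, ENCRYPTION}
    )


def required_components(mtr_licensed: bool) -> set:
    """Protection components every device should have; encryption is never required."""
    required = {ENDPOINT, INTERCEPT}
    if mtr_licensed:
        required.add(MTR)
    return required


def missing_components(device: Device, mtr_licensed: bool) -> set:
    """Components that are required, installable on this device, and not installed."""
    return (required_components(mtr_licensed) & device.supported) - device.installed


def software_check_passes(devices, mtr_licensed: bool) -> bool:
    """The check passes only when no device has an actionable gap."""
    return all(not missing_components(d, mtr_licensed) for d in devices)


# Walking through the example: legacy software only, then Intercept, then MTR.
laptop = Device("laptop-1", installed={ENDPOINT})
step1 = software_check_passes([laptop], mtr_licensed=False)  # False: Intercept missing
laptop.installed.add(INTERCEPT)
step2 = software_check_passes([laptop], mtr_licensed=False)  # True: encryption ignored
step3 = software_check_passes([laptop], mtr_licensed=True)   # False: MTR now required
laptop.installed.add(MTR)
step4 = software_check_passes([laptop], mtr_licensed=True)   # True
```

Intersecting the required set with what the device supports is one way to express "only warn where there is an action to take": a device that cannot run a component never counts as missing it.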

The software checks will be split into separate computer and server assessments. This allows for customers who have only licensed protection for one or the other, or where RBAC (role based admin control) is in use, for example where a given admin is only able to see computer information, but not server information.

An example of the check in the first release is below:

Clicking on "1 of 9 endpoints" will link the admin directly to the computer list view, pre-filtered for "Computers without all protection". The admin would only see the 1 device (out of the 9 total computers in their account) that is missing some or all protection software (e.g. if it only had the encryption software installed, and no protection software).

The device list view has the "Manage Endpoint Software" button to easily add missing software to relevant devices and resolve the issue.

Note that the software check will have more functionality later in this project, such as export and "Fix all automatically" actions. These are discussed below and will be added when ready. An early mockup of how it might look by mid-2022 is below:

Threat Policy check

This will check the configuration of each threat protection policy in the account to see if the settings match Sophos recommendations. Anything that doesn't match Sophos recommendations will be reducing security and so must be carefully weighed against any benefit it is providing.

If you're uncertain why you have configured a setting contrary to Sophos recommendations then careful testing should be used to establish if it can be changed. Settings may be the way they are for a variety of reasons: historical issues that no longer occur (e.g. a long since fixed bug), a default that has changed over time, or a change made to control license usage under older licensing arrangements that aren't relevant any more. Ideally any disparities with Sophos recommendations would either be changed to comply or there would be a clearly documented reason for the change, its scope and timeline, for example "Setting X needs to be disabled for computer group Y owing to false positives. It can be re-enabled once the issue in Sophos Support case Z has been confirmed as fixed". We will be adding a "defer" action and comments on checks (see below).

Note: exclusions will be checked separately (see below); this check is about the fixed configuration options in the policy.
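
To make the idea concrete, comparing a policy against a recommended baseline could look roughly like the Python sketch below. This is a simplified illustration, not the actual check: the three setting names are borrowed from the audit log example later in this post, while the dictionary keys, the baseline values and the function name are assumptions.

```python
# Hypothetical recommended baseline. The setting names come from the audit log
# example in this post; the keys and values here are illustrative only.
RECOMMENDED = {
    "deep_learning": True,
    "anti_ransomware": True,
    "live_protection": True,
}


def policy_deviations(policy: dict) -> dict:
    """Return {setting: (current, recommended)} for each setting that differs."""
    return {
        name: (policy.get(name), wanted)
        for name, wanted in RECOMMENDED.items()
        if policy.get(name) != wanted
    }


policy_x = {"deep_learning": False, "anti_ransomware": True, "live_protection": True}
deviations = policy_deviations(policy_x)  # {"deep_learning": (False, True)}
```

An empty result would correspond to a passing check for that policy; anything else is a list of settings to review or document.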

Exclusions check

This will look for specific exclusions that commonly cause a significant protection reduction. Any exclusion is a reduction in protection and should be weighed carefully against the benefit it provides. Exclusions that aren't flagged up by these checks should still be assessed to see if they are necessary, and even if they are then care should be taken to make them as specific as possible to minimise the security reduction.

Unnecessary or overly broad exclusions are often seen where exclusions are kept from previous security vendors but are not necessary with Sophos. 3rd party software also often recommends certain exclusions for any security products. These aren't specific to Sophos and so may not be necessary. Those vendors cannot realistically be experts on every security solution; they have a vested interest in reducing the chances of issues with their software and less pressure to avoid security impact. Their suggested exclusions should be carefully assessed and tested to identify what is, and is not, required. The final example of unnecessary exclusions is where there are automatic Sophos exclusions, such as for Microsoft SQL servers; manual exclusions shouldn't be needed and often duplicate the automatic exclusions, but more broadly, and so introduce additional unnecessary security gaps.

A common example of an overly broad exclusion is excluding an entire directory when only a certain file is required, for example excluding the entire directory that a single database file is located in. Specifically excluding the file would mean that malware attempting to execute from the directory would still be prevented.

Sometimes ineffective exclusions are added, which testing should help identify (i.e. they can be removed without impact). For example, the file corresponding to a compiler might be excluded due to performance issues when compiling, but this will only skip the check when it is first launched, not any checks while compiling. Instead, a process exclusion is likely required to avoid checking all of the temporary files it is creating and accessing as part of the compilation process.

The identified exclusions will be ones we often see that are overly broad, such as excluding the entire c:\ drive, or where an exclusion for a valid business purpose unwittingly compromises protection against common hacking tools. An example is allowing psexec, a legitimate tool produced by Microsoft and used by many organisations for internal purposes, but also used in many hacking attempts, such as for lateral movement from one compromised system to an uncompromised one. The suggestion is to find an alternative method for remotely running software that isn't so accessible to malicious actors, such as dedicated asset management tools which are controlled safely from a secure centralised console.
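
As a rough sketch of the two examples just mentioned (a whole drive excluded, or psexec allowed), a check could be written in Python along these lines. It is an assumption-laden illustration rather than the real detection logic: the function name and the RISKY_TOOLS list are hypothetical, and real exclusions can also use wildcards and variables that this sketch ignores.

```python
from pathlib import PureWindowsPath

# Hypothetical list of dual-use tools; the post itself only names psexec.
RISKY_TOOLS = {"psexec.exe", "psexec64.exe"}


def exclusion_risk(exclusion: str):
    """Return a short reason if the exclusion looks high risk, otherwise None."""
    path = PureWindowsPath(exclusion)
    # A path equal to its own anchor (for example "c:\") covers a whole drive.
    if path.anchor and path == PureWindowsPath(path.anchor):
        return "excludes an entire drive"
    # Windows file names are case-insensitive, so compare in lower case.
    if path.name.lower() in RISKY_TOOLS:
        return "allows a tool commonly used in attacks"
    return None


whole_drive = exclusion_risk("c:\\")                      # flagged
psexec = exclusion_risk("C:\\Tools\\PsExec.exe")          # flagged
single_file = exclusion_risk("D:\\Data\\accounts.mdf")    # not flagged
```

A narrowly scoped file exclusion returns None here, which matches the advice above to make exclusions as specific as possible.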

An example mockup of what the check might look like is below. There will be checks for computer policy exclusions, server policy exclusions and global exclusions. The detail for each individual policy identified as having a significant security risk will be shown, as they could each have multiple items needing attention.

"Fix automatically" button

This will carry out the resolution action relevant to the specific check, e.g. changing threat policy settings or adding missing software. There will be a confirmation modal to clarify how many changes will be made and to how many devices/policies, but otherwise it will be a simple way to correct the issue.

"Tell me how to fix" button

For the more advanced user, this will link to a help page specific to the check, detailing how to address the issue. This ensures an admin has full control over exactly what changes are made, and when. For example, they could read the software assignment steps, but might want to test the new software on certain machines first, and gradually roll out to remaining ones later in batches rather than all at once.

"Defer for 30 days" button

We expect that most issues identified will be resolved soon and so become "green". In some cases, the configuration might be a necessary long term security trade-off and so we plan to add a "defer for 30 days" action. This will make the check "grey" for 30 days, at which point it will return to red if still not addressed. The intention behind this is to encourage the process of periodic reassessment. The configuration may be needed for some time, but not forever, so a periodic review helps identify when it can be changed back. Also, the detail of the check failure can change, and it is worth ensuring it is still as expected. For example, you might have made an exclusion for one server, but it is now unintentionally applying to other servers. This could be reverted to just the necessary server to improve overall protection. Similarly, it might be that only one setting needs to be changed, but now 2 are, and a periodic review helps identify that increased risk and revert it to only the one setting needing to be altered.
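
The colour behaviour described above can be summarised in a few lines of Python. This is only a sketch of the intended behaviour under our current plans; the function name and its parameters are hypothetical.

```python
from datetime import datetime, timedelta

DEFER_PERIOD = timedelta(days=30)


def check_colour(failing: bool, deferred_at, now: datetime) -> str:
    """Green if passing, grey while a deferral is still active, otherwise red."""
    if not failing:
        return "green"
    if deferred_at is not None and now - deferred_at < DEFER_PERIOD:
        return "grey"
    return "red"


deferred = datetime(2022, 3, 1)
during = check_colour(True, deferred, datetime(2022, 3, 15))  # "grey"
after = check_colour(True, deferred, datetime(2022, 3, 31))   # "red": 30 days elapsed
```

Because the deferral simply expires, a failing check returns to red without anyone having to remember to revisit it.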

Export

We expect that some customers will want to keep a record of their complete assessment for compliance purposes. It might be for annual 3rd party security audits, or evidence of best practice compliance for cyber insurance purposes. Also, some customers may want to send a copy of their health check results to another party for review. For example, customers with a Sophos Technical Account Manager (TAM) might send a copy prior to a regular review meeting, so that the TAM can arrive with a list of recommended steps to discuss.

The export will include details on all configuration (i.e. also including settings and exclusions that were not identified as differing from Sophos recommendations), clarifying the default state, Sophos recommendation and the current configuration in effect for each policy.

We expect the export to be in .xlsx format. Most Central reports are .csv or .pdf files, but .pdf does not easily allow sorting/filtering and .csv does not allow formatting to ensure a clear layout.

Dashboard entry

The dashboard is the normal entry point when logging into Central, and is also where customers commonly expect summary information on their account to be displayed. We intend to add a section for the health check to give an overview of the current health check results.

Below is an early mock up of what it might look like:

Alerts (and associated alert emails)

Many customers rely on alerts, and particularly alert emails, to notify them of items needing attention. Email provides a way to be made aware of issues without having to log in, which some admins do infrequently. We will create an alert when each check changes from green to red. So if both computers and servers no longer had all protection software you would see 2 alerts. The alerts can be cleared and won't reappear until the relevant check has gone back to green before going red again. For example, a second computer starting to lack some security software will not generate a second alert.

The health checks are performed continuously, so there is no refresh button. If a particular check starts to pass where it didn't before, the alert will be automatically cleared.
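
The alert behaviour above is edge-triggered: an alert is raised on the change from green to red, not on every failing result. A minimal Python sketch of that behaviour for a single check is below; the class and method names are hypothetical and this is not how Central implements it.

```python
class CheckAlerter:
    """Edge-triggered alerting for one check (hypothetical model)."""

    def __init__(self):
        self.passing = True      # last observed state of the check
        self.alert_open = False  # whether an uncleared alert exists

    def observe(self, passing: bool) -> bool:
        """Record the latest result; return True if a new alert was raised."""
        raised = self.passing and not passing  # green -> red transition only
        if raised:
            self.alert_open = True
        if passing:
            self.alert_open = False  # a passing check clears its alert
        self.passing = passing
        return raised

    def clear(self):
        """Admin clears the alert; it will not reappear while the check stays red."""
        self.alert_open = False


alerter = CheckAlerter()
first = alerter.observe(False)    # True: first computer loses protection
second = alerter.observe(False)   # False: a second computer does not re-alert
alerter.clear()
still_red = alerter.observe(False)  # False: cleared alert does not reappear
alerter.observe(True)               # check passes again
again = alerter.observe(False)      # True: red again after being green
```

One alerter per check (for example one for computers and one for servers) gives the "2 alerts" case described above.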

Comments

We plan to add a "comments" field to each check so that admins can keep track of relevant information. For example "Setting X needs to be disabled for computer group Y owing to false positives. It can be re-enabled once the issue in Sophos Support case Z has been confirmed as fixed".

Audit logs

Actions performed will be captured in the audit logs. For example, if an admin uses the "Fix all automatically" action on 2 policies with settings that differ from Sophos recommendations, the audit log will record which admin triggered the action and also what policy changes it made, for example Deep learning and anti-ransomware enabled in Policy X and Live Protection enabled in Policy Y.

Future enhancements

There are numerous additional checks we intend to add over time. For example:

Tamper protection: is it enabled on all devices
Are devices fully protected: not just whether the software is set to be installed, but is it successfully installed, running correctly and up to date.
Check other endpoint features, e.g. app control. This can help boost security by controlling whether higher risk applications like older web browsers are allowed to run or not.
Check other products, such as an encryption software check, perhaps comparing deployment numbers to the license count, since (as outlined earlier) encryption may be licensed for only a subset of devices.

Robert Rinde over 2 years ago

Will these features make it possible to see, and send alerts, when devices become inactive? Inactive devices might have been decommissioned, but they might also be inactive because of changes in firewalls/networks that make them unable to communicate with Sophos Central. We need a good way to monitor inactive devices from a central point of view that covers all customers.

JS over 2 years ago in reply to Robert Rinde

I agree there can be devices that are powered on but not communicating with Central. Differentiating them from devices that are not powered on (and so not connected to Central) is the challenge.

Central shows the last connected time on the device list, so you can see devices that were recently connected, but no longer, if you'd like to investigate them. The computer and server reports have filters for offline greater than 2 weeks and over 2 months for the same reason. Generally, the devices will simply be turned off or decommissioned though.

Loosely related:

1. We have the ability to sync devices from AD, which can help to find online devices which AD knows about but aren't managed by Central, i.e. ones where the Sophos agent has not been installed.

2. The "heartbeat" functionality available with the XG firewall allows you to prevent devices with "bad health" from connecting to the Internet, and I believe report on those. This means a broken install (a potential reason for not speaking to Central) would come to your attention. If the install is fine, though, and it is simply other network config that blocks Central comms, like in your example, then it wouldn't help.

To differentiate devices that are "live" but Central doesn't know about them, you need another source of information on what devices are live. You can create your own e.g. by retrieving DHCP activity to compare to a list of devices currently connected to Central. This will usually be quite "noisy" data though, e.g. you will get printers and all sorts of other devices that you don't expect to see in Central anyway.

This is a valid management challenge, but not something we're able to address in this particular feature, sorry.