The getting started guide for the EDR Data Lake APIs is available on the apigee.io site we use.

Overview

This guide takes you through a few simple steps to start using the new EDR Data Lake APIs in Sophos Central.

All our APIs are offered as RESTful HTTP endpoints over the public internet. We use standard authentication, JSON requests and responses and standard HTTP verbs. All communication is over HTTPS.

This document tells you how to run an XDR query, monitor its status, and retrieve results when the query run has completed.

NOTE: All the following documentation is being released as part of an early preview and is subject to change.

Prerequisites

You must have a set of API credentials (service principal) to be able to call the EDR Data Lake APIs. Please refer to one of these quick start guides:

  • Sophos Partners: Please read the Partner Getting Started guide first.

  • Enterprise customers: Go to the Organization Getting Started guide if you use Enterprise Dashboard to manage multiple tenants.

  • Other customers: Read the Tenant Getting Started guide.

You can make the API calls in the next few sections using cURL. Follow the instructions on cURL's website to install this tool.

Each cURL command in the rest of this guide assumes you will replace the following variables:

  • <tenant-id>: The ID of the tenant you want to query
  • <jwt>: The JWT access token returned when the IDP authenticates the service principal
  • <data-region>: The regional API host in the data geography where the tenant data is located.

What these mean will become obvious once you have completed one of the Getting Started guides above.
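If you script these calls instead of editing each command by hand, the placeholder substitution can be automated. The following is a minimal sketch, not part of the Sophos tooling: the function name fill_placeholders and the host https://example.invalid are hypothetical, chosen only for illustration.

```python
import re


def fill_placeholders(template: str, values: dict) -> str:
    """Replace every <name> placeholder in template with values[name].

    Raises KeyError if the template contains a placeholder that has no
    value, so a forgotten variable fails loudly instead of being sent.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for <{name}>")
        return values[name]

    # Placeholders in this guide are lowercase words joined by hyphens.
    return re.sub(r"<([a-z-]+)>", substitute, template)


if __name__ == "__main__":
    url = fill_placeholders(
        "<data-region>/xdr-query/v1/queries/runs/<run-id>",
        {"data-region": "https://example.invalid", "run-id": "my-run-id"},
    )
    print(url)
```

Failing on a missing value is a deliberate choice here: a request sent with a literal <tenant-id> in it would only produce a confusing error from the API.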

Step 1 - Run a query

First create a file named query.sql containing the query text:

-- running_processes_windows_sophos
SELECT
   -- Device ID DETAILS
   meta_hostname, meta_ip_address,
   -- Query Details
   query_name, cmdline, file_size, gid, global_rep,
   global_rep_data, local_rep, local_rep_data, ml_score, ml_score_data,
   name, parent, parent_name, parent_path, parent_sophos_pid,
   path, pid, pua_score, sha1, sha256,
   sophos_pid, time, uid, username,
   -- Decoration
   meta_boot_time, meta_eid, meta_endpoint_type,
   meta_ip_mask, meta_mac_address, meta_os_name, meta_os_platform, meta_os_type,
   meta_os_version, meta_public_ip, meta_query_pack_version, meta_username,
   -- Generic
   calendar_time, counter, epoch, host_identifier, numerics,
   osquery_action, unix_time,
   -- Data Lake
   customer_id, endpoint_id, upload_size
FROM xdr_data
WHERE query_name = 'running_processes_windows_sophos'

Once you are familiar with the schema of the data, you will be able to write your own queries.

Then submit the query with the following command:

curl -XPOST -H 'Content-Type: application/json' \
            -H 'Authorization: Bearer <jwt>' \
            -H 'X-Tenant-ID: <tenant-id>' \
            --data-binary @query.sql \
                   <data-region>/xdr-query/v1/queries/runs

This will return a response such as:

{
    "id": "bb0cbb9b-a189-495b-9cc9-7027df7110d6",
    "createdAt": "2020-11-02T12:51:09.933Z",
    "expiresAt": "2020-11-02T14:51:09.933Z",
    "result": "notAvailable",
    "status": "pending"
}
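The same request can be built from a script. The sketch below uses only the Python standard library and mirrors the cURL command in this step. It assumes the query text is sent as the request body, as that command is meant to do. The function names and the example host are hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_run_request(data_region: str, tenant_id: str, jwt: str,
                      query_text: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a query run."""
    return urllib.request.Request(
        url=data_region.rstrip("/") + "/xdr-query/v1/queries/runs",
        # Assumption: the query text is the request body, as in the cURL command.
        data=query_text.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt}",
            "X-Tenant-ID": tenant_id,
        },
        method="POST",
    )


def run_id_from_response(body: bytes) -> str:
    """Extract the run ID needed in Step 2 from the JSON response body."""
    return json.loads(body)["id"]


if __name__ == "__main__":
    request = build_run_request(
        "https://example.invalid", "my-tenant-id", "my-jwt",
        "SELECT meta_hostname FROM xdr_data",
    )
    print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and give the bytes read from the response to run_id_from_response.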

Step 2 - Monitor query run status

Run the following command, replacing <run-id> with the id from the response in the previous step.

curl -XGET -H 'Content-Type: application/json' \
           -H 'Authorization: Bearer <jwt>' \
           -H 'X-Tenant-ID: <tenant-id>' \
                  <data-region>/xdr-query/v1/queries/runs/<run-id>

This command returns a response in the same format as in Step 1. Call it repeatedly until the status changes to finished:

{
    "id": "bb0cbb9b-a189-495b-9cc9-7027df7110d6",
    "createdAt": "2020-11-02T12:51:09.933Z",
    "expiresAt": "2020-11-02T14:51:09.933Z",
    "finishedAt": "2020-11-02T12:51:12.512Z",
    "result": "succeeded",
    "status": "finished"
}
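The fields in this response can be inspected from a script. The following sketch checks whether a run finished successfully and works out how long it took. It assumes that timestamps always use the format shown in the samples above (UTC with fractional seconds and a trailing Z). The function names are hypothetical.

```python
from datetime import datetime
from typing import Optional

# Assumption: every timestamp looks like "2020-11-02T12:51:09.933Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def run_succeeded(run: dict) -> bool:
    """True only when the run has finished and its result is 'succeeded'."""
    return run.get("status") == "finished" and run.get("result") == "succeeded"


def run_duration_seconds(run: dict) -> Optional[float]:
    """Seconds between createdAt and finishedAt, or None if still running."""
    finished_at = run.get("finishedAt")
    if finished_at is None:
        return None
    created = datetime.strptime(run["createdAt"], TIMESTAMP_FORMAT)
    finished = datetime.strptime(finished_at, TIMESTAMP_FORMAT)
    return (finished - created).total_seconds()


if __name__ == "__main__":
    run = {
        "createdAt": "2020-11-02T12:51:09.933Z",
        "finishedAt": "2020-11-02T12:51:12.512Z",
        "result": "succeeded",
        "status": "finished",
    }
    print(run_succeeded(run), run_duration_seconds(run))
```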

Leave a few seconds between API calls to avoid being rate limited. Rate-limited calls fail with the HTTP response status code 429 (Too Many Requests).
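The polling advice above can be sketched as a small loop. This is an illustration under stated assumptions, not a definitive client: fetch_status stands for any function of yours that performs the Step 2 call and returns the parsed JSON, and RateLimited is a hypothetical exception that function would raise on a 429 response. The three-second default delay and the doubling on 429 are arbitrary choices, not values specified by the API.

```python
import time
from typing import Any, Callable, Dict


class RateLimited(Exception):
    """Hypothetical: raised by fetch_status when the API answers with 429."""


def wait_until_finished(fetch_status: Callable[[], Dict[str, Any]],
                        delay_seconds: float = 3.0,
                        max_attempts: int = 20,
                        sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll fetch_status until the run reports status 'finished'.

    Waits delay_seconds between calls and doubles the wait each time
    the call is rate limited. Raises TimeoutError after max_attempts.
    """
    delay = delay_seconds
    for _ in range(max_attempts):
        try:
            run = fetch_status()
        except RateLimited:
            delay *= 2  # back off before the next attempt
        else:
            if run.get("status") == "finished":
                return run
        sleep(delay)
    raise TimeoutError(f"query run did not finish after {max_attempts} status checks")
```

Passing sleep as a parameter keeps the loop easy to test without real waiting. In normal use the default time.sleep applies.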

Step 3 - Fetch the results

Use this command to fetch the results, again replacing <run-id> with the id of the query run:

curl -XGET -H 'Content-Type: application/json' \
           -H 'Authorization: Bearer <jwt>' \
           -H 'X-Tenant-ID: <tenant-id>' \
                  '<data-region>/xdr-query/v1/queries/runs/<run-id>/results?maxSize=1000'

A successful response looks like this:

{
    "items": [
        {
            "parent": 668,
            "sha256": null,
            "upload_size": 948,
            "osquery_action": "added",
            "script_path": null,
            "profile_path": null,
            "package_id": null,
            "path": "C:\\Program Files\\Sophos\\Sophos Network Threat Protection\\SophosNtpService.exe",
            "account_expires": null,
            "timestamps": null,
            // ... (other fields in the first row of results)
        },
        // ... other rows in the result, one object per row
    ],
    "metadata": {
        "columns": [
            {
                "name": "endpoint_id",
                "type": "string"
            },
            {
                "name": "ingestion_timestamp",
                "type": "timestamp"
            },
            {
                "name": "schema_version",
                "type": "string"
            },

            // ... metadata for other columns in the results
        ]
    },
    "pages": {
        "nextKey": "AVCcdoKeugP... (snip) ... Ve1kw==",
        "size": 1000,
        "maxSize": 1000
    }
}

At this time, XDR queries return at most 1000 rows. To get more specific results, narrow the focus of the query. We will add pagination support to fetch all results soon.
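Because of this limit, it helps to check whether a response may have been cut off. The sketch below reads the column names from the metadata and flags a page whose size has reached maxSize. It assumes the real response is plain JSON (the // comments in the sample above are only illustrative) and that size is the number of rows returned. The function names are hypothetical.

```python
from typing import Any, Dict, List


def column_names(response: Dict[str, Any]) -> List[str]:
    """Names of the result columns, in the order the metadata lists them."""
    return [column["name"] for column in response["metadata"]["columns"]]


def may_be_truncated(response: Dict[str, Any]) -> bool:
    """True when the page is full, so further rows may have been left out."""
    pages = response["pages"]
    return pages["size"] >= pages["maxSize"]


if __name__ == "__main__":
    response = {
        "items": [{"endpoint_id": "example", "schema_version": "1"}],
        "metadata": {"columns": [
            {"name": "endpoint_id", "type": "string"},
            {"name": "schema_version", "type": "string"},
        ]},
        "pages": {"size": 1, "maxSize": 1000},
    }
    print(column_names(response), may_be_truncated(response))
```

If may_be_truncated returns True, narrowing the WHERE clause of the query is the workaround the guide suggests.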

Conclusion

You can now call the EDR Data Lake APIs to issue a query, monitor its status, and fetch the results. You can also perform other operations, such as canceling a long-running query and fetching the original query request; see the API reference for details. Some aspects of calling RESTful APIs, including error handling, pagination, and partial responses, are specific to Sophos APIs. You can read about them here.