Message History API

17 Oct 2022

What’s new

We have introduced a new feature, the Message History API. It uses the Sophos Central XDR Data Lake to collect Message History data, which you can then query through the API. Watch the video at the end of this post to familiarize yourself with this feature.

Applies to the following Sophos products
Sophos Email
Sophos XDR
Sophos MDR

How to use

The Message History API significantly enhances the previous Sophos Email data set in the XDR Data Lake. Previously, only data from Post Delivery Protection for Microsoft 365 was available. Now, all Message History data is available in the Sophos Central XDR Data Lake.

You can use the Live Discover console to run any of the queries. We have added new queries to the Live Discover query pack for Email.

You can use “Designer Mode” to modify a query to suit your needs. You can also create a new query.

You can export the query results in CSV format. The CSV file can then be imported into a SIEM tool of your choice.
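Many SIEM tools accept newline-delimited JSON more readily than CSV. As a minimal sketch of preparing an exported file for import, the snippet below converts CSV text into one JSON object per row. The function name `csv_to_json_lines` and the sample columns (`sender`, `recipient`, `subject`) are assumptions for illustration only; a real export will contain whatever fields your query selected.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text: str) -> list[str]:
    """Convert CSV text into a list of JSON strings, one per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # sort_keys keeps the field order stable from row to row
    return [json.dumps(row, sort_keys=True) for row in reader]


# Hypothetical sample data; real exports will have different columns.
sample = (
    "sender,recipient,subject\n"
    "alice@example.com,bob@example.com,Invoice\n"
)

for line in csv_to_json_lines(sample):
    print(line)
```

To process a file on disk, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text to the function. Check your SIEM tool's documentation for the import format it expects.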

The Message History API can be called in multiple ways, as documented here. You will need an XDR or MDR license to use the API.

When to use

You can query the Message History data in XDR Data Lake for a variety of purposes such as:
* downloading the message history data to keep it for future reference
* searching a specific message based on attributes such as sender, recipient, etc.
* searching a specific type of message such as impersonation email, spam, etc.
* searching messages with a specific attachment or URL
* gathering statistics of messages with specific email metadata
* listing the events triggered on a message
* importing the events triggered on messages into a SIEM tool
* threat hunting
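As an illustration of the statistics use case, here is a minimal sketch that counts the most frequent values of one column in an exported CSV result. The helper `top_values` and the `sender` column are hypothetical; substitute a field that your query actually returns.

```python
import csv
import io
from collections import Counter


def top_values(csv_text: str, column: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most common non-empty values of `column` with their counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader if row.get(column))
    return counts.most_common(n)


# Hypothetical sample data; real exports will have different columns.
sample = (
    "sender,subject\n"
    "a@example.com,Hello\n"
    "b@example.com,Offer\n"
    "a@example.com,Reminder\n"
)

print(top_values(sample, "sender"))
```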

To help you create queries that meet your needs, we have created a set of 18 queries in the Email query pack. We recommend that you try these queries and modify them to suit your purpose. You can also create new queries by adding or removing fields and tables. For instance, to find messages by their metadata, use the following query in the query pack:
1. Find emails by message id, sender, recipient, subject, client IP or action

To find messages by an attachment they contain, you need either the attachment name or the SHA256 of the attachment. Based on which one you have, pick the appropriate query:
2. Find Emails with a specific Attachment - SHA256
3. Find Emails with Attachments

To find messages by a URL they contain, we have provided three queries. You can find all the URLs in the messages of a specific sender. You can find all emails containing a URL or a part of a URL. You can also find all emails that match a set of other criteria, including a URL or a part of a URL. The respective queries are:
4. Find all URLs of Emails for specific Sender
5. Find Emails with specific text within a URL
6. Find URLs based on Email metadata

You can list quarantined messages: messages currently in quarantine, messages that were released from quarantine, or messages that were deleted from quarantine:
7. List emails in admin quarantine
8. List emails released from quarantine
9. List emails deleted from quarantine

You can also list the top protection violators or the top targets of attacks in the messages. The following queries are named accordingly:
10. List top attackers
11. List top targets of attacks
12. List top senders of unwanted emails (outbound)

Lastly, you can list the messages by the protection triggered on them. The names of the following queries are self-explanatory:
13. List impersonation emails
14. List sender authentication failure emails
15. List data control emails
16. Show message statistics
17. List malicious attachment emails (to be released in the future)
18. List malicious URL emails (to be released in the future)

Note that we start collecting the data pertaining to a query only once the query is released. We have released 16 of the 18 queries in the query pack. As indicated above, the remaining two queries will be released in the future, and we will update this post when they are. After a query is released, it takes 30 days for the data collected in the XDR Data Lake to match the information available in Message History.

In the Live Discover console, you can select the time period for which you want to run the query: last 24 hours, last 7 days, last 30 days, or a custom time range. A custom time range cannot be longer than 30 days.

Unlike Message History, which holds only the last 30 days of data, the XDR Data Lake stores the last 90 days. To collect data for the full 90 days, you must run three 30-day queries and combine the results outside of the XDR reports.
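The 90-day collection described above can be sketched as two steps: compute three consecutive 30-day windows to use as custom time ranges, then merge the exported results. This is a minimal sketch, not a Sophos-provided tool. The function names `split_windows` and `merge_csv_exports` and the sample columns `id` and `sender` are assumptions for illustration, and the queries themselves still have to be run in Live Discover or through the API.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def split_windows(end, total_days=90, window_days=30):
    """Split the total_days before `end` into consecutive (start, stop) windows."""
    windows = []
    start = end - timedelta(days=total_days)
    while start < end:
        stop = min(start + timedelta(days=window_days), end)
        windows.append((start, stop))
        start = stop
    return windows


def merge_csv_exports(csv_texts):
    """Combine rows from several CSV exports, dropping exact duplicate rows."""
    seen = set()
    merged = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            key = tuple(sorted(row.items()))
            if key not in seen:  # a row on a window boundary may appear twice
                seen.add(key)
                merged.append(row)
    return merged


end = datetime(2022, 10, 17, tzinfo=timezone.utc)
for start, stop in split_windows(end):
    print(start.date(), "to", stop.date())

# Hypothetical exports, one per window; real exports will have different columns.
export_a = "id,sender\n1,a@example.com\n2,b@example.com\n"
export_b = "id,sender\n2,b@example.com\n3,c@example.com\n"
print(len(merge_csv_exports([export_a, export_b])))
```

Dropping exact duplicates guards against a message being returned by two adjacent windows when it falls on a boundary.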

Watch the demo video

References

Message History data schema (tables and fields)
Live Discover documentation
Getting started with XDR Data Lake Query
XDR API documentation