Help us enhance your Sophos Community experience. Share your thoughts in our Sophos Community survey.

BRIEF Video on EMAIL and the Data Lake.

In this video we show the EMAIL Attachment and URL table that is available in the data lake, we also pivot from a URL seen an an email to ask if any endpoint have ever communicated to that URL and if so what process was it. From that we start the investigation and generate a process tree to see what that URL made happen on the device

 

Getting Started

For email to work you need to have the Sophos EMAL Product deployed and configured with Office 365 with auto search and destroy enabled. Some setup and configuration instructions are below.... (Hey I managed to get it going, so how hard can it be)

The pre-requisites for using the email service is that you have your own domain, that you can re-direct the MX record for and then push the email through us. You will need to enable the search and destroy feature highlighted below and be using Office365.

 SETUP

https://docs.sophos.com/central/Customer/help/en-us/central/Customer/learningContents/Configure365.html (Office 365)

For the last step you will need to join the EAP from early January and you will need to be using Office 365 as your mailbox store. Once you have signed up to the beta we will start collecting that information and populating the table.

A trial license is sufficient, just follow the instructions for configuring Sophos EMAIL for an office 365 account.  

CONFIGURATION

You also need the 'Search and Destroy' feature enabled

CONFIRMING SETUP IS DONE

Once setup you should start seeing emails in the email report section of central.

Running some Queries

It may take a few minutes before email attachment and url data starts getting into the lake

Here are some queries I wrote, others will show up under the OTHER category (We will be adding a new category for EMAIL soon)

-- (DATA LAKE QUERY) EMAIL SENDER DETAILS

-- Generic EMAIL URL LINK SEARCH

-- VARIABLE $$Sender$$          STRING

With Email_List AS (
SELECT 
   timestamp Received_date,
   "from" FROM_,
   envelope_recipient To,
   subject,
   domain,
   url Url_Link,
   client_ip,
   scheme,
   mime_date Send_date,
   reply_to,
   array_join(to,',') TO_ARRAY,
   array_join(cc,',') CC_ARRAY
FROM xdr_xge_url_data
WHERE "from" LIKE '%$$Sender$$%'
ORDER BY timestamp DESC
)
SELECT 
   From_,
   COUNT(From_) Instances,
   array_join(array_agg(DISTINCT To), CHR(10)) To_LIST,
   array_join(array_agg(DISTINCT subject), CHR(10)) subject_LIST,
   array_join(array_agg(DISTINCT domain), CHR(10)) domain_LIST,
   array_join(array_agg(DISTINCT url_Link), CHR(10)) url_LIST,
   array_join(array_agg(DISTINCT client_ip), CHR(10)) Client_IP_LIST,
   MIN(Received_date) First_Email,
   MAX(Received_date) Last_Email
FROM Email_List
GROUP BY From_
ORDER BY Instances DESC

-- (DATA LAKE QUERY) EMAIL GENERIC URL SEARCH

-- Generic EMAIL URL LINK SEARCH

-- VARIABLE $$From contains$$          STRING
-- VARIABLE $$To contains$$            STRING
-- VARIABLE $$URL contains$$           URL
-- VARIABLE $$client IP$$              IP_Address
-- VARIABLE $$domain contains$$        STRING
-- VARIABLE $$subject contains$$       STRING
SELECT 
   timestamp Received_date,
   "from" FROM_,
   envelope_recipient To,
   subject,
   domain,
   url Url_Link,
   client_ip,
   scheme,
   mime_date Send_date,
   reply_to,
   array_join(to,',') TO_ARRAY,
   array_join(cc,',') CC_ARRAY
FROM xdr_xge_url_data
WHERE (url LIKE '%$$URL contains$$%' OR resolved_url LIKE '%$$URL contains$$%')
 AND domain LIKE '%$$domain contains$$%'
 AND subject LIKE '%$$subject contains$$%'
 AND client_IP LIKE '%$$client IP$$%'
 AND "from" LIKE '%$$From contains$$%'
 AND envelope_recipient LIKE '%$$To contains$$%'
ORDER BY timestamp DESC

-- (DATA LAKE QUERY) EMAIL Generic attachment search

-- EMAIL Generic attachment search

-- VARIABLE  $$From$$                           STRING
-- VARIABLE  $$To$$                             STRING
-- VARIABLE  $$Subject contains$$               STRING
-- VARIABLE  $$client_IP$$                      IP_Address
-- VARIABLE  $$Name of attachment file$$        STRING
-- VARIABLE  $$SHA_256 of  the attachments$$    SHA-256
SELECT 
   timestamp,
   "from" FROM_,
   envelope_recipient To,
   subject,
   name File_Name,
   size size,
   checksum File_SHA256
FROM symlink_xdr_xge_att_data
WHERE
   "from" LIKE '%$$From$$%'
   AND envelope_recipient LIKE '%$$To$$%'
   AND subject LIKE '%$$Subject contains$$%'
   AND client_IP LIKE '%$$client_IP$$%'
   AND name LIKE '%$$Name of attachment file$$%'
   AND checksum LIKE '%$$SHA_256 of  the attachments$$%'
ORDER BY timestamp DESC