Data Lake queries

Data Lake queries let you search security and compliance data that your devices upload to the cloud.

You can run Data Lake queries with Live Discover, a feature in our Threat Analysis Center.

Live Discover lets you choose which data source to use when you set up and run a query.

For help with Live Discover, see Live Discover.

How the Data Lake works

We host the Data Lake and provide scheduled “hydration queries” that define which data your endpoints upload to it.

By default, data uploads from Endpoint Protection and Server Protection are turned on. To check this setting, do as follows:

Go to My Products > Endpoint Protection (or Server Protection for servers).
Click Policies.
Click the Data Collection and Investigation policy.
Check that Upload to the Data Lake is turned on.

For information about uploading data from other products, see Data Lake uploads.

We store the data for 90 days.

We provide pre-prepared Data Lake queries you can run. You can use them as they are or edit them. You can also create your own queries.

Data Lake queries have some advantages over endpoint queries:

They always give results for all endpoints, whether they're connected or not.
They can query data that's up to 90 days old. You can configure the time period so that they only generate as much data as you need.
They can be scheduled.
They can give you access to data uploaded by other Sophos products you're using, for example Sophos Firewall or Sophos Email. These are shown as "sensors" in Live Discover.
They can also give access to data uploaded by third-party products that you integrate with Sophos Central.

For information about setting up uploads from other Sophos products or from third-party products, see Products.
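
To illustrate how the query time period relates to the 90-day storage limit, here is a minimal Python sketch. It isn't part of Live Discover or any Sophos API: the names `RETENTION_DAYS` and `query_window` are hypothetical and exist only for this example. It assumes you want to work out a start and end time for a query and never ask for data older than the retention period.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in this article. The constant name is hypothetical.
RETENTION_DAYS = 90


def query_window(lookback_days, now=None):
    """Return (start, end) UTC datetimes for a query.

    The lookback is capped at RETENTION_DAYS, because older data
    is no longer stored. This is an illustrative helper, not a product API.
    """
    if lookback_days <= 0:
        raise ValueError("lookback_days must be positive")
    end = now if now is not None else datetime.now(timezone.utc)
    days = min(lookback_days, RETENTION_DAYS)
    return end - timedelta(days=days), end


if __name__ == "__main__":
    start, end = query_window(7)
    print("Query from", start.isoformat(), "to", end.isoformat())
```

Choosing a short period, such as 7 days, keeps the result set small. Asking for more than 90 days simply returns the full stored range in this sketch.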