Implement joins without a match section
This document explains the concept of join operations without a match section
or data join within Google Security Operations Search.
You can use join operations to correlate and combine raw data from multiple sources based on common field values. By combining related security events and entities into a single, comprehensive view, you can provide a more effective threat detection and investigation.
Unlike statistical joins that require a match section to aggregate results,
a data join retrieves the complete event or entity data and displays them
without any aggregation.
How data joins work
You can create a data join by correlating common fields across different event or entity blocks. You can do this by one of these methods:
Directly equate the fields (for example,
$e1.principal.hostname = $e2.principal.hostname)Assign both fields to the same placeholder variable (for example,
$host = $e1.principal.hostnameand$host = $e2.principal.hostname).
In both cases, Search implicitly joins the blocks where the values of those fields are identical.
Supported data join types
You can use the following data join types in your Search queries:
Event-to-Event joins: Correlate data between two different Unified Data Model (UDM) event types.
Event-to-ECG joins: Enrich UDM event data with information from the Entity Context Graph (ECG).
Event-to-Event joins
Event-to-Event joins are best for correlating fields between two distinct UDM event types. This is useful for finding sequences of events or actions that involve the same entities across different log sources or event types.
The following query example finds all network connections (NETWORK_CONNECTION)
originating from a host where a user login (USER_LOGIN) also occurred:
// Find user logins and assign the hostname to the $host placeholder
$e1.metadata.event_type = "USER_LOGIN"
$host = $e1.principal.hostname
// Find network connections and join them where the hostname matches the
$host placeholder
$e2.metadata.event_type = "NETWORK_CONNECTION"
$host = $e2.principal.hostname
Limitations
A maximum of two events can be joined.
The query time range is limited to a maximum of 14 days.
The query limit is 120 queries per hour (QPH).
Examples
The following query example finds all network connections
(NETWORK_CONNECTION) originating from a host where a user login
(USER_LOGIN) also occurred:
// Find user logins and assign the hostname to the $host placeholder
$e1.metadata.event_type = "USER_LOGIN"
$host = $e1.principal.hostname
// Find network connections and join them where the hostname matches the $host
placeholder
$e2.metadata.event_type = "NETWORK_CONNECTION"
$host = $e2.principal.hostname
Join on user ID
$e1.metadata.event_type = "USER_LOGIN"
$e1.security_result.action = "ALLOW"
$e1.principal.user.userid = $user
$e2.metadata.event_type = "NETWORK_CONNECTION"
$e2.principal.user.userid = $user
Join on IP address
$e1.metadata.event_type = "USER_LOGIN"
$e1.security_result.action = "ALLOW"
$e1.principal.ip = $ip
$e2.metadata.event_type = "NETWORK_CONNECTION"
$e2.principal.ip = $ip
Event-to-Entity Context Graph joins
Event-to-ECG joins are best for enriching UDM events with contextual data about the involved entities (such as assets, users) from the ECG. This join provides a more complete picture by combining real-time event data with historical and relational entity information.
Limitations
The query time range is limited to a maximum of 14 days.
The query limit is 120 QPH.
A maximum of two UDM events can be joined in the query.
A maximum of one ECG event can be joined in the query.
Export to a datatable isn't supported for Event-to-ECG join queries.
ECG-to-ECG joins are not supported.
ECG-to-datatable joins are not supported.
Examples
This query enriches network connection events with asset information from the ECG by joining on the hostname.
// Find network connections and assign the hostname to the $host placeholder
$e1.metadata.event_type = "NETWORK_CONNECTION"
$host = $e1.principal.asset.hostname
// Find asset entities in the graph and join where the hostname matches the
$host placeholder
$g1.graph.metadata.entity_type = "ASSET"
$host = $g1.graph.entity.asset.hostname
Join on IP address with a specific log type
$ip = $e1.principal.ip
$ip = $g1.graph.entity.ip
$e1.metadata.log_type = "WINDOWS_DEFENDER_ATP"
$g1.graph.entity.ip = "10.19.6.24"
Join on hostname with a specific IP filter
$e1.metadata.event_type = "FILE_CREATION"
$host = $e1.principal.hostname
$e1.principal.ip = "10.0.0.76"
$g1.graph.metadata.entity_type = "ASSET"
$host = $g1.graph.entity.hostname
Best practices
To avoid slow performance and query
timeouts, use specific and narrow filters within each block ($e1,
$e2, $g1) in your join queries.
For example, a broad query like the following:
$e1.metadata.event_type = "USER_LOGIN"
$e2.metadata.event_type = "NETWORK_CONNECTION"
right join $e1.principal.hostname = $e2.principal.hostname
Can be optimized by adding specific criteria follows:
$e1.metadata.event_type = "USER_LOGIN"
$e1.principal.ip = "192.168.1.101"
$e1.principal.user.userid = "alex"
$e2.metadata.event_type = "NETWORK_CONNECTION"
$e2.src.hostname = "altostrat.com"
right join $e1.principal.hostname = $e2.principal.hostname
Work with results
The results of a data join display in a Joins table, including the combined fields from both correlated events. This table differs from a statistics view, where it provides the complete event or entity data, and not an aggregated count.
After running a query, you can work with the results in the following ways:
Download as CSV: Export the complete result set to a CSV file for offline analysis.
Export to datatables: Save the results to a datatable within your instance for reference or further correlation (only for Event-to-Event joins).
Need more help? Get answers from Community members and Google SecOps professionals.