Enrichment

Enrichment uses the following methods to add context to a UDM indicator or event:

  • Identifies alias entities that describe an indicator, typically a UDM field.
  • Populates the UDM message with additional details from the identified aliases or entities.
  • Adds global enrichment data, such as GeoIP and VirusTotal, to UDM events.

To ensure full data coverage for rules, searches, or dashboards that depend on enriched fields, use real-time enrichment with data tables and entity graph joins.

Asset enrichment

For each asset event, the pipeline extracts the following UDM fields from the principal, src, and target entities:

UDM field Indicator type
hostname HOSTNAME
asset_id PRODUCT_SPECIFIC_ID
mac MAC
ip IP

User enrichment

For each user event, the pipeline extracts the following UDM fields from principal, src, and target:

UDM field Indicator type
email_addresses EMAIL
userid USERNAME
windows_sid WINDOWS_SID
employee_id EMPLOYEE_ID
product_object_id PRODUCT_OBJECT_ID

For each indicator, the pipeline performs the following actions:

  • Retrieves a list of user entities. For example, the entities of principal.email_addresses and principal.userid might be the same, or they might be different.
  • Chooses the aliases from the highest priority indicator type, using this priority order: WINDOWS_SID, EMAIL, USERNAME, EMPLOYEE_ID, and PRODUCT_OBJECT_ID.
  • Populates noun.user with the entity whose validity interval intersects with the event time.
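
The three steps above can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: UserEntity, pick_user, and the half-open validity interval are assumptions, and the sketch assumes that only the highest-priority indicator type that has aliases is considered.

```python
from dataclasses import dataclass

# Priority order as listed above, highest first.
PRIORITY = ["WINDOWS_SID", "EMAIL", "USERNAME", "EMPLOYEE_ID", "PRODUCT_OBJECT_ID"]


@dataclass
class UserEntity:
    name: str
    valid_from: int  # epoch seconds, inclusive (assumed convention)
    valid_to: int    # epoch seconds, exclusive (assumed convention)


def pick_user(aliases_by_type, event_time):
    """Pick the entity to place in noun.user, or None if nothing matches."""
    for indicator_type in PRIORITY:
        entities = aliases_by_type.get(indicator_type)
        if not entities:
            continue
        # Only the highest-priority type with aliases is used (assumption).
        for entity in entities:
            if entity.valid_from <= event_time < entity.valid_to:
                return entity
        return None
    return None


aliases = {
    "EMAIL": [UserEntity("alice-old", 0, 100), UserEntity("alice", 100, 200)],
    "USERNAME": [UserEntity("alice-by-username", 0, 1000)],
}
chosen = pick_user(aliases, event_time=150)
```

Here EMAIL outranks USERNAME, so the entity valid at time 150 is taken from the EMAIL aliases even though a USERNAME alias also covers that time.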

Process enrichment

Use process enrichment to map a product-specific process ID (product_specific_process_id), or PSPI, to the actual process and retrieve details about the parent process. Process enrichment relies on the EDR event batch type.

For each UDM event, the pipeline extracts the PSPI from the following fields:

  • principal
  • src
  • target
  • principal.process.parent_process
  • src.process.parent_process
  • target.process.parent_process

The pipeline uses process aliasing to identify the actual process from the PSPI and retrieves information about the parent process. It then merges this data into the corresponding noun.process field within the enriched message.
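
As a rough sketch of the extraction step, the following Python walks the six locations listed above in a dict-shaped event. The helper names are hypothetical, and the sketch assumes that a top-level noun holds its PSPI under process, while a parent_process node holds it directly.

```python
PSPI_PATHS = [
    "principal",
    "src",
    "target",
    "principal.process.parent_process",
    "src.process.parent_process",
    "target.process.parent_process",
]


def get_path(event, dotted):
    """Follow a dotted path through nested dicts; return None if it breaks."""
    node = event
    for key in dotted.split("."):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


def extract_pspis(event):
    """Map each listed location to the PSPI found there, if any."""
    result = {}
    for path in PSPI_PATHS:
        node = get_path(event, path)
        if not isinstance(node, dict):
            continue
        # Assumption: nouns nest the process; parent_process is already a process.
        holder = node if path.endswith("parent_process") else node.get("process")
        if isinstance(holder, dict) and holder.get("product_specific_process_id"):
            result[path] = holder["product_specific_process_id"]
    return result


event = {
    "principal": {
        "process": {
            "product_specific_process_id": "EDR:child-1",
            "parent_process": {"product_specific_process_id": "EDR:parent-1"},
        }
    }
}
pspis = extract_pspis(event)
```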

EDR indexed fields for process aliasing

When a process launches, the system collects metadata (for example, command lines, file hashes, and parent process details). The EDR software running on the machine assigns a vendor-specific process UUID.

The following table lists the fields that are indexed during a process launch event:

UDM field Indicator type
target.product_specific_process_id PROCESS_ID
target.process Whole process; not just the indicator

In addition to the target.process field from the normalized event, Google SecOps collects and indexes parent process information.
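
A minimal sketch of the indexing and lookup idea, assuming dict-shaped events and an in-memory dictionary as the index. The function names are hypothetical, and the choice to keep values already present on the event is an assumption made for this illustration.

```python
def index_process_launch(index, launch_event):
    """Store the whole target.process under its PSPI."""
    process = launch_event.get("target", {}).get("process", {})
    pspi = process.get("product_specific_process_id")
    if pspi:
        index[pspi] = process


def enrich_process(index, event, noun="principal"):
    """Copy indexed process details into noun.process for a later event."""
    process = event.setdefault(noun, {}).setdefault("process", {})
    known = index.get(process.get("product_specific_process_id"))
    if known:
        for key, value in known.items():
            # Assumption: values already on the event are left untouched.
            process.setdefault(key, value)
    return event


index = {}
index_process_launch(index, {
    "target": {"process": {
        "product_specific_process_id": "EDR:child-1",
        "command_line": "cmd.exe /c whoami",
    }}
})
later = enrich_process(index, {
    "principal": {"process": {"product_specific_process_id": "EDR:child-1"}}
})
```

The later event carried only the PSPI; after the lookup, its principal.process also carries the command line recorded at launch.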

Artifact enrichment

Artifact enrichment adds file hash metadata from VirusTotal and geolocation data for IP addresses. For each UDM event, the pipeline extracts and queries context data for the following artifact indicators from the principal, src, and target entities:

  • IP address: Queries data only if it's public or routable.
  • File hashes: Queries hashes in the following order:
    • file.sha256
    • file.sha1
    • file.md5
    • process.file.sha256
    • process.file.sha1
    • process.file.md5
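
The two selection rules above can be illustrated in Python. This sketch uses the standard ipaddress module and treats is_global as an approximation of "public or routable", which is an assumption. The function names are hypothetical.

```python
import ipaddress

# Hash fields in the order listed above.
HASH_ORDER = [
    "file.sha256",
    "file.sha1",
    "file.md5",
    "process.file.sha256",
    "process.file.sha1",
    "process.file.md5",
]


def is_queryable_ip(ip):
    """Return True only for addresses that parse and are globally routable."""
    try:
        return ipaddress.ip_address(ip).is_global
    except ValueError:
        return False


def ordered_hashes(noun):
    """Return (field, value) pairs for the hashes present, in query order."""
    found = []
    for path in HASH_ORDER:
        node = noun
        for key in path.split("."):
            node = node.get(key) if isinstance(node, dict) else None
        if node:
            found.append((path, node))
    return found


noun = {"file": {"md5": "m1", "sha256": "s1"}, "process": {"file": {"sha1": "p1"}}}
hashes = ordered_hashes(noun)
```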

The pipeline uses UNIX epoch and event hour to define the time range for the file artifact queries. If geolocation data is available, the pipeline overwrites the following UDM fields for the principal, src, and target entities, based on the origin of the geolocation data:

  • artifact.ip
  • artifact.location
  • artifact.network (only if the data includes IP network context)
  • location (only if the original data doesn't include this field)

If the pipeline finds file hash metadata, it adds that metadata to the file or process.file fields, depending on the origin of the indicator. The pipeline keeps any existing values that don't overlap with the new data.
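
One way to read the merge rule above is a recursive merge in which new metadata replaces overlapping values and everything else is kept. The following sketch shows that interpretation. The function name and the sample values are made up for illustration.

```python
def merge_metadata(existing, new):
    """Merge new metadata into existing; new values win where keys overlap."""
    merged = dict(existing)
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested messages field by field instead of replacing them.
            merged[key] = merge_metadata(merged[key], value)
        else:
            merged[key] = value
    return merged


existing_file = {"sha256": "abc", "size": 10, "pe_file": {"imphash": "old"}}
new_metadata = {"size": 12, "pe_file": {"imphash": "new"}, "vhash": "v1"}
merged_file = merge_metadata(existing_file, new_metadata)
```

The sha256 value, which the new metadata doesn't mention, survives the merge, and the original dictionary is left unmodified.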

IP geolocation enrichment

Geographic aliasing provides geolocation data for external IP addresses. For each unaliased IP address in the principal, target, or src field for a UDM event, an ip_geo_artifact subprotocol buffer is created with the associated location and ASN information.

Geographic aliasing doesn't use lookback or caching. Due to the high volume of events, Google SecOps maintains an index in memory.
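
To make the in-memory lookup concrete, here is a small sketch that uses Python's ipaddress module. The index contents are invented (a documentation address range and a documentation ASN), and the shape of the returned dictionary only loosely mirrors the artifact fields named earlier. Neither reflects the real index or the ip_geo_artifact schema.

```python
import ipaddress

# Invented sample data: a documentation network and a documentation ASN.
GEO_INDEX = [
    (ipaddress.ip_network("203.0.113.0/24"), {"country": "Exampleland", "asn": 64500}),
]


def lookup_geo(ip):
    """Return a dict loosely shaped like an IP geo artifact, or None."""
    addr = ipaddress.ip_address(ip)
    for network, data in GEO_INDEX:
        if addr in network:
            return {
                "ip": ip,
                "location": {"country_or_region": data["country"]},
                "network": {"asn": data["asn"]},
            }
    return None


artifact = lookup_geo("203.0.113.7")
```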

Enrich events with VirusTotal file metadata

Google SecOps enriches UDM events with file hash metadata, which provides additional context during an investigation. Hash aliasing enriches UDM events by combining all types of file hashes and providing information about a file hash during a search.

Google SecOps integrates VirusTotal file metadata and relationship enrichment to identify patterns of malicious activity and track malware movements across a network.

A raw log provides limited information about the file. VirusTotal enriches the event with file metadata, including details about malicious hashes and files. The metadata includes information such as filenames, types, imported functions, and tags. You can use this information in UDM search and in the detection engine with YARA-L, both to understand malicious file events and during threat hunting. For example, you can use the file metadata to detect modifications to the original file.

The following information is stored with the record. For a list of all UDM fields, see Unified Data Model field list.

Type of data UDM field
sha-256 ( principal | target | src | observer ).file.sha256
md5 ( principal | target | src | observer ).file.md5
sha-1 ( principal | target | src | observer ).file.sha1
size ( principal | target | src | observer ).file.size
ssdeep ( principal | target | src | observer ).file.ssdeep
vhash ( principal | target | src | observer ).file.vhash
authentihash ( principal | target | src | observer ).file.authentihash
PE file metadata Imphash ( principal | target | src | observer ).file.pe_file.imphash
security_result.threat_verdict ( principal | target | src | observer ).(process | file).security_result.threat_verdict
security_result.severity ( principal | target | src | observer ).(process | file).security_result.severity
last_modification_time ( principal | target | src | observer ).file.last_modification_time
first_seen_time ( principal | target | src | observer ).file.first_seen_time
last_seen_time ( principal | target | src | observer ).file.last_seen_time
last_analysis_time ( principal | target | src | observer ).file.last_analysis_time
exif_info.original_file ( principal | target | src | observer ).file.exif_info.original_file
exif_info.product ( principal | target | src | observer ).file.exif_info.product
exif_info.company ( principal | target | src | observer ).file.exif_info.company
exif_info.file_description ( principal | target | src | observer ).file.exif_info.file_description
signature_info.codesign.id ( principal | target | src | observer ).file.signature_info.codesign.id
signature_info.sigcheck.verified ( principal | target | src | observer ).file.signature_info.sigcheck.verified
signature_info.sigcheck.verification_message ( principal | target | src | observer ).file.signature_info.sigcheck.verification_message
signature_info.sigcheck.signers.name ( principal | target | src | observer ).file.signature_info.sigcheck.signers.name
signature_info.sigcheck.status ( principal | target | src | observer ).file.signature_info.sigcheck.signers.status
signature_info.sigcheck.valid_usage ( principal | target | src | observer ).file.signature_info.sigcheck.signers.valid_usage
signature_info.sigcheck.cert_issuer ( principal | target | src | observer ).file.signature_info.sigcheck.signers.cert_issuer
file_type ( principal | target | src | observer ).file.file_type
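
The notation ( principal | target | src | observer ) in the table means that the same field can appear under any of those four nouns. As an illustration, the following sketch expands a field suffix into the four concrete paths and reads whichever ones are present in a dict-shaped event. The helper names are hypothetical.

```python
NOUNS = ("principal", "target", "src", "observer")


def file_field_paths(suffix):
    """Expand a file field suffix into one concrete path per noun."""
    return [f"{noun}.file.{suffix}" for noun in NOUNS]


def read_path(event, dotted):
    """Follow a dotted path through nested dicts; return None if it breaks."""
    node = event
    for key in dotted.split("."):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


def enriched_values(event, suffix):
    """Collect the values of one enriched file field across all nouns."""
    values = {}
    for path in file_field_paths(suffix):
        value = read_path(event, path)
        if value is not None:
            values[path] = value
    return values


event = {
    "target": {"file": {
        "sha256": "abc123",
        "signature_info": {"sigcheck": {"verified": True}},
    }}
}
verified = enriched_values(event, "signature_info.sigcheck.verified")
```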
