Ingestion methods and data types
To effectively monitor your environment and investigate incidents, Google Security Operations lets you ingest a wide variety of security data. Understanding the types of data you can bring into the platform and the methods used to ingest them is the first step in building a robust security posture.
Types of ingestion data
Google SecOps categorizes incoming data into four primary types, each serving a distinct purpose in the detection and investigation lifecycle:
- Raw logs: These are the original, untouched data streams from your security sources (such as firewalls, EDR tools, and cloud platforms). Arriving in formats like JSON, Syslog, CSV, or unstructured text, the logs serve as the "source of truth" for deep forensics and compliance. Because field names vary by vendor, raw logs are the initial input that the platform then parses and normalizes.
- UDM events: Unified Data Model (UDM) events are created when parsers convert your raw logs into a consistent, vendor-agnostic format. For example, disparate terms like src_ip and client-ip are standardized into a single principal.ip field. Downstream systems use UDM to provide capabilities like unified search and detection rules.
- Entity context data: This data provides the "who, what, and where" to turn generic events into meaningful leads. Context data tells you if an IP address belongs to a high-ranking executive or a critical production server. By enriching events with metadata from sources like Active Directory or CMDBs, analysts can prioritize threats based on actual organizational risk.
- Alerts: These are high-fidelity signals that signify activity requiring immediate attention. Alerts can be ingested directly from external security products (like CrowdStrike) or generated internally by the Google SecOps YARA-L detection engine when UDM events or entities trigger a rule. The alerts serve as the primary building blocks for incident cases.
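The normalization step described for UDM events can be pictured with a small sketch. This is not the Google SecOps parser implementation; the alias table, the helper names, and the "unmapped" bucket are hypothetical, and only the src_ip, client-ip, and principal.ip names come from the text above.

```python
# Hypothetical alias table: vendor field name -> UDM-style dotted path.
# Only src_ip, client-ip, and principal.ip are taken from the text above.
FIELD_ALIASES = {
    "src_ip": "principal.ip",
    "client-ip": "principal.ip",
}


def set_path(event, dotted_path, value):
    """Store value in a nested dict at a dotted path such as 'principal.ip'."""
    *parents, leaf = dotted_path.split(".")
    node = event
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def normalize(raw_log):
    """Return a UDM-style dict; fields without an alias go under 'unmapped'."""
    event = {}
    for name, value in raw_log.items():
        target = FIELD_ALIASES.get(name)
        if target is None:
            # 'unmapped' is an illustrative bucket, not a real UDM field.
            event.setdefault("unmapped", {})[name] = value
        else:
            set_path(event, target, value)
    return event


firewall_log = {"src_ip": "10.0.0.5", "action": "deny"}
proxy_log = {"client-ip": "10.0.0.5", "url": "http://example.test"}

print(normalize(firewall_log))
print(normalize(proxy_log))
```

Both sample logs end up with the same principal.ip path, which is what allows a single search or rule to cover sources that name the field differently.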
Understand ingestion entities
Entities provide crucial context to network events. A standard network event might show that user abc@foo.corp launched shady.exe, but it won't indicate if that user is a recently terminated employee.
The entity data model lets you ingest these relationships, capturing new context from IAM, vulnerability management, and data protection systems to provide rich threat intelligence.
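As a rough illustration of why entity context matters, the sketch below joins an event to a user record and uses the result to rank the event. The record layout, the employment_status field, and both function names are assumptions made for this example; they do not reflect the actual entity data model schema.

```python
# Hypothetical entity context records, as might arrive from an HR or IAM source.
ENTITY_CONTEXT = {
    "abc@foo.corp": {"employment_status": "terminated", "department": "Finance"},
}


def enrich(event, context):
    """Return a copy of the event with any known user context attached."""
    enriched = dict(event)
    enriched["user_context"] = context.get(event.get("user"), {})
    return enriched


def priority(enriched_event):
    """Toy ranking: activity by a terminated user is treated as high priority."""
    status = enriched_event["user_context"].get("employment_status")
    return "high" if status == "terminated" else "normal"


event = {"user": "abc@foo.corp", "process": "shady.exe"}
print(priority(enrich(event, ENTITY_CONTEXT)))  # prints "high"
```

Without the context record, the same event would be ranked "normal", which mirrors the point above: the event alone does not say that the user was recently terminated.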
Out-of-the-box entity context parsers
To make ingesting data as seamless as possible, Google SecOps includes API connectors and default parsers for many common authoritative sources. You can ingest asset or user context data from the following supported sources:
- Identity, HR and Access Management: Azure AD Organizational Context, Duo User Context, Google Cloud IAM Analysis, Google Cloud IAM Context, Google Cloud Identity Context, Microsoft AD, Okta User Context, SailPoint IAM, Workday, Workspace Privileges, and Workspace Users.
- Asset and Device Management: JAMF, ServiceNow CMDB, Tanium Asset, Workspace ChromeOS Devices, and Workspace Mobile Devices.
- Security and Vulnerability Management: Microsoft Defender for Endpoint, Nucleus Unified Vulnerability Management, Nucleus Asset Metadata, and Rapid7 Insight.
Overview of data ingestion methods
The Google SecOps ingestion service acts as a gateway for all your incoming data. Depending on where your data lives and how it is formatted, Google SecOps uses the following primary systems to retrieve it:
- Google Cloud (Direct Integration): This is the primary, most cost-effective, and highest-performing method for all standard Google Cloud logs (e.g., Audit, VPC Flow, DNS, and Firewall logs). Google SecOps retrieves this data directly from your Google Cloud organization.
- Bindplane Agent: A managed telemetry pipeline and agent used for collecting logs from on-premises environments and servers (Windows or Linux). It is flexible enough to handle logs that don't easily fit other methods (like on-premises firewalls), and it lets you preprocess, filter, or refine cloud data before it reaches Google SecOps. The Bindplane agent is managed using the Bindplane OP Management console.
- Data Feeds: Best used for cloud-based logs (like EDRs or SaaS apps) that are already aggregated into object stores (like Cloud Storage or Amazon S3), or for third parties that support push-based webhooks. Data feeds send logs directly to the ingestion service and provide out-of-the-box support for predefined API integrations (supporting log lines up to 4 MB in size).
- Ingestion API: Designed for custom, high-volume, or home-grown applications that don't fit into the standard methods. While slightly more complex to configure, it offers total control over direct ingestion.
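The guidance in the list above can be summarized as a simple decision helper. This is only a sketch of the selection logic as described here, not an official tool; the LogSource fields and the suggest_method function are hypothetical names, and real deployments may combine several methods.

```python
from dataclasses import dataclass


@dataclass
class LogSource:
    """Hypothetical description of a log source; field names are illustrative."""
    name: str
    is_google_cloud: bool = False
    on_premises: bool = False
    in_object_store: bool = False
    supports_webhook: bool = False


def suggest_method(source):
    """Map a source's characteristics to the ingestion method described above."""
    if source.is_google_cloud:
        return "Google Cloud (direct integration)"
    if source.on_premises:
        return "Bindplane agent"
    if source.in_object_store or source.supports_webhook:
        return "Data feeds"
    # Custom or home-grown sources that fit none of the standard methods.
    return "Ingestion API"


sources = [
    LogSource("VPC Flow logs", is_google_cloud=True),
    LogSource("On-premises firewall", on_premises=True),
    LogSource("EDR export in Amazon S3", in_object_store=True),
    LogSource("In-house application"),
]
for source in sources:
    print(f"{source.name}: {suggest_method(source)}")
```

The order of the checks follows the order of the list: direct integration is tried first because it is described as the primary method for standard Google Cloud logs, and the Ingestion API is the fallback.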