- Resource: Pipeline
- PipelineType
- State
- Workload
- LaunchTemplateRequest
- LaunchTemplateParameters
- RuntimeEnvironment
- WorkerIPAddressConfiguration
- LaunchFlexTemplateRequest
- LaunchFlexTemplateParameter
- FlexTemplateRuntimeEnvironment
- FlexResourceSchedulingGoal
- ScheduleSpec
- Methods
Resource: Pipeline
The main pipeline entity and all the necessary metadata for launching and managing linked jobs.
| JSON representation | 
|---|
| { "name": string, "displayName": string, "type": enum ( | 
| Fields | |
|---|---|
| name | The pipeline name. For example: |
| displayName | Required. The display name of the pipeline. It can contain only letters ([A-Za-z]), numbers ([0-9]), hyphens (-), and underscores (_). |
| type | Required. The type of the pipeline. This field affects the scheduling of the pipeline and the type of metrics to show for the pipeline. |
| state | Required. The state of the pipeline. When the pipeline is created, the state is set to STATE_ACTIVE by default. To request a state change, set the state to stopping, paused, or resuming. The state cannot be changed through pipelines.patch requests. |
| createTime | Output only. Immutable. The timestamp when the pipeline was initially created. Set by the Data Pipelines service. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
| lastUpdateTime | Output only. Immutable. The timestamp when the pipeline was last modified. Set by the Data Pipelines service. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
| workload | Workload information for creating new jobs. |
| scheduleInfo | Internal scheduling information for a pipeline. If this information is provided, periodic jobs are created per the schedule. If not, users are responsible for creating jobs externally. |
| jobCount | Output only. Number of jobs. |
| schedulerServiceAccountEmail | Optional. A service account email to be used with the Cloud Scheduler job. If not specified, the default Compute Engine service account is used. |
| pipelineSources | Immutable. The sources of the pipeline (for example, Dataplex). The keys and values are set by the corresponding sources during pipeline creation. An object containing a list of |
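The field rules above can be sketched as a small request-body builder. This is only an illustration: `is_valid_display_name` and `build_pipeline_body` are hypothetical helper names, not part of any client library, and the state value uses the enum name listed in the State section. Rejecting an empty display name is an assumption; the field description does not say either way.

```python
import re

# Character set taken from the displayName description above.
# Assumption: an empty display name is rejected (the text does not say).
_DISPLAY_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_display_name(display_name: str) -> bool:
    """Return True if the name uses only letters, digits, hyphens, underscores."""
    return _DISPLAY_NAME_RE.fullmatch(display_name) is not None


def build_pipeline_body(display_name, pipeline_type, workload, schedule_info=None):
    """Assemble a Pipeline dict containing only caller-settable fields.

    Hypothetical helper: output-only fields (createTime, lastUpdateTime,
    jobCount) are left out because the service sets them.
    """
    if not is_valid_display_name(display_name):
        raise ValueError(f"invalid displayName: {display_name!r}")
    body = {
        "displayName": display_name,
        "type": pipeline_type,
        "state": "STATE_ACTIVE",
        "workload": workload,
    }
    # scheduleInfo is optional; without it, jobs must be created externally.
    if schedule_info is not None:
        body["scheduleInfo"] = schedule_info
    return body
```

Leaving `scheduleInfo` out mirrors the description above: no periodic jobs are created, and the caller is responsible for creating jobs.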
PipelineType
The type of a pipeline. For example, batch or streaming.
| Enums | |
|---|---|
| PIPELINE_TYPE_UNSPECIFIED | The pipeline type isn't specified. | 
| PIPELINE_TYPE_BATCH | A batch pipeline. It runs jobs on a specific schedule, and each job will automatically terminate once execution is finished. | 
| PIPELINE_TYPE_STREAMING | A streaming pipeline. The underlying job is continuously running until it is manually terminated by the user. This type of pipeline doesn't have a schedule to run on, and the linked job gets created when the pipeline is created. | 
State
The current state of pipeline execution.
| Enums | |
|---|---|
| STATE_UNSPECIFIED | The pipeline state isn't specified. | 
| STATE_RESUMING | The pipeline is getting started or resumed. When finished, the pipeline state will be STATE_ACTIVE. | 
| STATE_ACTIVE | The pipeline is actively running. | 
| STATE_STOPPING | The pipeline is in the process of stopping. When finished, the pipeline state will be STATE_ARCHIVED. | 
| STATE_ARCHIVED | The pipeline has been stopped. This is a terminal state and cannot be undone. | 
| STATE_PAUSED | The pipeline is paused. This is a non-terminal state. When the pipeline is paused, it will hold processing jobs, but can be resumed later. For a batch pipeline, this means pausing the scheduler job. For a streaming pipeline, creating a job snapshot to resume from will give the same effect. | 
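The relationships between the states in the table can be written down as a small lookup. This is a sketch that encodes only what the table states (two transitional states and one terminal state); `settled_state` and `is_terminal` are hypothetical helper names, and the sketch does not attempt to model which state changes the service accepts.

```python
from enum import Enum


class State(Enum):
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    STATE_RESUMING = "STATE_RESUMING"
    STATE_ACTIVE = "STATE_ACTIVE"
    STATE_STOPPING = "STATE_STOPPING"
    STATE_ARCHIVED = "STATE_ARCHIVED"
    STATE_PAUSED = "STATE_PAUSED"


# Transitional states and the state each one settles into, per the table.
_SETTLES_INTO = {
    State.STATE_RESUMING: State.STATE_ACTIVE,
    State.STATE_STOPPING: State.STATE_ARCHIVED,
}


def settled_state(state: State) -> State:
    """Return the state a transitional state ends in; other states map to themselves."""
    return _SETTLES_INTO.get(state, state)


def is_terminal(state: State) -> bool:
    """Only STATE_ARCHIVED is described as terminal."""
    return state is State.STATE_ARCHIVED
```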
Workload
Workload details for creating the pipeline jobs.
| JSON representation | 
|---|
| { // Union field | 
| Fields | |
|---|---|
| Union field | |
| dataflowLaunchTemplateRequest | Template information and additional parameters needed to launch a Dataflow job using the standard launch API. |
| dataflowFlexTemplateRequest | Template information and additional parameters needed to launch a Dataflow job using the flex launch API. |
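Because the two request fields form a union, a workload carries at most one of them. A minimal check, assuming the workload is held as a plain dict keyed by the JSON field names above (`workload_variant` is a hypothetical helper name):

```python
# The two union members listed in the Workload fields table.
_WORKLOAD_VARIANTS = (
    "dataflowLaunchTemplateRequest",
    "dataflowFlexTemplateRequest",
)


def workload_variant(workload: dict):
    """Return the name of the union member that is set, or None if neither is.

    Raises ValueError when both members are present, since a union field
    can hold only one of its members at a time.
    """
    present = [key for key in _WORKLOAD_VARIANTS if key in workload]
    if len(present) > 1:
        raise ValueError("only one of " + ", ".join(_WORKLOAD_VARIANTS) + " may be set")
    return present[0] if present else None
```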
LaunchTemplateRequest
A request to launch a template.
| JSON representation | 
|---|
| {
  "projectId": string,
  "validateOnly": boolean,
  "launchParameters": {
    object ( | 
| Fields | |
|---|---|
| projectId | Required. The ID of the Cloud Platform project that the job belongs to. |
| validateOnly | If true, the request is validated but not actually executed. Defaults to false. |
| launchParameters | The parameters of the template to launch. This should be part of the body of the POST request. |
| location | The regional endpoint to which to direct the request. |
| gcsPath | A Cloud Storage path to the template from which to create the job. Must be a valid Cloud Storage URL, beginning with 'gs://'. |
LaunchTemplateParameters
Parameters to provide to the template being launched.
| JSON representation | 
|---|
| {
  "jobName": string,
  "parameters": {
    string: string,
    ...
  },
  "environment": {
    object ( | 
| Fields | |
|---|---|
| jobName | Required. The job name to use for the created job. |
| parameters | The runtime parameters to pass to the job. An object containing a list of |
| environment | The runtime environment for the job. |
| update | If set, replace the existing pipeline with the name specified by jobName with this pipeline, preserving state. |
| transformNameMapping | Map of transform name prefixes of the job to be replaced to the corresponding name prefixes of the new job. Only applicable when updating a pipeline. An object containing a list of |
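A LaunchTemplateRequest nests a LaunchTemplateParameters object under `launchParameters`. The sketch below assembles both as plain dicts and enforces the one constraint the tables state explicitly, that `gcsPath` begins with `gs://`. The function name `build_launch_template_request` and all argument values in the usage are hypothetical.

```python
def build_launch_template_request(project_id, gcs_path, job_name,
                                  location=None, parameters=None,
                                  validate_only=False):
    """Assemble a LaunchTemplateRequest dict (illustrative helper only)."""
    # gcsPath must be a Cloud Storage URL beginning with 'gs://'.
    if not gcs_path.startswith("gs://"):
        raise ValueError("gcsPath must begin with 'gs://'")
    request = {
        "projectId": project_id,
        "validateOnly": validate_only,
        "launchParameters": {
            "jobName": job_name,
            # Runtime parameters are a string-to-string map.
            "parameters": dict(parameters or {}),
        },
        "gcsPath": gcs_path,
    }
    if location is not None:
        request["location"] = location
    return request
```

Setting `validate_only=True` corresponds to the validateOnly field: the request is checked but not executed.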
RuntimeEnvironment
The environment values to set at runtime.
| JSON representation | 
|---|
| {
  "numWorkers": integer,
  "maxWorkers": integer,
  "zone": string,
  "serviceAccountEmail": string,
  "tempLocation": string,
  "bypassTempDirValidation": boolean,
  "machineType": string,
  "additionalExperiments": [
    string
  ],
  "network": string,
  "subnetwork": string,
  "additionalUserLabels": {
    string: string,
    ...
  },
  "kmsKeyName": string,
  "ipConfiguration": enum ( | 
| Fields | |
|---|---|
| numWorkers | The initial number of Compute Engine instances for the job. |
| maxWorkers | The maximum number of Compute Engine instances to be made available to your pipeline during execution, from 1 to 1000. |
| zone | The Compute Engine availability zone for launching worker instances to run your pipeline. In the future, workerZone will take precedence. |
| serviceAccountEmail | The email address of the service account to run the job as. |
| tempLocation | The Cloud Storage path to use for temporary files. Must be a valid Cloud Storage URL, beginning with |
| bypassTempDirValidation | Whether to bypass the safety checks for the job's temporary directory. Use with caution. |
| machineType | The machine type to use for the job. Defaults to the value from the template if not specified. |
| additionalExperiments[] | Additional experiment flags for the job. |
| network | Network to which VMs will be assigned. If empty or unspecified, the service will use the network "default". |
| subnetwork | Subnetwork to which VMs will be assigned, if desired. You can specify a subnetwork using either a complete URL or an abbreviated path. Expected to be of the form "https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK" or "regions/REGION/subnetworks/SUBNETWORK". If the subnetwork is located in a Shared VPC network, you must use the complete URL. |
| additionalUserLabels | Additional user labels to be specified for the job. Keys and values should follow the restrictions specified in the labeling restrictions page. An object containing a list of key/value pairs. Example: { "name": "wrench", "mass": "1kg", "count": "3" }. |
| kmsKeyName | Name for the Cloud KMS key for the job. The key format is: projects/ |
| ipConfiguration | Configuration for VM IPs. |
| workerRegion | The Compute Engine region (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1". Mutually exclusive with workerZone. If neither workerRegion nor workerZone is specified, defaults to the control plane's region. |
| workerZone | The Compute Engine zone (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1-a". Mutually exclusive with workerRegion. If neither workerRegion nor workerZone is specified, a zone in the control plane's region is chosen based on available capacity. If both |
| enableStreamingEngine | Whether to enable Streaming Engine for the job. |
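Several of the constraints in this table lend themselves to a local pre-flight check. The sketch below covers three of them: the `maxWorkers` range, the mutual exclusivity of `workerRegion` and `workerZone`, and the two accepted `subnetwork` forms. `runtime_environment_problems` is a hypothetical helper, and the subnetwork patterns are a loose reading of the forms quoted above, not a definitive validator.

```python
import re

# The two subnetwork forms quoted in the field description.
_SUBNETWORK_FORMS = (
    re.compile(
        r"https://www\.googleapis\.com/compute/v1/projects/[^/]+"
        r"/regions/[^/]+/subnetworks/[^/]+"
    ),
    re.compile(r"regions/[^/]+/subnetworks/[^/]+"),
)


def runtime_environment_problems(env: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    max_workers = env.get("maxWorkers")
    if max_workers is not None and not 1 <= max_workers <= 1000:
        problems.append("maxWorkers must be between 1 and 1000")
    if env.get("workerRegion") and env.get("workerZone"):
        problems.append("workerRegion and workerZone are mutually exclusive")
    subnetwork = env.get("subnetwork")
    if subnetwork and not any(p.fullmatch(subnetwork) for p in _SUBNETWORK_FORMS):
        problems.append("subnetwork must be a complete URL or an abbreviated path")
    return problems
```

The check cannot tell whether a subnetwork lives in a Shared VPC network, so the requirement to use the complete URL in that case still has to be handled by the caller.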
WorkerIPAddressConfiguration
Specifies how IP addresses should be allocated to the worker machines.
| Enums | |
|---|---|
| WORKER_IP_UNSPECIFIED | The configuration is unknown, or unspecified. | 
| WORKER_IP_PUBLIC | Workers should have public IP addresses. | 
| WORKER_IP_PRIVATE | Workers should have private IP addresses. | 
LaunchFlexTemplateRequest
A request to launch a Dataflow job from a Flex Template.
| JSON representation | 
|---|
| {
  "projectId": string,
  "launchParameter": {
    object ( | 
| Fields | |
|---|---|
| projectId | Required. The ID of the Cloud Platform project that the job belongs to. |
| launchParameter | Required. Parameter to launch a job from a Flex Template. |
| location | Required. The regional endpoint to which to direct the request. For example, |
| validateOnly | If true, the request is validated but not actually executed. Defaults to false. |
LaunchFlexTemplateParameter
Launch Flex Template parameter.
| JSON representation | 
|---|
| {
  "jobName": string,
  "parameters": {
    string: string,
    ...
  },
  "launchOptions": {
    string: string,
    ...
  },
  "environment": {
    object ( | 
| Fields | |
|---|---|
| jobName | Required. The job name to use for the created job. For an update job request, the job name should be the same as the existing running job. |
| parameters | The parameters for the Flex Template. Example: An object containing a list of |
| launchOptions | Launch options for this Flex Template job. This is a common set of options across languages and templates. This should not be used to pass job parameters. An object containing a list of |
| environment | The runtime environment for the Flex Template job. |
| update | Set this to true if you are sending a request to update a running streaming job. When set, the job name should be the same as the running job. |
| transformNameMappings | Use this to pass transform name mappings for streaming update jobs. Example: An object containing a list of |
| containerSpecGcsPath | Cloud Storage path to a file with a JSON-serialized ContainerSpec as content. |
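A Flex Template launch nests a LaunchFlexTemplateParameter under the singular `launchParameter` key (the classic request uses the plural `launchParameters`). The following sketch builds the request as plain dicts. `build_flex_template_request` is a hypothetical helper, and rejecting `transformNameMappings` on a non-update request is an assumption drawn from the field description, not a documented service behavior.

```python
def build_flex_template_request(project_id, location, job_name,
                                container_spec_gcs_path, parameters=None,
                                update=False, transform_name_mappings=None):
    """Assemble a LaunchFlexTemplateRequest dict (illustrative helper only)."""
    launch_parameter = {
        "jobName": job_name,
        "containerSpecGcsPath": container_spec_gcs_path,
        "parameters": dict(parameters or {}),
    }
    if update:
        # For an update, job_name should match the running streaming job.
        launch_parameter["update"] = True
        if transform_name_mappings:
            launch_parameter["transformNameMappings"] = dict(transform_name_mappings)
    elif transform_name_mappings:
        # Assumption: mappings are meaningful only for streaming update jobs.
        raise ValueError("transformNameMappings applies only to update requests")
    return {
        "projectId": project_id,
        "location": location,
        "launchParameter": launch_parameter,
    }
```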
FlexTemplateRuntimeEnvironment
The environment values to be set at runtime for a Flex Template.
| JSON representation | 
|---|
| { "numWorkers": integer, "maxWorkers": integer, "zone": string, "serviceAccountEmail": string, "tempLocation": string, "machineType": string, "additionalExperiments": [ string ], "network": string, "subnetwork": string, "additionalUserLabels": { string: string, ... }, "kmsKeyName": string, "ipConfiguration": enum ( | 
| Fields | |
|---|---|
| numWorkers | The initial number of Compute Engine instances for the job. |
| maxWorkers | The maximum number of Compute Engine instances to be made available to your pipeline during execution, from 1 to 1000. |
| zone | The Compute Engine availability zone for launching worker instances to run your pipeline. In the future, workerZone will take precedence. |
| serviceAccountEmail | The email address of the service account to run the job as. |
| tempLocation | The Cloud Storage path to use for temporary files. Must be a valid Cloud Storage URL, beginning with |
| machineType | The machine type to use for the job. Defaults to the value from the template if not specified. |
| additionalExperiments[] | Additional experiment flags for the job. |
| network | Network to which VMs will be assigned. If empty or unspecified, the service will use the network "default". |
| subnetwork | Subnetwork to which VMs will be assigned, if desired. You can specify a subnetwork using either a complete URL or an abbreviated path. Expected to be of the form "https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK" or "regions/REGION/subnetworks/SUBNETWORK". If the subnetwork is located in a Shared VPC network, you must use the complete URL. |
| additionalUserLabels | Additional user labels to be specified for the job. Keys and values must follow the restrictions specified in the labeling restrictions. An object containing a list of key/value pairs. Example: |
| kmsKeyName | Name for the Cloud KMS key for the job. Key format is: projects/ |
| ipConfiguration | Configuration for VM IPs. |
| workerRegion | The Compute Engine region (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1". Mutually exclusive with workerZone. If neither workerRegion nor workerZone is specified, defaults to the control plane region. |
| workerZone | The Compute Engine zone (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1-a". Mutually exclusive with workerRegion. If neither workerRegion nor workerZone is specified, a zone in the control plane region is chosen based on available capacity. If both |
| enableStreamingEngine | Whether to enable Streaming Engine for the job. |
| flexrsGoal | Sets the FlexRS goal for the job. See https://cloud.google.com/dataflow/docs/guides/flexrs |
FlexResourceSchedulingGoal
Specifies the resource to optimize for in Flexible Resource Scheduling.
| Enums | |
|---|---|
| FLEXRS_UNSPECIFIED | Run in the default mode. | 
| FLEXRS_SPEED_OPTIMIZED | Optimize for lower execution time. | 
| FLEXRS_COST_OPTIMIZED | Optimize for lower cost. | 
ScheduleSpec
Details of the schedule the pipeline runs on.
| JSON representation | 
|---|
| { "schedule": string, "timeZone": string, "nextJobTime": string } | 
| Fields | |
|---|---|
| schedule | Unix-cron format of the schedule. This information is retrieved from the linked Cloud Scheduler. |
| timeZone | Timezone ID. This matches the timezone IDs used by the Cloud Scheduler API. If empty, UTC time is assumed. |
| nextJobTime | Output only. When the next Scheduler job is going to run. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
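Two of the ScheduleSpec rules are easy to mirror on the client side: the UTC fallback for an empty `timeZone`, and the five-field shape of a unix-cron string (minute, hour, day of month, month, day of week). Both helpers below are hypothetical, and the cron check is deliberately shallow: it counts fields and does not validate their values.

```python
def effective_time_zone(schedule_spec: dict) -> str:
    """Return the time zone the schedule is interpreted in.

    An empty or missing timeZone falls back to UTC, as described above.
    """
    return schedule_spec.get("timeZone") or "UTC"


def has_five_cron_fields(schedule: str) -> bool:
    """Shallow shape check: a unix-cron expression has five whitespace-separated fields."""
    return len(schedule.split()) == 5
```

For example, a spec of `{"schedule": "0 2 * * *"}` passes the shape check and is interpreted in UTC because no time zone is given.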
| Methods | |
|---|---|
| | Creates a pipeline. |
| | Deletes a pipeline. |
| | Looks up a single pipeline. |
| | Lists pipelines. |
| | Updates a pipeline. |
| | Creates a job for the specified pipeline directly. |
| | Freezes pipeline execution permanently. |