Unlike standard workflows that instantiate a previously created workflow template resource, inline workflows use a YAML file or an embedded WorkflowTemplate definition to run a workflow.
Create and run an inline workflow
gcloud
REST
Before using any of the request data, make the following replacements:
- project-id: Google Cloud project ID
- region: cluster region, such as "us-central1"
- zone: a zone within the cluster's region, such as "us-central1-b"; leave empty ("") to use Dataproc Auto Zone placement
- cluster-name: cluster name
HTTP method and URL:
POST https://dataproc.googleapis.com/v1/projects/project-id/regions/region/workflowTemplates:instantiateInline
Request JSON body:
{
  "jobs": [
    {
      "hadoopJob": {
        "mainJarFileUri": "file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar",
        "args": [
          "teragen",
          "1000",
          "hdfs:///gen/"
        ]
      },
      "stepId": "teragen"
    },
    {
      "hadoopJob": {
        "mainJarFileUri": "file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar",
        "args": [
          "terasort",
          "hdfs:///gen/",
          "hdfs:///sort/"
        ]
      },
      "stepId": "terasort",
      "prerequisiteStepIds": [
        "teragen"
      ]
    }
  ],
  "placement": {
    "managedCluster": {
      "clusterName": "cluster-name",
      "config": {
        "gceClusterConfig": {
          "zoneUri": "zone"
        }
      }
    }
  }
}
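As a rough illustration, the request above could be assembled in Python along the following lines. This is a minimal sketch using only the standard library, not an official client: the function name build_inline_request and its parameters are hypothetical, and it assumes you already hold an OAuth 2.0 access token (for example, one printed by `gcloud auth print-access-token`). The function only builds the request; it does not send it.

```python
import json
import urllib.request

API_ROOT = "https://dataproc.googleapis.com/v1"
EXAMPLES_JAR = "file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar"


def build_inline_request(project_id, region, cluster_name, zone, access_token):
    """Build (but do not send) the instantiateInline POST request.

    Hypothetical helper for illustration; all parameters are assumptions
    that map to the placeholders in the request shown above.
    """
    url = (
        f"{API_ROOT}/projects/{project_id}/regions/{region}"
        "/workflowTemplates:instantiateInline"
    )
    body = {
        "jobs": [
            {
                # First step: generate data with teragen.
                "hadoopJob": {
                    "mainJarFileUri": EXAMPLES_JAR,
                    "args": ["teragen", "1000", "hdfs:///gen/"],
                },
                "stepId": "teragen",
            },
            {
                # Second step: sort the data; runs only after teragen.
                "hadoopJob": {
                    "mainJarFileUri": EXAMPLES_JAR,
                    "args": ["terasort", "hdfs:///gen/", "hdfs:///sort/"],
                },
                "stepId": "terasort",
                "prerequisiteStepIds": ["teragen"],
            },
        ],
        "placement": {
            "managedCluster": {
                "clusterName": cluster_name,
                # An empty zone string requests Auto Zone placement.
                "config": {"gceClusterConfig": {"zoneUri": zone}},
            }
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() would send the request; keeping construction separate from sending makes it easy to inspect the URL and body first.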
You should receive a JSON response similar to the following:
{
  "name": "projects/project-id/regions/region/operations/2fbd0dad-...",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.dataproc.v1.WorkflowMetadata",
    "graph": {
      "nodes": [
        {
          "stepId": "teragen",
          "state": "RUNNABLE"
        },
        {
          "stepId": "terasort",
          "prerequisiteStepIds": [
            "teragen"
          ],
          "state": "BLOCKED"
        }
      ]
    },
    "state": "PENDING",
    "startTime": "2020-04-02T22:50:44.826Z"
  }
}
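To check on a workflow programmatically, the response can be reduced to the overall workflow state and the state of each step. The following is a minimal sketch; summarize_steps is a hypothetical helper, and the sample is a trimmed copy of the response shown above.

```python
import json


def summarize_steps(operation):
    """Return (workflow_state, {step_id: step_state}) from an operation dict."""
    metadata = operation.get("metadata", {})
    nodes = metadata.get("graph", {}).get("nodes", [])
    return metadata.get("state"), {n["stepId"]: n.get("state") for n in nodes}


# Trimmed copy of the sample response, for illustration only.
sample = json.loads("""
{
  "metadata": {
    "graph": {
      "nodes": [
        {"stepId": "teragen", "state": "RUNNABLE"},
        {"stepId": "terasort", "prerequisiteStepIds": ["teragen"], "state": "BLOCKED"}
      ]
    },
    "state": "PENDING"
  }
}
""")

state, steps = summarize_steps(sample)
print(state, steps)  # PENDING {'teragen': 'RUNNABLE', 'terasort': 'BLOCKED'}
```

In this sample, terasort is BLOCKED because its prerequisite step, teragen, has not yet completed.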
Console
Creating inline workflows is not currently supported in the Google Cloud console. You can view workflow templates and instantiated workflows on the Dataproc Workflows page.