"Managed Service for Apache Spark" is the new name for the product formerly known as "Dataproc on Compute Engine" (cluster deployment) and "Google Cloud Serverless for Apache Spark" (serverless deployment).

MCP Tools Reference: dataproc.googleapis.com

Tool: `get_job`

Get a Dataproc job in a Google Cloud project

The following sample demonstrates how to use curl to invoke the `get_job` MCP tool.

Curl Request

curl --location 'https://dataproc.googleapis.com/mcp' \
--header 'content-type: application/json' \
--header 'accept: application/json, text/event-stream' \
--data '{
  "method": "tools/call",
  "params": {
    "name": "get_job",
    "arguments": {
      // provide these details according to the tool's MCP specification
    }
  },
  "jsonrpc": "2.0",
  "id": 1
}'

Input Schema

A request to get the resource representation for a job in a project.

GetJobRequest

JSON representation
{ "projectId": string, "region": string, "jobId": string }

Fields
`projectId`	`string` Required. The ID of the Google Cloud Platform project that the job belongs to.
`region`	`string` Required. The Dataproc region in which to handle the request.
`jobId`	`string` Required. The job ID.
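
The request body from the curl sample can also be assembled programmatically. The following is a minimal sketch in Python, assuming the three required arguments above are passed as the tool's `arguments` object; the helper name `build_get_job_request` and the sample project, region, and job values are hypothetical, and authentication is left out because the curl sample does not show it.

```python
import json

# Endpoint taken from the curl sample above.
MCP_ENDPOINT = "https://dataproc.googleapis.com/mcp"


def build_get_job_request(project_id: str, region: str, job_id: str, request_id: int = 1) -> dict:
    """Build the JSON-RPC 2.0 body for a tools/call invocation of get_job."""
    # All three GetJobRequest fields are marked Required in the input schema.
    for name, value in (("projectId", project_id), ("region", region), ("jobId", job_id)):
        if not value:
            raise ValueError(f"{name} is required")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_job",
            "arguments": {"projectId": project_id, "region": region, "jobId": job_id},
        },
    }


# Hypothetical values, for illustration only.
body = build_get_job_request("my-project", "us-central1", "job-1234")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with `--data`, or sent to `MCP_ENDPOINT` with any HTTP client that sets the same headers as the sample.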

Output Schema

A Dataproc job resource.

Job

JSON representation

{
  "reference": {
    object (JobReference)
  },
  "placement": {
    object (JobPlacement)
  },
  "status": {
    object (JobStatus)
  },
  "statusHistory": [
    {
      object (JobStatus)
    }
  ],
  "yarnApplications": [
    {
      object (YarnApplication)
    }
  ],
  "driverOutputResourceUri": string,
  "driverControlFilesUri": string,
  "labels": {
    string: string,
    ...
  },
  "scheduling": {
    object (JobScheduling)
  },
  "jobUuid": string,
  "done": boolean,
  "driverSchedulingConfig": {
    object (DriverSchedulingConfig)
  },

  // Union field type_job can be only one of the following:
  "hadoopJob": {
    object (HadoopJob)
  },
  "sparkJob": {
    object (SparkJob)
  },
  "pysparkJob": {
    object (PySparkJob)
  },
  "hiveJob": {
    object (HiveJob)
  },
  "pigJob": {
    object (PigJob)
  },
  "sparkRJob": {
    object (SparkRJob)
  },
  "sparkSqlJob": {
    object (SparkSqlJob)
  },
  "prestoJob": {
    object (PrestoJob)
  },
  "trinoJob": {
    object (TrinoJob)
  },
  "flinkJob": {
    object (FlinkJob)
  }
  // End of list of possible types for union field type_job.
}

Fields
`reference`	`object (JobReference)` Optional. The fully qualified reference to the job, which can be used to obtain the equivalent REST path of the job resource. If this property is not specified when a job is created, the server generates a `job_id`.
`placement`	`object (JobPlacement)` Required. Job information, including how, when, and where to run the job.
`status`	`object (JobStatus)` Output only. The job status. Additional application-specific status information might be contained in the `type_job` and `yarn_applications` fields.
`statusHistory[]`	`object (JobStatus)` Output only. The previous job status.
`yarnApplications[]`	`object (YarnApplication)` Output only. The collection of YARN applications spun up by this job. Beta Feature: This report is available for testing purposes only. It might be changed before final release.
`driverOutputResourceUri`	`string` Output only. A URI pointing to the location of the stdout of the job's driver program.
`driverControlFilesUri`	`string` Output only. If present, the location of miscellaneous control files which can be used as part of job setup and handling. If not present, control files might be placed in the same location as `driver_output_uri`.
`labels`	`map (key: string, value: string)` Optional. The labels to associate with this job. Label keys must contain 1 to 63 characters, and must conform to RFC 1035. Label values can be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035. No more than 32 labels can be associated with a job. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`scheduling`	`object (JobScheduling)` Optional. Job scheduling configuration.
`jobUuid`	`string` Output only. A UUID that uniquely identifies a job within the project over time. This is in contrast to a user-settable reference.job_id that might be reused over time.
`done`	`boolean` Output only. Indicates whether the job is completed. If the value is `false`, the job is still in progress. If `true`, the job is completed, and the `status.state` field indicates whether it succeeded, failed, or was cancelled.
`driverSchedulingConfig`	`object (DriverSchedulingConfig)` Optional. Driver scheduling configuration.
Union field `type_job`. Required. The application/framework-specific portion of the job. `type_job` can be only one of the following:
`hadoopJob`	`object (HadoopJob)` Optional. Job is a Hadoop job.
`sparkJob`	`object (SparkJob)` Optional. Job is a Spark job.
`pysparkJob`	`object (PySparkJob)` Optional. Job is a PySpark job.
`hiveJob`	`object (HiveJob)` Optional. Job is a Hive job.
`pigJob`	`object (PigJob)` Optional. Job is a Pig job.
`sparkRJob`	`object (SparkRJob)` Optional. Job is a SparkR job.
`sparkSqlJob`	`object (SparkSqlJob)` Optional. Job is a SparkSql job.
`prestoJob`	`object (PrestoJob)` Optional. Job is a Presto job.
`trinoJob`	`object (TrinoJob)` Optional. Job is a Trino job.
`flinkJob`	`object (FlinkJob)` Optional. Job is a Flink job.
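
Because `type_job` is a union, a returned `Job` carries exactly one of the ten job-type fields. The sketch below, in Python, shows one way a client might detect which field is set and combine it with `done` and `status.state`. The helper names and the sample job are hypothetical, the `DONE` state value is used only for illustration, and treating a missing `done` field as `false` is an assumption.

```python
# Field names of the type_job union, as listed in the Job schema above.
TYPE_JOB_FIELDS = (
    "hadoopJob", "sparkJob", "pysparkJob", "hiveJob", "pigJob",
    "sparkRJob", "sparkSqlJob", "prestoJob", "trinoJob", "flinkJob",
)


def job_type(job: dict) -> str:
    """Return the name of the single type_job field present in a Job object."""
    present = [field for field in TYPE_JOB_FIELDS if field in job]
    if len(present) != 1:
        raise ValueError(f"expected exactly one type_job field, found {present}")
    return present[0]


def summarize(job: dict) -> str:
    """Describe a Job using its type, its done flag, and status.state."""
    state = job.get("status", {}).get("state", "UNKNOWN")
    # Assumption: an absent "done" field is treated as false.
    progress = "completed" if job.get("done", False) else "in progress"
    return f"{job_type(job)}: {progress} (state={state})"


# Hypothetical job object, for illustration only.
sample_job = {
    "reference": {"projectId": "my-project", "jobId": "job-1234"},
    "placement": {"clusterName": "my-cluster"},
    "status": {"state": "DONE"},
    "done": True,
    "sparkJob": {"mainClass": "org.example.Main"},
}
print(summarize(sample_job))
```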

JobReference

JSON representation
{ "projectId": string, "jobId": string }

Fields
`projectId`	`string` Optional. The ID of the Google Cloud Platform project that the job belongs to. If specified, must match the request project ID.
`jobId`	`string` Optional. The job ID, which must be unique within the project. The ID must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), or hyphens (-). The maximum length is 100 characters. If not specified by the caller, the job ID will be provided by the server.
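
The `jobId` character and length rules can be checked on the client before a request is sent. This is a minimal sketch in Python of the constraints stated above; the function name is hypothetical, and rejecting the empty string is an assumption, since the description gives only a maximum length.

```python
import re

# Letters, digits, underscores, or hyphens; at most 100 characters.
# Assumption: at least one character is required.
JOB_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_job_id(job_id: str) -> bool:
    """Return True if job_id satisfies the documented character and length rules."""
    return JOB_ID_PATTERN.fullmatch(job_id) is not None


print(is_valid_job_id("nightly_etl-2024"))  # allowed characters only
print(is_valid_job_id("nightly etl"))       # contains a space
```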

JobPlacement

JSON representation
{ "clusterName": string, "clusterUuid": string, "clusterLabels": { string: string, ... } }

Fields
`clusterName`	`string` Required. The name of the cluster where the job will be submitted.
`clusterUuid`	`string` Output only. A cluster UUID generated by the Dataproc service when the job is submitted.
`clusterLabels`	`map (key: string, value: string)` Optional. Cluster labels to identify a cluster where the job will be submitted. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.

ClusterLabelsEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

HadoopJob

JSON representation

{
  "args": [
    string
  ],
  "jarFileUris": [
    string
  ],
  "fileUris": [
    string
  ],
  "archiveUris": [
    string
  ],
  "properties": {
    string: string,
    ...
  },
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field driver can be only one of the following:
  "mainJarFileUri": string,
  "mainClass": string
  // End of list of possible types for union field driver.
}

Fields
`args[]`	`string` Optional. The arguments to pass to the driver. Do not include arguments, such as `-libjars` or `-Dfoo=bar`, that can be set as job properties, since a collision might occur that causes an incorrect job submission.
`jarFileUris[]`	`string` Optional. Jar file URIs to add to the CLASSPATHs of the Hadoop driver and tasks.
`fileUris[]`	`string` Optional. HCFS (Hadoop Compatible Filesystem) URIs of files to be copied to the working directory of Hadoop drivers and distributed tasks. Useful for naively parallel tasks.
`archiveUris[]`	`string` Optional. HCFS URIs of archives to be extracted in the working directory of Hadoop drivers and tasks. Supported file types: .jar, .tar, .tar.gz, .tgz, or .zip.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure Hadoop. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in `/etc/hadoop/conf/*-site` and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `driver`. Required. Indicates the location of the driver's main class. Specify either the jar file that contains the main class or the main class name. To specify both, add the jar file to `jar_file_uris`, and then specify the main class name in this property. `driver` can be only one of the following:
`mainJarFileUri`	`string` The HCFS URI of the jar file containing the main class. Examples: 'gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar' 'hdfs:/tmp/test-samples/custom-wordcount.jar' 'file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar'
`mainClass`	`string` The name of the driver's main class. The jar file containing the class must be in the default CLASSPATH or specified in `jar_file_uris`.
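
The warning on `args[]` about `-libjars` and `-Dfoo=bar` can be turned into a simple pre-submission check. The sketch below, in Python, flags arguments that look like those two examples. The function name and the matching heuristic are assumptions; the field description names only these two examples and does not define a complete list.

```python
def conflicting_hadoop_args(args: list) -> list:
    """Return driver arguments that resemble settings better passed via `properties`."""
    flagged = []
    for arg in args:
        # Heuristic (assumption): -libjars and -D<name>=<value> style flags.
        if arg == "-libjars" or arg.startswith("-D"):
            flagged.append(arg)
    return flagged


# Hypothetical argument list, for illustration only.
print(conflicting_hadoop_args(["-Dfoo=bar", "input", "-libjars", "extra.jar", "output"]))
```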

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

LoggingConfig

JSON representation
{ "driverLogLevels": { string: enum (`Level`), ... } }

Fields

Fields
`driverLogLevels`	`map (key: string, value: enum (Level))` The per-package log levels for the driver. This can include "root" package name to configure rootLogger. Examples: - 'com.google = FATAL' - 'root = INFO' - 'org.apache = DEBUG' An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
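
The `'package = LEVEL'` examples above map directly onto the `driverLogLevels` object. As a minimal sketch in Python, the hypothetical helper below converts such strings into a `LoggingConfig` value. It does not validate level names, because the `Level` enum values are not listed in this section.

```python
def parse_driver_log_levels(entries: list) -> dict:
    """Convert 'package = LEVEL' strings into a LoggingConfig object."""
    levels = {}
    for entry in entries:
        package, separator, level = entry.partition("=")
        if not separator or not package.strip() or not level.strip():
            raise ValueError(f"expected 'package = LEVEL', got {entry!r}")
        levels[package.strip()] = level.strip().upper()
    return {"driverLogLevels": levels}


# The three examples from the field description.
print(parse_driver_log_levels(["com.google = FATAL", "root = INFO", "org.apache = DEBUG"]))
```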

DriverLogLevelsEntry

JSON representation
{ "key": string, "value": enum (`Level`) }

Fields
`key`	`string`
`value`	`enum (Level)`

SparkJob

JSON representation

{
  "args": [
    string
  ],
  "jarFileUris": [
    string
  ],
  "fileUris": [
    string
  ],
  "archiveUris": [
    string
  ],
  "properties": {
    string: string,
    ...
  },
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field driver can be only one of the following:
  "mainJarFileUri": string,
  "mainClass": string
  // End of list of possible types for union field driver.
}

Fields
`args[]`	`string` Optional. The arguments to pass to the driver. Do not include arguments, such as `--conf`, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
`jarFileUris[]`	`string` Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Spark driver and tasks.
`fileUris[]`	`string` Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
`archiveUris[]`	`string` Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure Spark. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `driver`. Required. The specification of the main method to call to drive the job. Specify either the jar file that contains the main class or the main class name. To pass both a main jar and a main class in that jar, add the jar to `jarFileUris`, and then specify the main class name in `mainClass`. `driver` can be only one of the following:
`mainJarFileUri`	`string` The HCFS URI of the jar file that contains the main class.
`mainClass`	`string` The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in SparkJob.jar_file_uris.
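
The `driver` union rule can be illustrated with a small builder. The following Python sketch applies the rule described above: when both a main jar and a main class are supplied, the jar goes into `jarFileUris` and only `mainClass` is set. The function name and the sample URI and class name are hypothetical.

```python
from typing import Optional, Sequence


def spark_driver_fields(
    main_jar: Optional[str] = None,
    main_class: Optional[str] = None,
    jar_file_uris: Sequence[str] = (),
) -> dict:
    """Return the SparkJob fields that describe the driver, honoring the union."""
    jars = list(jar_file_uris)
    fields = {}
    if main_jar and main_class:
        # Both given: the jar moves to jarFileUris and mainClass selects the entry point.
        if main_jar not in jars:
            jars.append(main_jar)
        fields["mainClass"] = main_class
    elif main_jar:
        fields["mainJarFileUri"] = main_jar
    elif main_class:
        fields["mainClass"] = main_class
    else:
        raise ValueError("specify a main jar, a main class, or both")
    if jars:
        fields["jarFileUris"] = jars
    return fields


# Hypothetical values, for illustration only.
print(spark_driver_fields(main_jar="gs://my-bucket/app.jar", main_class="org.example.Main"))
```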

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

PySparkJob

JSON representation

{
  "mainPythonFileUri": string,
  "args": [
    string
  ],
  "pythonFileUris": [
    string
  ],
  "jarFileUris": [
    string
  ],
  "fileUris": [
    string
  ],
  "archiveUris": [
    string
  ],
  "properties": {
    string: string,
    ...
  },
  "loggingConfig": {
    object (LoggingConfig)
  }
}

Fields
`mainPythonFileUri`	`string` Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.
`args[]`	`string` Optional. The arguments to pass to the driver. Do not include arguments, such as `--conf`, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
`pythonFileUris[]`	`string` Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
`jarFileUris[]`	`string` Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.
`fileUris[]`	`string` Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
`archiveUris[]`	`string` Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip. Note: Spark applications must be deployed in cluster mode for correct environment propagation.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
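
The file-type rules for `mainPythonFileUri` and `pythonFileUris[]` lend themselves to a quick client-side check. This Python sketch reports URIs whose suffix is not among the documented types. The function name and sample URIs are hypothetical, and the case-sensitive suffix comparison is an assumption.

```python
# Supported types for pythonFileUris, as listed in the field description.
PYTHON_FILE_SUFFIXES = (".py", ".egg", ".zip")


def pyspark_uri_problems(job: dict) -> list:
    """Return human-readable problems with the Python file URIs of a PySparkJob."""
    problems = []
    main = job.get("mainPythonFileUri", "")
    if not main:
        problems.append("mainPythonFileUri is required")
    elif not main.endswith(".py"):
        problems.append(f"mainPythonFileUri must be a .py file: {main}")
    for uri in job.get("pythonFileUris", []):
        if not uri.endswith(PYTHON_FILE_SUFFIXES):
            problems.append(f"unsupported pythonFileUris type: {uri}")
    return problems


# Hypothetical job fragment, for illustration only.
print(pyspark_uri_problems({
    "mainPythonFileUri": "gs://my-bucket/main.py",
    "pythonFileUris": ["gs://my-bucket/deps.zip", "gs://my-bucket/helpers.txt"],
}))
```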

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

HiveJob

JSON representation

{
  "continueOnFailure": boolean,
  "scriptVariables": {
    string: string,
    ...
  },
  "properties": {
    string: string,
    ...
  },
  "jarFileUris": [
    string
  ],

  // Union field queries can be only one of the following:
  "queryFileUri": string,
  "queryList": {
    object (QueryList)
  }
  // End of list of possible types for union field queries.
}

Fields
`continueOnFailure`	`boolean` Optional. Whether to continue executing queries if a query fails. The default value is `false`. Setting to `true` can be useful when executing independent parallel queries.
`scriptVariables`	`map (key: string, value: string)` Optional. Mapping of query variable names to values (equivalent to the Hive command: `SET name="value";`). An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names and values, used to configure Hive. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in `/etc/hadoop/conf/*-site.xml`, /etc/hive/conf/hive-site.xml, and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`jarFileUris[]`	`string` Optional. HCFS URIs of jar files to add to the CLASSPATH of the Hive server and Hadoop MapReduce (MR) tasks. Can contain Hive SerDes and UDFs.
Union field `queries`. Required. The sequence of Hive queries to execute, specified as either an HCFS file URI or a list of queries. `queries` can be only one of the following:
`queryFileUri`	`string` The HCFS URI of the script that contains Hive queries.
`queryList`	`object (QueryList)` A list of queries.
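
To make the `scriptVariables` equivalence concrete, the Python sketch below renders a variables map as the Hive `SET name="value";` commands it corresponds to. The function name and sample variables are hypothetical, and the sketch does not escape quotes inside values.

```python
def hive_set_statements(script_variables: dict) -> list:
    """Render scriptVariables as the equivalent Hive SET commands."""
    # Assumption: values contain no double quotes that would need escaping.
    return [f'SET {name}="{value}";' for name, value in script_variables.items()]


# Hypothetical variables, for illustration only.
for statement in hive_set_statements({"env": "prod", "day": "2024-01-01"}):
    print(statement)
```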

QueryList

JSON representation
{ "queries": [ string ] }

Fields
`queries[]`	`string` Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob:

"hiveJob": {
  "queryList": {
    "queries": [
      "query1",
      "query2",
      "query3;query4"
    ]
  }
}
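
Since one `queries[]` string can hold several semicolon-separated queries, the number of list entries is not necessarily the number of statements. The following Python sketch flattens a `QueryList` into individual statements. The function name is hypothetical, and the naive split on `;` is an assumption that does not account for semicolons inside string literals or comments.

```python
def split_queries(query_list: dict) -> list:
    """Flatten a QueryList into individual query strings."""
    statements = []
    for entry in query_list.get("queries", []):
        # Naive split (assumption): semicolons always separate queries.
        for part in entry.split(";"):
            part = part.strip()
            if part:
                statements.append(part)
    return statements


# The queries from the HiveJob example above.
print(split_queries({"queries": ["query1", "query2", "query3;query4"]}))
```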

ScriptVariablesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

PigJob

JSON representation

{
  "continueOnFailure": boolean,
  "scriptVariables": {
    string: string,
    ...
  },
  "properties": {
    string: string,
    ...
  },
  "jarFileUris": [
    string
  ],
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field queries can be only one of the following:
  "queryFileUri": string,
  "queryList": {
    object (QueryList)
  }
  // End of list of possible types for union field queries.
}

Fields
`continueOnFailure`	`boolean` Optional. Whether to continue executing queries if a query fails. The default value is `false`. Setting to `true` can be useful when executing independent parallel queries.
`scriptVariables`	`map (key: string, value: string)` Optional. Mapping of query variable names to values (equivalent to the Pig command: `name=[value]`). An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure Pig. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in `/etc/hadoop/conf/*-site.xml`, /etc/pig/conf/pig.properties, and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`jarFileUris[]`	`string` Optional. HCFS URIs of jar files to add to the CLASSPATH of the Pig Client and Hadoop MapReduce (MR) tasks. Can contain Pig UDFs.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `queries`. Required. The sequence of Pig queries to execute, specified as an HCFS file URI or a list of queries. `queries` can be only one of the following:
`queryFileUri`	`string` The HCFS URI of the script that contains the Pig queries.
`queryList`	`object (QueryList)` A list of queries.

ScriptVariablesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

SparkRJob

JSON representation
{ "mainRFileUri": string, "args": [ string ], "fileUris": [ string ], "archiveUris": [ string ], "properties": { string: string, ... }, "loggingConfig": { object (`LoggingConfig`) } }

Fields
`mainRFileUri`	`string` Required. The HCFS URI of the main R file to use as the driver. Must be a .R file.
`args[]`	`string` Optional. The arguments to pass to the driver. Do not include arguments, such as `--conf`, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
`fileUris[]`	`string` Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.
`archiveUris[]`	`string` Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure SparkR. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

SparkSqlJob

JSON representation

{
  "scriptVariables": {
    string: string,
    ...
  },
  "properties": {
    string: string,
    ...
  },
  "jarFileUris": [
    string
  ],
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field queries can be only one of the following:
  "queryFileUri": string,
  "queryList": {
    object (QueryList)
  }
  // End of list of possible types for union field queries.
}

Fields
`scriptVariables`	`map (key: string, value: string)` Optional. Mapping of query variable names to values (equivalent to the Spark SQL command: `SET name="value";`). An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure Spark SQL's SparkConf. Properties that conflict with values set by the Dataproc API might be overwritten. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`jarFileUris[]`	`string` Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `queries`. Required. The sequence of Spark SQL queries to execute, specified as either an HCFS file URI or as a list of queries. `queries` can be only one of the following:
`queryFileUri`	`string` The HCFS URI of the script that contains SQL queries.
`queryList`	`object (QueryList)` A list of queries.

ScriptVariablesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

PrestoJob

JSON representation

{
  "continueOnFailure": boolean,
  "outputFormat": string,
  "clientTags": [
    string
  ],
  "properties": {
    string: string,
    ...
  },
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field queries can be only one of the following:
  "queryFileUri": string,
  "queryList": {
    object (QueryList)
  }
  // End of list of possible types for union field queries.
}

Fields
`continueOnFailure`	`boolean` Optional. Whether to continue executing queries if a query fails. The default value is `false`. Setting to `true` can be useful when executing independent parallel queries.
`outputFormat`	`string` Optional. The format in which query output will be displayed. See the Presto documentation for supported output formats.
`clientTags[]`	`string` Optional. Presto client tags to attach to this query.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to set Presto session properties. Equivalent to using the `--session` flag in the Presto CLI. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `queries`. Required. The sequence of Presto queries to execute, specified as either an HCFS file URI or as a list of queries. `queries` can be only one of the following:
`queryFileUri`	`string` The HCFS URI of the script that contains SQL queries.
`queryList`	`object (QueryList)` A list of queries.
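
The `properties` map is described as equivalent to the Presto CLI `--session` flag. As a rough illustration in Python, the hypothetical helper below renders a properties map as the corresponding CLI arguments. The property name in the example is illustrative only, and the `name=value` argument form is an assumption about how the flag is written.

```python
def presto_session_flags(properties: dict) -> list:
    """Render PrestoJob properties as equivalent --session CLI arguments."""
    flags = []
    for name, value in properties.items():
        # Assumption: each session property is passed as --session name=value.
        flags.extend(["--session", f"{name}={value}"])
    return flags


# Hypothetical session property, for illustration only.
print(presto_session_flags({"query_max_run_time": "1h"}))
```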

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

TrinoJob

JSON representation

{
  "continueOnFailure": boolean,
  "outputFormat": string,
  "clientTags": [
    string
  ],
  "properties": {
    string: string,
    ...
  },
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field queries can be only one of the following:
  "queryFileUri": string,
  "queryList": {
    object (QueryList)
  }
  // End of list of possible types for union field queries.
}

Fields
`continueOnFailure`	`boolean` Optional. Whether to continue executing queries if a query fails. The default value is `false`. Setting to `true` can be useful when executing independent parallel queries.
`outputFormat`	`string` Optional. The format in which query output will be displayed. See the Trino documentation for supported output formats.
`clientTags[]`	`string` Optional. Trino client tags to attach to this query.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to set Trino session properties. Equivalent to using the `--session` flag in the Trino CLI. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `queries`. Required. The sequence of Trino queries to execute, specified as either an HCFS file URI or as a list of queries. `queries` can be only one of the following:
`queryFileUri`	`string` The HCFS URI of the script that contains SQL queries.
`queryList`	`object (QueryList)` A list of queries.

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

FlinkJob

JSON representation

{
  "args": [
    string
  ],
  "jarFileUris": [
    string
  ],
  "savepointUri": string,
  "properties": {
    string: string,
    ...
  },
  "loggingConfig": {
    object (LoggingConfig)
  },

  // Union field driver can be only one of the following:
  "mainJarFileUri": string,
  "mainClass": string
  // End of list of possible types for union field driver.
}

Fields
`args[]`	`string` Optional. The arguments to pass to the driver. Do not include arguments, such as `--conf`, that can be set as job properties, since a collision might occur that causes an incorrect job submission.
`jarFileUris[]`	`string` Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Flink driver and tasks.
`savepointUri`	`string` Optional. HCFS URI of the savepoint, which contains the last saved progress for starting the current job.
`properties`	`map (key: string, value: string)` Optional. A mapping of property names to values, used to configure Flink. Properties that conflict with values set by the Dataproc API might be overwritten. Can include properties set in `/etc/flink/conf/flink-defaults.conf` and classes in user code. An object containing a list of `"key": value` pairs. Example: `{ "name": "wrench", "mass": "1.3kg", "count": "3" }`.
`loggingConfig`	`object (LoggingConfig)` Optional. The runtime log config for job execution.
Union field `driver`. Required. The specification of the main method to call to drive the job. Specify either the jar file that contains the main class or the main class name. To pass both a main jar and a main class in the jar, add the jar to `jarFileUris`, and then specify the main class name in `mainClass`. `driver` can be only one of the following:
`mainJarFileUri`	`string` The HCFS URI of the jar file that contains the main class.
`mainClass`	`string` The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in `jarFileUris`.
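
Although `get_job` only returns this structure, the `driver` rule is easier to see in a sketch that builds one. The helper below is hypothetical (its name and parameters are not part of the API). It assumes the rule stated above: when both a jar and a main class are supplied, the jar goes into `jarFileUris` and only `mainClass` is set in the union.

```python
from typing import Any, Optional


def build_flink_job(
    main_jar_file_uri: Optional[str] = None,
    main_class: Optional[str] = None,
    extra_jars: tuple[str, ...] = (),
) -> dict[str, Any]:
    """Build a FlinkJob-shaped dict that sets exactly one `driver` member."""
    if main_jar_file_uri is None and main_class is None:
        raise ValueError("one of main_jar_file_uri or main_class is required")
    job: dict[str, Any] = {}
    jars = list(extra_jars)
    if main_class is not None:
        # With both a jar and a class, the jar moves to jarFileUris and
        # only mainClass is set in the union.
        if main_jar_file_uri is not None:
            jars.insert(0, main_jar_file_uri)
        job["mainClass"] = main_class
    else:
        job["mainJarFileUri"] = main_jar_file_uri
    if jars:
        job["jarFileUris"] = jars
    return job


# Hypothetical jar path and class name, for illustration only.
print(build_flink_job("gs://example-bucket/app.jar", "com.example.Main"))
```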

PropertiesEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

JobStatus

JSON representation
{ "state": enum (`State`), "details": string, "stateStartTime": string, "substate": enum (`Substate`) }

Fields
`state`	`enum (State)` Output only. A state message specifying the overall job state.
`details`	`string` Optional. Output only. Job state details, such as an error description if the state is `ERROR`.
`stateStartTime`	`string (Timestamp format)` Output only. The time when this state was entered. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: `"2014-10-02T15:01:23Z"`, `"2014-10-02T15:01:23.045123456Z"` or `"2014-10-02T15:01:23+05:30"`.
`substate`	`enum (Substate)` Output only. Additional state information, which includes status reported by the agent.
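
A client that reads `stateStartTime` has to handle all three forms shown in the examples: no fraction, up to nine fractional digits, and a numeric offset. The following is one possible sketch. The function name is hypothetical, and it uses an explicit regular expression so that the accepted forms are visible. Python datetimes hold microseconds, so this sketch truncates digits past the sixth fractional place.

```python
import re
from datetime import datetime, timedelta, timezone

_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_state_start_time(value: str) -> datetime:
    """Parse an RFC 3339 timestamp into a timezone-aware UTC datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction, offset = match.group(7), match.group(8)
    # Pad the fraction to nanoseconds, then keep the microsecond part.
    micros = int((fraction or "").ljust(9, "0")[:6])
    if offset == "Z":
        tz = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
        tz = timezone(sign * delta)
    local = datetime(year, month, day, hour, minute, second, micros, tzinfo=tz)
    return local.astimezone(timezone.utc)


for text in (
    "2014-10-02T15:01:23Z",
    "2014-10-02T15:01:23.045123456Z",
    "2014-10-02T15:01:23+05:30",
):
    print(parse_state_start_time(text).isoformat())
```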

Timestamp

JSON representation
{ "seconds": string, "nanos": integer }

Fields

seconds

string (int64 format)

Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z. Must be between -62135596800 and 253402300799 inclusive (which corresponds to 0001-01-01T00:00:00Z to 9999-12-31T23:59:59Z).

nanos

integer

Non-negative fractions of a second at nanosecond resolution. This field is the nanosecond portion of the timestamp, not an alternative to seconds. Negative second values with fractions must still have non-negative nanos values that count forward in time. Must be between 0 and 999,999,999 inclusive.
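
The `seconds` and `nanos` rules above can be illustrated with a short conversion. This is a sketch under stated assumptions, not an official client: the function name is hypothetical, `seconds` is assumed to arrive as a string (int64 format), and the result is truncated to microseconds because that is the resolution of a Python datetime.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
_MIN_SECONDS = -62135596800  # 0001-01-01T00:00:00Z
_MAX_SECONDS = 253402300799  # 9999-12-31T23:59:59Z


def timestamp_to_datetime(ts: dict) -> datetime:
    """Convert a {"seconds": str, "nanos": int} object to a UTC datetime."""
    seconds = int(ts.get("seconds", "0"))
    nanos = int(ts.get("nanos", 0))
    if not _MIN_SECONDS <= seconds <= _MAX_SECONDS:
        raise ValueError(f"seconds out of range: {seconds}")
    if not 0 <= nanos <= 999_999_999:
        raise ValueError(f"nanos out of range: {nanos}")
    # nanos always counts forward from `seconds`, even when seconds is negative.
    return _EPOCH + timedelta(seconds=seconds, microseconds=nanos // 1000)


# Half a second before the epoch is seconds=-1 plus nanos=500000000.
print(timestamp_to_datetime({"seconds": "-1", "nanos": 500_000_000}).isoformat())
```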

YarnApplication

JSON representation
{ "name": string, "state": enum (`State`), "progress": number, "trackingUrl": string, "vcoreSeconds": string, "memoryMbSeconds": string }

Fields
`name`	`string` Required. The application name.
`state`	`enum (State)` Required. The application state.
`progress`	`number` Required. The numerical progress of the application, from 1 to 100.
`trackingUrl`	`string` Optional. The HTTP URL of the ApplicationMaster, HistoryServer, or TimelineServer that provides application-specific information. The URL uses the internal hostname, and requires a proxy server for resolution and, possibly, access.
`vcoreSeconds`	`string (int64 format)` Optional. The cumulative CPU time consumed by the application for a job, measured in vcore-seconds.
`memoryMbSeconds`	`string (int64 format)` Optional. The cumulative memory usage of the application for a job, measured in mb-seconds.
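
Because `vcoreSeconds` and `memoryMbSeconds` are int64 values encoded as JSON strings, they need an explicit conversion before arithmetic. The sketch below sums them across a job's `yarnApplications`. The function name and the sample values are hypothetical, and missing optional fields are assumed to count as zero.

```python
from typing import Any


def total_yarn_usage(job: dict[str, Any]) -> dict[str, int]:
    """Sum the int64-as-string usage counters over all YARN applications."""
    vcore_seconds = 0
    memory_mb_seconds = 0
    for app in job.get("yarnApplications", []):
        # Both fields are optional; treat an absent field as zero.
        vcore_seconds += int(app.get("vcoreSeconds", "0"))
        memory_mb_seconds += int(app.get("memoryMbSeconds", "0"))
    return {"vcoreSeconds": vcore_seconds, "memoryMbSeconds": memory_mb_seconds}


# Made-up response fragment for illustration.
job = {
    "yarnApplications": [
        {"name": "app-1", "vcoreSeconds": "120", "memoryMbSeconds": "491520"},
        {"name": "app-2", "vcoreSeconds": "30"},
    ]
}
print(total_yarn_usage(job))
```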

LabelsEntry

JSON representation
{ "key": string, "value": string }

Fields
`key`	`string`
`value`	`string`

JobScheduling

JSON representation
{ "maxFailuresPerHour": integer, "maxFailuresTotal": integer }

Fields

maxFailuresPerHour

integer

Optional. Maximum number of times per hour a driver can be restarted as a result of the driver exiting with a non-zero code before the job is reported as failed.

A job might be reported as thrashing if the driver exits with a non-zero code four times within a 10-minute window.

Maximum value is 10.

Note: This restartable job option is not supported in Dataproc workflow templates.

maxFailuresTotal

integer

Optional. Maximum total number of times a driver can be restarted as a result of the driver exiting with a non-zero code. After the maximum number is reached, the job will be reported as failed.

Maximum value is 240.

Note: Currently, this restartable job option is not supported in Dataproc workflow templates.
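
The two documented maximums (10 per hour and 240 in total) can be checked on the client side. The following is a minimal sketch with a hypothetical function name. It checks only the upper bounds stated above and makes no assumption about lower bounds or defaults.

```python
from typing import Any

# Documented maximum values for each JobScheduling field.
_LIMITS = {"maxFailuresPerHour": 10, "maxFailuresTotal": 240}


def scheduling_problems(scheduling: dict[str, Any]) -> list[str]:
    """Return a message for every field that exceeds its documented maximum."""
    problems = []
    for field, limit in _LIMITS.items():
        value = scheduling.get(field)
        if value is not None and value > limit:
            problems.append(f"{field}={value} exceeds the maximum of {limit}")
    return problems


print(scheduling_problems({"maxFailuresPerHour": 11, "maxFailuresTotal": 20}))
```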

DriverSchedulingConfig

JSON representation
{ "memoryMb": integer, "vcores": integer }

Fields

memoryMb

integer

Required. The amount of memory in MB the driver is requesting.

vcores

integer

Required. The number of vCPUs the driver is requesting.

Tool Annotations

Destructive Hint: ❌ | Idempotent Hint: ❌ | Read Only Hint: ✅ | Open World Hint: ❌
