REST Resource: projects.locations.batchPredictionJobs

Resource: BatchPredictionJob

A job that uses a Model to produce predictions on multiple input instances. If predictions for a significant portion of the instances fail, the job may finish without attempting predictions for all remaining instances.

Fields
name string

Output only. Resource name of the BatchPredictionJob.

displayName string

Required. The user-defined name of this BatchPredictionJob.

model string

The name of the Model resource that produces the predictions via this job. The Model must share the same ancestor Location as the job. Starting this job has no impact on any existing deployments of the Model and their resources. Exactly one of model, unmanagedContainerModel, or endpoint must be set.

The model resource name may contain a version ID or version alias to specify the version. Example: projects/{project}/locations/{location}/models/{model}@2 or projects/{project}/locations/{location}/models/{model}@golden. If no version is specified, the default version will be deployed.

The model resource could also be a publisher model. Example: publishers/{publisher}/models/{model} or projects/{project}/locations/{location}/publishers/{publisher}/models/{model}
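
As an illustration of the versioned name formats above, the following sketch splits a project-scoped Model resource name into its parts. The helper name parse_model_name and the regular expression are assumptions inferred from the examples in this section, not an official grammar, and publisher model names are not handled.

```python
import re

# Hypothetical pattern inferred from the examples above; not an official grammar.
_MODEL_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/models/(?P<model>[^/@]+)(?:@(?P<version>[^/@]+))?$"
)


def parse_model_name(name: str) -> dict:
    """Split a Model resource name into parts; 'version' is None when absent."""
    match = _MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a project-scoped Model resource name: {name!r}")
    return match.groupdict()


# Placeholder project, location, and model IDs.
parts = parse_model_name("projects/my-project/locations/us-central1/models/123@golden")
print(parts["model"], parts["version"])
```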

modelVersionId string

Output only. The version id of the Model that produces the predictions via this job.

unmanagedContainerModel object (UnmanagedContainerModel)

Contains the model information necessary to perform batch prediction without uploading the Model to Model Registry. Exactly one of model, unmanagedContainerModel, or endpoint must be set.

inputConfig object (InputConfig)

Required. Input configuration of the instances on which predictions are performed. The schema of any single instance may be specified via the Model's PredictSchemata's instanceSchemaUri.

instanceConfig object (InstanceConfig)

Configuration for how to convert batch prediction input instances to the prediction instances that are sent to the Model.

modelParameters value (Value format)

The parameters that govern the predictions. The schema of the parameters may be specified via the Model's PredictSchemata's parametersSchemaUri.

outputConfig object (OutputConfig)

Required. The configuration specifying where output predictions should be written. The schema of any single prediction may be specified as a concatenation of Model's PredictSchemata's instanceSchemaUri and predictionSchemaUri.

dedicatedResources object (BatchDedicatedResources)

The config of resources used by the Model during the batch prediction. If the Model supports DEDICATED_RESOURCES, this config may be provided (and the job will use these resources). If the Model doesn't support AUTOMATIC_RESOURCES, this config must be provided.

serviceAccount string

The service account that the DeployedModel's container runs as. If not specified, a system-generated service account is used. That account has minimal permissions, so a custom container, if used, may not have enough permission to access other Google Cloud resources.

Users deploying the Model must have the iam.serviceAccounts.actAs permission on this service account.

manualBatchTuningParameters object (ManualBatchTuningParameters)

Immutable. Parameters configuring the batch behavior. Currently only applicable when dedicatedResources are used (in other cases Vertex AI does the tuning itself).

generateExplanation boolean

Generate explanation with the batch prediction results.

When set to true, the batch prediction output changes based on the predictionsFormat field of the BatchPredictionJob.output_config object:

  • bigquery: output includes a column named explanation. The value is a struct that conforms to the Explanation object.
  • jsonl: The JSON objects on each line include an additional entry keyed explanation. The value of the entry is a JSON object that conforms to the Explanation object.
  • csv: Generating explanations for CSV format is not supported.

If this field is set to true, either the Model.explanation_spec or explanationSpec must be populated.

explanationSpec object (ExplanationSpec)

Explanation configuration for this BatchPredictionJob. Can be specified only if generateExplanation is set to true.

This value overrides the value of Model.explanation_spec. All fields of explanationSpec are optional in the request. If a field of the explanationSpec object is not populated, the corresponding field of the Model.explanation_spec object is inherited.
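
The inheritance rule above can be pictured as a field-by-field merge. The sketch below is only an approximation: it merges top-level fields of two dictionaries, and whether inheritance also applies inside nested fields is not stated in this section. The function name and the placeholder values are hypothetical.

```python
def effective_explanation_spec(model_spec: dict, job_spec: dict) -> dict:
    """Return the job's explanationSpec with unpopulated fields inherited.

    Assumption: only top-level fields are merged; nested inheritance is
    not modeled here.
    """
    merged = dict(model_spec)
    merged.update({key: value for key, value in job_spec.items() if value is not None})
    return merged


# Placeholder values stand in for real ExplanationSpec sub-objects.
model_level = {"parameters": "model-parameters", "metadata": "model-metadata"}
job_level = {"parameters": "job-parameters"}
print(effective_explanation_spec(model_level, job_level))
```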

outputInfo object (OutputInfo)

Output only. Information further describing the output of this job.

state enum (JobState)

Output only. The detailed state of the job.

error object (Status)

Output only. Only populated when the job's state is JOB_STATE_FAILED or JOB_STATE_CANCELLED.

partialFailures[] object (Status)

Output only. Partial failures encountered. For example, single files that can't be read. This field never exceeds 20 entries. The Status details fields contain standard Google Cloud error details.

resourcesConsumed object (ResourcesConsumed)

Output only. Information about resources that have been consumed by this job. Provided in real time on a best-effort basis, as well as a final value once the job completes.

Note: This field currently may not be populated for batch predictions that use AutoML Models.

completionStats object (CompletionStats)

Output only. Statistics on completed and failed prediction instances.

createTime string (Timestamp format)

Output only. Time when the BatchPredictionJob was created.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

startTime string (Timestamp format)

Output only. Time when the BatchPredictionJob first entered the JOB_STATE_RUNNING state.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

endTime string (Timestamp format)

Output only. Time when the BatchPredictionJob entered any of the following states: JOB_STATE_SUCCEEDED, JOB_STATE_FAILED, JOB_STATE_CANCELLED.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

updateTime string (Timestamp format)

Output only. Time when the BatchPredictionJob was most recently updated.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
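
The timestamp fields above can carry up to nine fractional digits, which is more precision than Python's datetime stores. The following sketch parses the example strings by hand and truncates to microseconds. The function name parse_timestamp is hypothetical, and the truncation is a choice made for this example.

```python
import re
from datetime import datetime, timedelta, timezone

_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(text: str) -> datetime:
    """Parse an RFC 3339 timestamp and return it normalized to UTC."""
    match = _RFC3339.match(text)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    # Pad the fraction to nanoseconds, then keep only the microsecond part.
    fraction = (match.group(7) or "").ljust(9, "0")
    microsecond = int(fraction[:6])
    offset = match.group(8)
    if offset == "Z":
        tz = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        tz = timezone(sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6])))
    local = datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tz)
    return local.astimezone(timezone.utc)


print(parse_timestamp("2014-10-02T15:01:23+05:30").isoformat())
```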

labels map (key: string, value: string)

The labels with user-defined metadata to organize BatchPredictionJobs.

Label keys and values can be no longer than 64 characters (Unicode codepoints), and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed.

See https://goo.gl/xmQnxf for more information and examples of labels.
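
A client-side pre-check of the label constraints above might look like the following sketch. It approximates the stated rules with Python's Unicode character classes; the service's exact validation may differ, and the function name is hypothetical.

```python
def is_valid_label_part(text: str) -> bool:
    """Approximate check for a label key or value (assumption, not the service's rule)."""
    if len(text) > 64:  # len() counts Unicode codepoints
        return False
    for ch in text:
        if ch in "_-" or ch.isdigit():
            continue
        # A letter passes only if lowercasing leaves it unchanged,
        # which also admits scripts that have no letter case.
        if ch.isalpha() and ch == ch.lower():
            continue
        return False
    return True


labels = {"team": "ml-platform", "env": "prod_1"}
print(all(is_valid_label_part(k) and is_valid_label_part(v) for k, v in labels.items()))
```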

encryptionSpec object (EncryptionSpec)

Customer-managed encryption key options for a BatchPredictionJob. If this is set, then all resources created by the BatchPredictionJob will be encrypted with the provided encryption key.

modelMonitoringConfig object (ModelMonitoringConfig)

The model monitoring config is used to analyze model behaviors, based on the input and output of the batch prediction job, as well as the provided training dataset.

modelMonitoringStatsAnomalies[] object (ModelMonitoringStatsAnomalies)

Get batch prediction job monitoring statistics.

modelMonitoringStatus object (Status)

Output only. The running status of the model monitoring pipeline.

disableContainerLogging boolean

For custom-trained Models and AutoML Tabular Models, the container of the DeployedModel instances sends stderr and stdout streams to Cloud Logging by default. Note that the logs incur cost, which is subject to Cloud Logging pricing.

You can disable container logging by setting this flag to true.

satisfiesPzs boolean

Output only. Reserved for future use.

satisfiesPzi boolean

Output only. Reserved for future use.

JSON representation
{
  "name": string,
  "displayName": string,
  "model": string,
  "modelVersionId": string,
  "unmanagedContainerModel": {
    object (UnmanagedContainerModel)
  },
  "inputConfig": {
    object (InputConfig)
  },
  "instanceConfig": {
    object (InstanceConfig)
  },
  "modelParameters": value,
  "outputConfig": {
    object (OutputConfig)
  },
  "dedicatedResources": {
    object (BatchDedicatedResources)
  },
  "serviceAccount": string,
  "manualBatchTuningParameters": {
    object (ManualBatchTuningParameters)
  },
  "generateExplanation": boolean,
  "explanationSpec": {
    object (ExplanationSpec)
  },
  "outputInfo": {
    object (OutputInfo)
  },
  "state": enum (JobState),
  "error": {
    object (Status)
  },
  "partialFailures": [
    {
      object (Status)
    }
  ],
  "resourcesConsumed": {
    object (ResourcesConsumed)
  },
  "completionStats": {
    object (CompletionStats)
  },
  "createTime": string,
  "startTime": string,
  "endTime": string,
  "updateTime": string,
  "labels": {
    string: string,
    ...
  },
  "encryptionSpec": {
    object (EncryptionSpec)
  },
  "modelMonitoringConfig": {
    object (ModelMonitoringConfig)
  },
  "modelMonitoringStatsAnomalies": [
    {
      object (ModelMonitoringStatsAnomalies)
    }
  ],
  "modelMonitoringStatus": {
    object (Status)
  },
  "disableContainerLogging": boolean,
  "satisfiesPzs": boolean,
  "satisfiesPzi": boolean
}
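
To make the JSON representation above concrete, here is a sketch of a minimal request body built as a Python dictionary, with a small pre-flight check of the constraints stated in the field descriptions. All resource names and bucket paths are placeholders. The uris and outputUriPrefix fields belong to the GcsSource and GcsDestination types, which are not defined in this section, and job_problems is a hypothetical helper.

```python
import json

REQUIRED_FIELDS = ("displayName", "inputConfig", "outputConfig")
MODEL_SOURCES = ("model", "unmanagedContainerModel", "endpoint")


def job_problems(body: dict) -> list:
    """Return a list of constraint violations found in a request body."""
    problems = [f"missing required field: {name}" for name in REQUIRED_FIELDS if name not in body]
    chosen = [name for name in MODEL_SOURCES if body.get(name)]
    if len(chosen) != 1:
        problems.append(f"exactly one of {MODEL_SOURCES} must be set, got {chosen}")
    return problems


# Placeholder values; output-only fields such as name and state are omitted.
job = {
    "displayName": "nightly-scoring",
    "model": "projects/my-project/locations/us-central1/models/1234567890",
    "inputConfig": {
        "instancesFormat": "jsonl",
        "gcsSource": {"uris": ["gs://my-bucket/input/instances.jsonl"]},
    },
    "outputConfig": {
        "predictionsFormat": "jsonl",
        "gcsDestination": {"outputUriPrefix": "gs://my-bucket/output/"},
    },
    "labels": {"team": "ml-platform"},
}

print(job_problems(job))
print(json.dumps(job, indent=2))
```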

UnmanagedContainerModel

Contains model information necessary to perform batch prediction without requiring a full model import.

Fields
artifactUri string

The path to the directory containing the Model artifact and any of its supporting files.

predictSchemata object (PredictSchemata)

Contains the schemata used in Model's predictions and explanations.

containerSpec object (ModelContainerSpec)

Input only. The specification of the container that is to be used when deploying this Model.

JSON representation
{
  "artifactUri": string,
  "predictSchemata": {
    object (PredictSchemata)
  },
  "containerSpec": {
    object (ModelContainerSpec)
  }
}

PredictSchemata

Contains the schemata used in Model's predictions and explanations via PredictionService.Predict, PredictionService.Explain and BatchPredictionJob.

Fields
instanceSchemaUri string

Immutable. Points to a YAML file stored on Google Cloud Storage describing the format of a single instance, which is used in PredictRequest.instances, ExplainRequest.instances and BatchPredictionJob.input_config. The schema is defined as an OpenAPI 3.0.2 Schema Object. AutoML Models always have this field populated by Vertex AI. Note: The URI given on output will be immutable and probably different, including the URI scheme, than the one given on input. The output URI will point to a location where the user has read-only access.

parametersSchemaUri string

Immutable. Points to a YAML file stored on Google Cloud Storage describing the parameters of prediction and explanation via PredictRequest.parameters, ExplainRequest.parameters and BatchPredictionJob.model_parameters. The schema is defined as an OpenAPI 3.0.2 Schema Object. AutoML Models always have this field populated by Vertex AI. If no parameters are supported, it is set to an empty string. Note: The URI given on output will be immutable and probably different, including the URI scheme, than the one given on input. The output URI will point to a location where the user has read-only access.

predictionSchemaUri string

Immutable. Points to a YAML file stored on Google Cloud Storage describing the format of a single prediction produced by this Model, which is returned via PredictResponse.predictions, ExplainResponse.explanations, and BatchPredictionJob.output_config. The schema is defined as an OpenAPI 3.0.2 Schema Object. AutoML Models always have this field populated by Vertex AI. Note: The URI given on output will be immutable and probably different, including the URI scheme, than the one given on input. The output URI will point to a location where the user has read-only access.

JSON representation
{
  "instanceSchemaUri": string,
  "parametersSchemaUri": string,
  "predictionSchemaUri": string
}

ModelContainerSpec

Specification of a container for serving predictions. Some fields in this message correspond to fields in the Kubernetes Container v1 core specification.

Fields
imageUri string

Required. Immutable. URI of the Docker image to be used as the custom container for serving predictions. This URI must identify an image in Artifact Registry or Container Registry. Learn more about the container publishing requirements, including permissions requirements for the Vertex AI Service Agent.

The container image is ingested upon ModelService.UploadModel, stored internally, and this original path is afterwards not used.

To learn about the requirements for the Docker image itself, see Custom container requirements.

You can use the URI to one of Vertex AI's pre-built container images for prediction in this field.

command[] string

Immutable. Specifies the command that runs when the container starts. This overrides the container's ENTRYPOINT. Specify this field as an array of executable and arguments, similar to a Docker ENTRYPOINT's "exec" form, not its "shell" form.

If you do not specify this field, then the container's ENTRYPOINT runs, in conjunction with the args field or the container's CMD, if either exists. If this field is not specified and the container does not have an ENTRYPOINT, then refer to the Docker documentation about how CMD and ENTRYPOINT interact.

If you specify this field, then you can also specify the args field to provide additional arguments for this command. However, if you specify this field, then the container's CMD is ignored. See the Kubernetes documentation about how the command and args fields interact with a container's ENTRYPOINT and CMD.

In this field, you can reference environment variables set by Vertex AI and environment variables set in the env field. You cannot reference environment variables set in the Docker image. In order for environment variables to be expanded, reference them by using the following syntax:

$(VARIABLE_NAME)

Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:

$$(VARIABLE_NAME)

This field corresponds to the command field of the Kubernetes Containers v1 core API.
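
The expansion rules described above (resolved references are substituted, unresolved references are left unchanged, and $$ escapes the syntax) can be sketched as follows. This is a model of the documented behavior, not the code Vertex AI runs; the function name expand and the accepted variable-name characters are assumptions.

```python
import re

# Matches either the "$$" escape or a "$(NAME)" reference.
_REFERENCE = re.compile(r"\$\$|\$\(([A-Za-z_][A-Za-z0-9_.-]*)\)")


def expand(text: str, variables: dict) -> str:
    """Expand $(NAME) references using the given variables."""
    def replace(match):
        if match.group(0) == "$$":
            return "$"  # escape: "$$(NAME)" becomes the literal "$(NAME)"
        # Unresolved references are kept exactly as written.
        return variables.get(match.group(1), match.group(0))

    return _REFERENCE.sub(replace, text)


# MODEL_DIR is a made-up variable name used only for illustration.
command = ["python", "serve.py", "--model-dir=$(MODEL_DIR)", "--note=$$(MODEL_DIR)"]
print([expand(part, {"MODEL_DIR": "/models"}) for part in command])
```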

args[] string

Immutable. Specifies arguments for the command that runs when the container starts. This overrides the container's CMD. Specify this field as an array of executable and arguments, similar to a Docker CMD's "default parameters" form.

If you don't specify this field but do specify the command field, then the command from the command field runs without any additional arguments. See the Kubernetes documentation about how the command and args fields interact with a container's ENTRYPOINT and CMD.

If you don't specify this field and don't specify the command field, then the container's ENTRYPOINT and CMD determine what runs based on their default behavior. See the Docker documentation about how CMD and ENTRYPOINT interact.

In this field, you can reference environment variables set by Vertex AI and environment variables set in the env field. You cannot reference environment variables set in the Docker image. In order for environment variables to be expanded, reference them by using the following syntax:

$(VARIABLE_NAME)

Note that this differs from Bash variable expansion, which does not use parentheses. If a variable cannot be resolved, the reference in the input string is used unchanged. To avoid variable expansion, you can escape this syntax with $$; for example:

$$(VARIABLE_NAME)

This field corresponds to the args field of the Kubernetes Containers v1 core API.

env[] object (EnvVar)

Immutable. List of environment variables to set in the container. After the container starts running, code running in the container can read these environment variables.

Additionally, the command and args fields can reference these variables. Later entries in this list can also reference earlier entries. For example, the following example sets the variable VAR_2 to have the value foo bar:

[
  {
    "name": "VAR_1",
    "value": "foo"
  },
  {
    "name": "VAR_2",
    "value": "$(VAR_1) bar"
  }
]

If you switch the order of the variables in the example, then the expansion does not occur.

This field corresponds to the env field of the Kubernetes Containers v1 core API.
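
The ordering rule for env can be sketched as a single pass over the list in which each entry sees only the entries before it. The helper resolve_env is hypothetical, and the $$ escape is omitted to keep the example short.

```python
import re

_REFERENCE = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def resolve_env(entries: list) -> dict:
    """Resolve env entries in order; later entries may reference earlier ones."""
    resolved = {}
    for entry in entries:
        value = _REFERENCE.sub(
            lambda m: resolved.get(m.group(1), m.group(0)),  # unknown names stay as-is
            entry["value"],
        )
        resolved[entry["name"]] = value
    return resolved


in_order = [
    {"name": "VAR_1", "value": "foo"},
    {"name": "VAR_2", "value": "$(VAR_1) bar"},
]
print(resolve_env(in_order))
print(resolve_env(list(reversed(in_order))))
```

With the entries reversed, VAR_2 is processed before VAR_1 is known, so its reference is left unexpanded.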

ports[] object (Port)

Immutable. List of ports to expose from the container. Vertex AI sends any prediction requests that it receives to the first port on this list. Vertex AI also sends liveness and health checks to this port.

If you do not specify this field, it defaults to the following value:

[
  {
    "containerPort": 8080
  }
]

Vertex AI does not use ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.

predictRoute string

Immutable. HTTP path on the container to send prediction requests to. Vertex AI forwards requests sent using projects.locations.endpoints.predict to this path on the container's IP address and port. Vertex AI then returns the container's response in the API response.

For example, if you set this field to /foo, then when Vertex AI receives a prediction request, it forwards the request body in a POST request to the /foo path on the port of your container specified by the first value of this ModelContainerSpec's ports field.

If you don't specify this field, it defaults to the following value when you deploy this Model to an Endpoint:

/v1/endpoints/ENDPOINT/deployedModels/DEPLOYED_MODEL:predict

The placeholders in this value are replaced as follows:

  • ENDPOINT: The last segment (following endpoints/) of the Endpoint.name field of the Endpoint where this Model has been deployed. (Vertex AI makes this value available to your container code as the AIP_ENDPOINT_ID environment variable.)

  • DEPLOYED_MODEL: DeployedModel.id of the DeployedModel. (Vertex AI makes this value available to your container code as the AIP_DEPLOYED_MODEL_ID environment variable.)
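
As a sketch of the placeholder substitution above, container code could rebuild the default prediction path from the two environment variables named in the list. The function name is hypothetical, and the example passes a sample mapping instead of reading the real environment.

```python
import os


def default_predict_route(environ=os.environ) -> str:
    """Build the default predictRoute from the AIP_* environment variables."""
    endpoint_id = environ["AIP_ENDPOINT_ID"]
    deployed_model_id = environ["AIP_DEPLOYED_MODEL_ID"]
    return f"/v1/endpoints/{endpoint_id}/deployedModels/{deployed_model_id}:predict"


# Sample IDs; inside a deployed container you would call default_predict_route().
sample = {"AIP_ENDPOINT_ID": "123", "AIP_DEPLOYED_MODEL_ID": "456"}
print(default_predict_route(sample))
```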

healthRoute string

Immutable. HTTP path on the container to send health checks to. Vertex AI intermittently sends GET requests to this path on the container's IP address and port to check that the container is healthy. Read more about health checks.

For example, if you set this field to /bar, then Vertex AI intermittently sends a GET request to the /bar path on the port of your container specified by the first value of this ModelContainerSpec's ports field.

If you don't specify this field, it defaults to the following value when you deploy this Model to an Endpoint:

/v1/endpoints/ENDPOINT/deployedModels/DEPLOYED_MODEL:predict

The placeholders in this value are replaced as follows:

  • ENDPOINT: The last segment (following endpoints/) of the Endpoint.name field of the Endpoint where this Model has been deployed. (Vertex AI makes this value available to your container code as the AIP_ENDPOINT_ID environment variable.)

  • DEPLOYED_MODEL: DeployedModel.id of the DeployedModel. (Vertex AI makes this value available to your container code as the AIP_DEPLOYED_MODEL_ID environment variable.)

invokeRoutePrefix string

Immutable. Invoke route prefix for the custom container. "/*" is the only supported value right now. When this field is set, any non-root route on this model is accessible through an invoke HTTP call, for example "/invoke/foo/bar". However, the PredictionService.Invoke RPC is not supported yet.

Only one of predictRoute or invokeRoutePrefix can be set. If this field is not set, predictRoute is used by default. If this field is set, the Model can only be deployed to a dedicated endpoint.

grpcPorts[] object (Port)

Immutable. List of ports to expose from the container. Vertex AI sends gRPC prediction requests that it receives to the first port on this list. Vertex AI also sends liveness and health checks to this port.

If you do not specify this field, gRPC requests to the container will be disabled.

Vertex AI does not use ports other than the first one listed. This field corresponds to the ports field of the Kubernetes Containers v1 core API.

deploymentTimeout string (Duration format)

Immutable. Deployment timeout. Limit for deployment timeout is 2 hours.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
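
The Duration string format and the 2-hour limit above can be checked before sending a request. The sketch below uses Decimal to avoid floating-point rounding; the helper names are hypothetical, and treating zero or negative timeouts as invalid is an assumption not stated in this section.

```python
import re
from decimal import Decimal

_DURATION = re.compile(r"^-?\d+(?:\.\d{1,9})?s$")
MAX_DEPLOYMENT_TIMEOUT_SECONDS = Decimal(2 * 60 * 60)  # 2 hours


def parse_duration(text: str) -> Decimal:
    """Parse a Duration string such as "3.5s" into seconds."""
    if not _DURATION.match(text):
        raise ValueError(f"not a Duration string: {text!r}")
    return Decimal(text[:-1])


def is_valid_deployment_timeout(text: str) -> bool:
    """Assumption: the timeout must be positive and at most 2 hours."""
    return 0 < parse_duration(text) <= MAX_DEPLOYMENT_TIMEOUT_SECONDS


print(parse_duration("3.5s"), is_valid_deployment_timeout("1800s"))
```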

sharedMemorySizeMb string (int64 format)

Immutable. The amount of the VM memory to reserve as the shared memory for the model in megabytes.

startupProbe object (Probe)

Immutable. Specification for Kubernetes startup probe.

healthProbe object (Probe)

Immutable. Specification for Kubernetes readiness probe.

livenessProbe object (Probe)

Immutable. Specification for Kubernetes liveness probe.

JSON representation
{
  "imageUri": string,
  "command": [
    string
  ],
  "args": [
    string
  ],
  "env": [
    {
      object (EnvVar)
    }
  ],
  "ports": [
    {
      object (Port)
    }
  ],
  "predictRoute": string,
  "healthRoute": string,
  "invokeRoutePrefix": string,
  "grpcPorts": [
    {
      object (Port)
    }
  ],
  "deploymentTimeout": string,
  "sharedMemorySizeMb": string,
  "startupProbe": {
    object (Probe)
  },
  "healthProbe": {
    object (Probe)
  },
  "livenessProbe": {
    object (Probe)
  }
}

Port

Represents a network port in a container.

Fields
containerPort integer

The number of the port to expose on the pod's IP address. Must be a valid port number, between 1 and 65535 inclusive.

JSON representation
{
  "containerPort": integer
}

Probe

Probe describes a health check to be performed against a container to determine whether it is alive or ready to receive traffic.

Fields
periodSeconds integer

How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1. Must be less than timeoutSeconds.

Maps to Kubernetes probe argument 'periodSeconds'.

timeoutSeconds integer

Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. Must be greater than or equal to periodSeconds.

Maps to Kubernetes probe argument 'timeoutSeconds'.

failureThreshold integer

Number of consecutive failures before the probe is considered failed. Defaults to 3. Minimum value is 1.

Maps to Kubernetes probe argument 'failureThreshold'.

successThreshold integer

Number of consecutive successes before the probe is considered successful. Defaults to 1. Minimum value is 1.

Maps to Kubernetes probe argument 'successThreshold'.

initialDelaySeconds integer

Number of seconds to wait before starting the probe. Defaults to 0. Minimum value is 0.

Maps to Kubernetes probe argument 'initialDelaySeconds'.

probe_type Union type
probe_type can be only one of the following:
exec object (ExecAction)

ExecAction probes the health of a container by executing a command.

httpGet object (HttpGetAction)

HttpGetAction probes the health of a container by sending an HTTP GET request.

grpc object (GrpcAction)

GrpcAction probes the health of a container by sending a gRPC request.

tcpSocket object (TcpSocketAction)

TcpSocketAction probes the health of a container by opening a TCP socket connection.

JSON representation
{
  "periodSeconds": integer,
  "timeoutSeconds": integer,
  "failureThreshold": integer,
  "successThreshold": integer,
  "initialDelaySeconds": integer,

  // probe_type
  "exec": {
    object (ExecAction)
  },
  "httpGet": {
    object (HttpGetAction)
  },
  "grpc": {
    object (GrpcAction)
  },
  "tcpSocket": {
    object (TcpSocketAction)
  }
  // Union type
}
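
Because probe_type is a union, a Probe should carry at most one of exec, httpGet, grpc, or tcpSocket. The sketch below checks that rule together with the per-field minimums listed above. The helper name probe_problems and the /health path are hypothetical, and the relation between periodSeconds and timeoutSeconds is deliberately not checked here.

```python
PROBE_TYPES = ("exec", "httpGet", "grpc", "tcpSocket")
MINIMUMS = {
    "periodSeconds": 1,
    "timeoutSeconds": 1,
    "failureThreshold": 1,
    "successThreshold": 1,
    "initialDelaySeconds": 0,
}


def probe_problems(probe: dict) -> list:
    """Return a list of violations of the union and minimum-value rules."""
    problems = []
    chosen = [name for name in PROBE_TYPES if name in probe]
    if len(chosen) > 1:
        problems.append(f"only one probe type may be set, got {chosen}")
    for field, minimum in MINIMUMS.items():
        if field in probe and probe[field] < minimum:
            problems.append(f"{field} must be at least {minimum}")
    return problems


# Example HTTP probe; the path and port are placeholders.
startup_probe = {
    "failureThreshold": 3,
    "initialDelaySeconds": 5,
    "httpGet": {"path": "/health", "port": 8080},
}
print(probe_problems(startup_probe))
```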

ExecAction

ExecAction specifies a command to execute.

Fields
command[] string

Command is the command line to execute inside the container. The working directory for the command is root ('/') in the container's filesystem. The command is simply exec'd and is not run inside a shell, so traditional shell instructions ('|', etc.) won't work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.

JSON representation
{
  "command": [
    string
  ]
}

HttpGetAction

HttpGetAction describes an action based on HTTP Get requests.

Fields
path string

Path to access on the HTTP server.

port integer

Number of the port to access on the container. Number must be in the range 1 to 65535.

host string

Host name to connect to. Defaults to the model serving container's IP. You probably want to set "host" in httpHeaders instead.

scheme string

Scheme to use for connecting to the host. Defaults to HTTP. Acceptable values are "HTTP" or "HTTPS".

httpHeaders[] object (HttpHeader)

Custom headers to set in the request. HTTP allows repeated headers.

JSON representation
{
  "path": string,
  "port": integer,
  "host": string,
  "scheme": string,
  "httpHeaders": [
    {
      object (HttpHeader)
    }
  ]
}

HttpHeader

HttpHeader describes a custom header to be used in HTTP probes.

Fields
name string

The header field name. This will be canonicalized upon output, so case-variant names will be understood as the same header.

value string

The header field value.

JSON representation
{
  "name": string,
  "value": string
}

GrpcAction

GrpcAction checks the health of a container using a gRPC service.

Fields
port integer

Port number of the gRPC service. Number must be in the range 1 to 65535.

service string

service is the name of the service to place in the gRPC HealthCheckRequest. See https://github.com/grpc/grpc/blob/master/doc/health-checking.md.

If this is not specified, the default behavior is defined by gRPC.

JSON representation
{
  "port": integer,
  "service": string
}

TcpSocketAction

TcpSocketAction probes the health of a container by opening a TCP socket connection.

Fields
port integer

Number of the port to access on the container. Number must be in the range 1 to 65535.

host string

Optional: host name to connect to, defaults to the model serving container's IP.

JSON representation
{
  "port": integer,
  "host": string
}

InputConfig

Configures the input to BatchPredictionJob. See Model.supported_input_storage_formats for Model's supported input formats, and how instances should be expressed via any of them.

Fields
instancesFormat string

Required. The format in which instances are given, must be one of the Model's supportedInputStorageFormats.

source Union type
Required. The source of the input. source can be only one of the following:
gcsSource object (GcsSource)

The Cloud Storage location for the input instances.

bigquerySource object (BigQuerySource)

The BigQuery location of the input table. The schema of the table should be in the format described by the given context OpenAPI Schema, if one is provided. The table may contain additional columns that are not described by the schema, and they will be ignored.

JSON representation
{
  "instancesFormat": string,

  // source
  "gcsSource": {
    object (GcsSource)
  },
  "bigquerySource": {
    object (BigQuerySource)
  }
  // Union type
}

InstanceConfig

Configuration defining how to transform batch prediction input instances to the instances that the Model accepts.

Fields
instanceType string

The format of the instance that the Model accepts. Vertex AI will convert compatible batch prediction input instance formats to the specified format.

Supported values are:

  • object: Each input is converted to JSON object format.

    • For bigquery, each row is converted to an object.
    • For jsonl, each line of the JSONL input must be an object.
    • Does not apply to csv, file-list, tf-record, or tf-record-gzip.
  • array: Each input is converted to JSON array format.

    • For bigquery, each row is converted to an array. The order of columns is determined by the BigQuery column order, unless includedFields is populated. includedFields must be populated for specifying field orders.
    • For jsonl, if each line of the JSONL input is an object, includedFields must be populated for specifying field orders.
    • Does not apply to csv, file-list, tf-record, or tf-record-gzip.

If not specified, Vertex AI converts the batch prediction input as follows:

  • For bigquery and csv, the behavior is the same as array. The order of columns is the same as defined in the file or table, unless includedFields is populated.
  • For jsonl, the prediction instance format is determined by each line of the input.
  • For tf-record/tf-record-gzip, each record will be converted to an object in the format of {"b64": <value>}, where <value> is the Base64-encoded string of the content of the record.
  • For file-list, each file in the list will be converted to an object in the format of {"b64": <value>}, where <value> is the Base64-encoded string of the content of the file.
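
The conversions listed above can be sketched for a single input row. The code below models the object and array cases and the {"b64": ...} wrapping for binary inputs; it is an illustration of the documented behavior, with hypothetical function names, and it does not reproduce Vertex AI's handling of every input format.

```python
import base64


def to_instance(row: dict, instance_type: str, included_fields=None):
    """Convert one row (for example a BigQuery row or a JSONL object)."""
    # includedFields, when given, fixes both the selection and the order.
    fields = list(included_fields) if included_fields else list(row)
    if instance_type == "object":
        return {name: row[name] for name in fields}
    if instance_type == "array":
        return [row[name] for name in fields]
    raise ValueError(f"unsupported instanceType: {instance_type!r}")


def wrap_bytes(content: bytes) -> dict:
    """Wrap a tf-record or file-list payload as {"b64": <Base64 string>}."""
    return {"b64": base64.b64encode(content).decode("ascii")}


row = {"age": 42, "country": "NL", "plan": "pro"}
print(to_instance(row, "array", included_fields=["plan", "age"]))
print(wrap_bytes(b"hi"))
```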
keyField string

The name of the field that is considered as a key.

The values identified by the key field are not included in the transformed instances that are sent to the Model. This is similar to specifying the name of this field in excludedFields. In addition, the batch prediction output will not include the instances. Instead, the output will only include the value of the key field, in a field named key in the output:

  • For jsonl output format, the output will have a key field instead of the instance field.
  • For csv/bigquery output format, the output will have a key column instead of the instance feature columns.

The input must be JSONL with objects at each line, CSV, BigQuery or TfRecord.

includedFields[] string

Fields that will be included in the prediction instance that is sent to the Model.

If instanceType is array, the order of field names in includedFields also determines the order of the values in the array.

When includedFields is populated, excludedFields must be empty.

The input must be JSONL with objects at each line, BigQuery or TfRecord.

excludedFields[] string

Fields that will be excluded in the prediction instance that is sent to the Model.

Excluded fields will be attached to the batch prediction output if keyField is not specified.

When excludedFields is populated, includedFields must be empty.

The input must be JSONL with objects at each line, BigQuery or TfRecord.
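
The interaction between keyField, includedFields, and excludedFields can be sketched as a function that splits one input row into the instance sent to the Model and the key kept for the output. The function name split_row is hypothetical, and the example only models the rules stated above.

```python
def split_row(row: dict, key_field=None, included_fields=None, excluded_fields=None):
    """Return (instance, key) for one input row."""
    if included_fields and excluded_fields:
        raise ValueError("includedFields and excludedFields cannot both be populated")
    if included_fields:
        kept = [name for name in included_fields if name in row]
    else:
        dropped = set(excluded_fields or [])
        kept = [name for name in row if name not in dropped]
    # The key field is never part of the instance sent to the Model.
    instance = {name: row[name] for name in kept if name != key_field}
    key = row.get(key_field) if key_field else None
    return instance, key


row = {"id": "r1", "x": 1, "y": 2}
print(split_row(row, key_field="id", excluded_fields=["y"]))
```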

JSON representation
{
  "instanceType": string,
  "keyField": string,
  "includedFields": [
    string
  ],
  "excludedFields": [
    string
  ]
}

OutputConfig

Configures the output of BatchPredictionJob. See Model.supported_output_storage_formats for supported output formats, and how predictions are expressed via any of them.

Fields
predictionsFormat string

Required. The format in which Vertex AI gives the predictions, must be one of the Model's supportedOutputStorageFormats.

destination Union type
Required. The destination of the output. destination can be only one of the following:
gcsDestination object (GcsDestination)

The Cloud Storage location of the directory where the output is to be written. In the given directory a new directory is created. Its name is prediction-<model-display-name>-<job-create-time>, where the timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. Inside it, files predictions_0001.<extension>, predictions_0002.<extension>, ..., predictions_N.<extension> are created, where <extension> depends on the chosen predictionsFormat, and N may equal 0001 and depends on the total number of successfully predicted instances. If the Model has both instance and prediction schemata defined, then each such file contains predictions as per the predictionsFormat. If prediction for any instance failed (partially or completely), then additional errors_0001.<extension>, errors_0002.<extension>, ..., errors_N.<extension> files are created (N depends on the total number of failed predictions). These files contain the failed instances, as per their schema, followed by an additional error field whose value is a google.rpc.Status containing only the code and message fields.
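
The naming scheme described for gcsDestination can be sketched as below. The helper names are hypothetical and the code only mirrors the pattern stated above; to locate real output, read the job's outputInfo field instead of reconstructing names.

```python
from datetime import datetime, timezone


def output_directory(model_display_name: str, create_time: datetime) -> str:
    """Build prediction-<model-display-name>-<job-create-time>."""
    utc = create_time.astimezone(timezone.utc)
    # YYYY-MM-DDThh:mm:ss.sssZ with millisecond precision.
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
    return f"prediction-{model_display_name}-{stamp}"


def shard_name(kind: str, index: int, extension: str) -> str:
    """Build names such as predictions_0001.jsonl or errors_0002.jsonl."""
    return f"{kind}_{index:04d}.{extension}"


created = datetime(2024, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
print(output_directory("churn", created), shard_name("predictions", 1, "jsonl"))
```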

bigqueryDestination object (BigQueryDestination)

The BigQuery project or dataset location where the output is to be written. If a project is provided, a new dataset is created with the name prediction_<model-display-name>_<job-create-time>, where <model-display-name> is made BigQuery-dataset-name compatible (for example, most special characters become underscores) and the timestamp is in YYYY_MM_DDThh_mm_ss_sssZ "based on ISO-8601" format. In the dataset two tables are created, predictions and errors. If the Model has both instance and prediction schemata defined, then the tables have columns as follows: The predictions table contains instances for which the prediction succeeded; it has columns as per a concatenation of the Model's instance and prediction schemata. The errors table contains rows for which the prediction has failed; it has instance columns, as per the instance schema, followed by a single "errors" column whose values are google.rpc.Status represented as a STRUCT containing only code and message.

JSON representation
{
  "predictionsFormat": string,

  // destination
  "gcsDestination": {
    object (GcsDestination)
  },
  "bigqueryDestination": {
    object (BigQueryDestination)
  }
  // Union type
}
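
The naming rules described for the two destinations can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the four-digit zero padding is inferred from the predictions_0001 example, and the character substitution used for the BigQuery dataset name is an assumption, since this reference only says that most special characters become underscores.

```python
import re
from datetime import datetime, timezone


def gcs_output_directory(model_display_name: str, created: datetime) -> str:
    """Directory name in the form prediction-<model-display-name>-<job-create-time>."""
    created = created.astimezone(timezone.utc)
    millis = created.microsecond // 1000
    stamp = created.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"
    return f"prediction-{model_display_name}-{stamp}"


def bigquery_dataset_name(model_display_name: str, created: datetime) -> str:
    """Dataset name in the form prediction_<model-display-name>_<job-create-time>."""
    created = created.astimezone(timezone.utc)
    millis = created.microsecond // 1000
    stamp = created.strftime("%Y_%m_%dT%H_%M_%S") + f"_{millis:03d}Z"
    # Assumption: every character outside [A-Za-z0-9_] is replaced by an underscore.
    safe_name = re.sub(r"[^A-Za-z0-9_]", "_", model_display_name)
    return f"prediction_{safe_name}_{stamp}"


def shard_names(prefix: str, count: int, extension: str) -> list:
    """File names such as predictions_0001.jsonl, predictions_0002.jsonl, ..."""
    return [f"{prefix}_{i:04d}.{extension}" for i in range(1, count + 1)]


created_at = datetime(2024, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
directory = gcs_output_directory("my-model", created_at)
dataset = bigquery_dataset_name("my-model", created_at)
files = shard_names("predictions", 2, "jsonl")
```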

BatchDedicatedResources

A description of resources that are used for performing batch operations, are dedicated to a Model, and need manual configuration.

Fields
machineSpec object (MachineSpec)

Required. Immutable. The specification of a single machine.

startingReplicaCount integer

Immutable. The number of machine replicas used at the start of the batch operation. If not set, Vertex AI decides the starting number, which is not greater than maxReplicaCount.

maxReplicaCount integer

Immutable. The maximum number of machine replicas the batch operation may be scaled to. The default value is 10.

flexStart object (FlexStart)

Optional. Immutable. If set, use Dynamic Workload Scheduler (DWS) resources to schedule the deployment workload. Reference: https://cloud.google.com/blog/products/compute/introducing-dynamic-workload-scheduler

spot boolean

Optional. If true, schedule the deployment workload on spot VMs.

JSON representation
{
  "machineSpec": {
    object (MachineSpec)
  },
  "startingReplicaCount": integer,
  "maxReplicaCount": integer,
  "flexStart": {
    object (FlexStart)
  },
  "spot": boolean
}
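
A minimal sketch of a BatchDedicatedResources payload, assuming a client-side check that startingReplicaCount does not exceed maxReplicaCount (the reference states this bound only for the case where Vertex AI picks the starting number). The helper name is hypothetical; the default of 10 and the machine type n1-standard-2 are taken from this page.

```python
from typing import Optional

DEFAULT_MAX_REPLICA_COUNT = 10  # default value stated in this reference


def build_batch_dedicated_resources(
    machine_type: str,
    starting_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> dict:
    """Build a BatchDedicatedResources-shaped dict (hypothetical helper)."""
    effective_max = (
        DEFAULT_MAX_REPLICA_COUNT if max_replica_count is None else max_replica_count
    )
    # Assumption: an explicit starting count should not exceed the maximum.
    if starting_replica_count is not None and starting_replica_count > effective_max:
        raise ValueError("startingReplicaCount must not exceed maxReplicaCount")
    resources: dict = {"machineSpec": {"machineType": machine_type}}
    if starting_replica_count is not None:
        resources["startingReplicaCount"] = starting_replica_count
    if max_replica_count is not None:
        resources["maxReplicaCount"] = max_replica_count
    return resources


resources = build_batch_dedicated_resources("n1-standard-2", starting_replica_count=2)
```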

MachineSpec

Specification of a single machine.

Fields
machineType string

Immutable. The type of the machine.

See the list of machine types supported for prediction.

See the list of machine types supported for custom training.

For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.

acceleratorType enum (AcceleratorType)

Immutable. The type of accelerator(s) that may be attached to the machine as per acceleratorCount.

acceleratorCount integer

The number of accelerators to attach to the machine.

For accelerator-optimized machine types (https://cloud.google.com/compute/docs/accelerator-optimized-machines), one can set the acceleratorCount from 1 to N for a machine with N GPUs. If acceleratorCount is less than or equal to N / 2, Vertex will co-schedule the replicas of the model into the same VM to save cost.

For example, if the machine type is a3-highgpu-8g, which has 8 H100 GPUs, one can set acceleratorCount to 1 to 8. If acceleratorCount is 1, 2, 3, or 4, Vertex will co-schedule 8, 4, 2, or 2 replicas of the model into the same VM to save cost.

When co-scheduling, CPU, memory and storage on the VM will be distributed to replicas on the VM. For example, one can expect a co-scheduled replica requesting 2 GPUs out of an 8-GPU VM to receive 25% of the CPU, memory and storage of the VM.

Note that the feature is not compatible with multihostGpuNodeCount. When multihostGpuNodeCount is set, the co-scheduling will not be enabled.

gpuPartitionSize string

Optional. Immutable. The Nvidia GPU partition size.

When specified, the requested accelerators will be partitioned into smaller GPU partitions. For example, if the request is for 8 units of NVIDIA A100 GPUs, and gpuPartitionSize="1g.10gb", the service will create 8 * 7 = 56 partitioned MIG instances.

The partition size must be a value supported by the requested accelerator. Refer to Nvidia GPU Partitioning for the available partition sizes.

If set, the acceleratorCount should be set to 1.

tpuTopology string

Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpuTopology: "2x2x1").

multihostGpuNodeCount integer

Optional. Immutable. The number of nodes per replica for multihost GPU deployments.

reservationAffinity object (ReservationAffinity)

Optional. Immutable. Configuration controlling how this resource pool consumes reservation.

minGpuDriverVersion string

Optional. Immutable. The minimum GPU driver version that this machine requires. For example, "535.104.06". If not specified, the default GPU driver version will be used by the underlying infrastructure.

JSON representation
{
  "machineType": string,
  "acceleratorType": enum (AcceleratorType),
  "acceleratorCount": integer,
  "gpuPartitionSize": string,
  "tpuTopology": string,
  "multihostGpuNodeCount": integer,
  "reservationAffinity": {
    object (ReservationAffinity)
  },
  "minGpuDriverVersion": string
}
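
The co-scheduling and partitioning arithmetic described for acceleratorCount and gpuPartitionSize can be reproduced with a short Python sketch. The function names are hypothetical, and the use of floor division is inferred from the a3-highgpu-8g example (1, 2, 3, or 4 accelerators giving 8, 4, 2, or 2 replicas); it is not a documented formula. The resource share is shown only for the proportional case given in the text.

```python
def co_scheduled_replicas(gpus_per_vm: int, accelerator_count: int) -> int:
    """Replicas that share one VM, inferred from the example in this reference."""
    if not 1 <= accelerator_count <= gpus_per_vm:
        raise ValueError("acceleratorCount must be between 1 and the VM's GPU count")
    if accelerator_count * 2 <= gpus_per_vm:
        # Co-scheduling applies when acceleratorCount <= N / 2.
        return gpus_per_vm // accelerator_count
    return 1


def resource_share(gpus_per_vm: int, accelerator_count: int) -> float:
    """Fraction of VM CPU, memory and storage for one replica (proportional case)."""
    return accelerator_count / gpus_per_vm


def mig_instances(gpu_count: int, partitions_per_gpu: int) -> int:
    """Total partitioned MIG instances, e.g. 8 GPUs with 7 partitions each."""
    return gpu_count * partitions_per_gpu


replicas = [co_scheduled_replicas(8, count) for count in (1, 2, 3, 4)]
share = resource_share(8, 2)
partitions = mig_instances(8, 7)
```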

AcceleratorType

Represents a hardware accelerator type.

Enums
ACCELERATOR_TYPE_UNSPECIFIED Unspecified accelerator type, which means no accelerator.
NVIDIA_TESLA_K80

Deprecated: Nvidia Tesla K80 GPU has reached end of support, see https://cloud.google.com/compute/docs/eol/k80-eol.

NVIDIA_TESLA_P100 Nvidia Tesla P100 GPU.
NVIDIA_TESLA_V100 Nvidia Tesla V100 GPU.
NVIDIA_TESLA_P4 Nvidia Tesla P4 GPU.
NVIDIA_TESLA_T4 Nvidia Tesla T4 GPU.
NVIDIA_TESLA_A100 Nvidia Tesla A100 GPU.
NVIDIA_A100_80GB Nvidia A100 80GB GPU.
NVIDIA_L4 Nvidia L4 GPU.
NVIDIA_H100_80GB Nvidia H100 80GB GPU.
NVIDIA_H100_MEGA_80GB Nvidia H100 Mega 80GB GPU.
NVIDIA_H200_141GB Nvidia H200 141GB GPU.
NVIDIA_B200 Nvidia B200 GPU.
NVIDIA_GB200 Nvidia GB200 GPU.
NVIDIA_RTX_PRO_6000 Nvidia RTX Pro 6000 GPU.
TPU_V2 TPU v2.
TPU_V3 TPU v3.
TPU_V4_POD TPU v4.
TPU_V5_LITEPOD TPU v5.

ReservationAffinity

A ReservationAffinity can be used to configure a Vertex AI resource (e.g., a DeployedModel) to draw its Compute Engine resources from a Shared Reservation, or exclusively from on-demand capacity.

Fields
reservationAffinityType enum (Type)

Required. Specifies the reservation affinity type.

key string

Optional. Corresponds to the label key of a reservation resource. To target a SPECIFIC_RESERVATION by name, use compute.googleapis.com/reservation-name as the key and specify the name of your reservation as its value.

values[] string

Optional. Corresponds to the label values of a reservation resource. This must be the full resource name of the reservation or reservation block.

JSON representation
{
  "reservationAffinityType": enum (Type),
  "key": string,
  "values": [
    string
  ]
}
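
A hedged sketch of a ReservationAffinity payload that targets a specific reservation. The helper name and the reservation resource name are hypothetical placeholders; the key compute.googleapis.com/reservation-name and the enum values come from this reference. Requiring key and values for SPECIFIC_RESERVATION is checked client-side here purely for illustration.

```python
from typing import List, Optional

VALID_TYPES = {"NO_RESERVATION", "ANY_RESERVATION", "SPECIFIC_RESERVATION"}
RESERVATION_NAME_KEY = "compute.googleapis.com/reservation-name"


def build_reservation_affinity(
    affinity_type: str,
    key: Optional[str] = None,
    values: Optional[List[str]] = None,
) -> dict:
    """Build a ReservationAffinity-shaped dict (hypothetical helper)."""
    if affinity_type not in VALID_TYPES:
        raise ValueError(f"unsupported reservationAffinityType: {affinity_type}")
    affinity: dict = {"reservationAffinityType": affinity_type}
    if affinity_type == "SPECIFIC_RESERVATION":
        # A specific reservation must be identified via the key and values fields.
        if not key or not values:
            raise ValueError("SPECIFIC_RESERVATION requires key and values")
        affinity["key"] = key
        affinity["values"] = list(values)
    return affinity


# Placeholder resource name; substitute the full name of a real reservation.
affinity = build_reservation_affinity(
    "SPECIFIC_RESERVATION",
    key=RESERVATION_NAME_KEY,
    values=["projects/my-project/zones/us-central1-a/reservations/my-reservation"],
)
```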

Type

Identifies a type of reservation affinity.

Enums
TYPE_UNSPECIFIED Default value. This should not be used.
NO_RESERVATION Do not consume from any reserved capacity, only use on-demand.
ANY_RESERVATION Consume any reservation available, falling back to on-demand.
SPECIFIC_RESERVATION Consume from a specific reservation. When chosen, the reservation must be identified via the key and values fields.

FlexStart

FlexStart is used to schedule the deployment workload on DWS resource. It contains the max duration of the deployment.

Fields
maxRuntimeDuration string (Duration format)

The maximum duration of the deployment. The deployment is terminated after this duration. maxRuntimeDuration can be set to at most 7 days.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".

JSON representation
{
  "maxRuntimeDuration": string
}
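
The Duration string format can be produced from a Python timedelta as in the sketch below. The helper name is hypothetical, and the 7-day limit is enforced client-side only as an illustration of the documented maximum; timedelta carries microsecond precision, so at most six fractional digits are emitted here.

```python
from datetime import timedelta

MAX_RUNTIME = timedelta(days=7)  # upper limit stated in this reference


def to_duration_string(delta: timedelta) -> str:
    """Format a timedelta as seconds with an 's' suffix, e.g. "3.5s"."""
    if delta < timedelta(0) or delta > MAX_RUNTIME:
        raise ValueError("maxRuntimeDuration must be between 0 and 7 days")
    total_micros = delta // timedelta(microseconds=1)
    seconds, micros = divmod(total_micros, 1_000_000)
    if micros == 0:
        return f"{seconds}s"
    # Drop trailing zeros from the fractional part.
    return f"{seconds}.{micros:06d}".rstrip("0") + "s"


flex_start = {"maxRuntimeDuration": to_duration_string(timedelta(days=7))}
```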

ManualBatchTuningParameters

Manual batch tuning parameters.

Fields
batchSize integer

Immutable. The number of records (e.g. instances) of the operation given in each batch to a machine replica. Consider the machine type and the size of a single record when setting this parameter: a higher value speeds up the batch operation's execution, but too high a value results in a whole batch not fitting in a machine's memory, and the whole operation fails. The default value is 64.

JSON representation
{
  "batchSize": integer
}
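
As a rough illustration of the trade-off behind batchSize, the sketch below computes how many batches a job needs and how much memory one batch of records occupies. Both functions are hypothetical back-of-the-envelope helpers; the actual memory footprint on a replica also depends on the model and is not described in this reference.

```python
DEFAULT_BATCH_SIZE = 64  # default value stated in this reference


def batches_needed(record_count: int, batch_size: int = DEFAULT_BATCH_SIZE) -> int:
    """Number of batches required to cover all records (ceiling division)."""
    if batch_size < 1:
        raise ValueError("batchSize must be positive")
    return (record_count + batch_size - 1) // batch_size


def batch_payload_bytes(record_bytes: int, batch_size: int = DEFAULT_BATCH_SIZE) -> int:
    """Approximate size of the raw records in one batch."""
    return record_bytes * batch_size


default_batches = batches_needed(1000)
larger_batches = batches_needed(1000, batch_size=100)
```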

ExplanationSpec

Specification of Model explanation.

Fields
parameters object (ExplanationParameters)

Required. Parameters that configure explaining of the Model's predictions.

metadata object (ExplanationMetadata)

Optional. Metadata describing the Model's input and output for explanation.

JSON representation
{
  "parameters": {
    object (ExplanationParameters)
  },
  "metadata": {
    object (ExplanationMetadata)
  }
}

ExplanationParameters

Parameters to configure explaining for Model's predictions.

Fields
topK integer

If populated, returns attributions for the top K indices of outputs (defaults to 1). Only applies to Models that predict more than one output (e.g., multi-class Models). When set to -1, returns explanations for all outputs.

outputIndices array (ListValue format)

If populated, only returns attributions that have outputIndex contained in outputIndices. It must be an ndarray of integers, with the same shape as the output it's explaining.

If not populated, returns attributions for topK indices of outputs. If neither topK nor outputIndices is populated, returns the argmax index of the outputs.

Only applicable to Models that predict multiple outputs (e.g., multi-class Models that predict multiple classes).

method Union type
method can be only one of the following:
sampledShapleyAttribution object (SampledShapleyAttribution)

An attribution method that approximates Shapley values for features that contribute to the label being predicted. A sampling strategy is used to approximate the value rather than considering all subsets of features. Refer to this paper for more details: https://arxiv.org/abs/1306.4265.

integratedGradientsAttribution object (IntegratedGradientsAttribution)

An attribution method that computes Aumann-Shapley values taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1703.01365

xraiAttribution object (XraiAttribution)

An attribution method that redistributes Integrated Gradients attribution to segmented regions, taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1906.02825

XRAI currently performs better on natural images, like a picture of a house or an animal. If the images are taken in artificial environments, like a lab or manufacturing line, or from diagnostic equipment, like x-rays or quality-control cameras, use Integrated Gradients instead.

examples object (Examples)

Example-based explanations that return the nearest neighbors from the provided dataset.

JSON representation
{
  "topK": integer,
  "outputIndices": array,

  // method
  "sampledShapleyAttribution": {
    object (SampledShapleyAttribution)
  },
  "integratedGradientsAttribution": {
    object (IntegratedGradientsAttribution)
  },
  "xraiAttribution": {
    object (XraiAttribution)
  },
  "examples": {
    object (Examples)
  }
  // Union type
}
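
The interaction between topK and outputIndices can be sketched for the simple case of a one-dimensional output vector. The function name is hypothetical and the sketch simplifies outputIndices to a flat list of integers; it illustrates the selection rules stated above rather than the service's implementation.

```python
from typing import List, Optional, Sequence


def explained_indices(
    outputs: Sequence[float],
    top_k: Optional[int] = None,
    output_indices: Optional[List[int]] = None,
) -> List[int]:
    """Return the output indices that would receive attributions (1-D case)."""
    if output_indices:
        # outputIndices, when populated, restricts attributions to those indices.
        return list(output_indices)
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    if top_k is None:
        return ranked[:1]  # neither field populated: argmax of the outputs
    if top_k == -1:
        return ranked  # -1 requests explanations for all outputs
    return ranked[:top_k]


scores = [0.1, 0.7, 0.2]
```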

SampledShapleyAttribution

An attribution method that approximates Shapley values for features that contribute to the label being predicted. A sampling strategy is used to approximate the value rather than considering all subsets of features.

Fields
pathCount integer

Required. The number of feature permutations to consider when approximating the Shapley values.

The valid range of its value is [1, 50], inclusive.

JSON representation
{
  "pathCount": integer
}
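
To illustrate what pathCount controls, the following sketch approximates Shapley values by averaging marginal contributions over randomly sampled feature permutations. It is a generic permutation-sampling estimator written for illustration, not Vertex AI's implementation; the function name, the predict callable, and the baseline argument are assumptions.

```python
import random
from typing import Callable, List, Sequence


def sampled_shapley(
    predict: Callable[[List[float]], float],
    instance: Sequence[float],
    baseline: Sequence[float],
    path_count: int,
    seed: int = 0,
) -> List[float]:
    """Average marginal contributions over path_count random feature orderings."""
    if not 1 <= path_count <= 50:
        raise ValueError("pathCount must be in [1, 50]")
    rng = random.Random(seed)
    feature_count = len(instance)
    totals = [0.0] * feature_count
    for _ in range(path_count):
        order = list(range(feature_count))
        rng.shuffle(order)
        current = list(baseline)
        previous = predict(current)
        for index in order:
            # Switch one feature from its baseline value to the instance value.
            current[index] = instance[index]
            value = predict(current)
            totals[index] += value - previous
            previous = value
    return [total / path_count for total in totals]


# For a linear model the estimate equals each feature's weighted value.
attributions = sampled_shapley(
    lambda x: 2.0 * x[0] + 3.0 * x[1], [1.0, 1.0], [0.0, 0.0], path_count=10
)
```

A larger pathCount reduces the variance of the estimate at the cost of more model evaluations per explained instance.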

IntegratedGradientsAttribution

An attribution method that computes the Aumann-Shapley value taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1703.01365

Fields
stepCount integer

Required. The number of steps for approximating the path integral. A good starting value is 50; gradually increase it until the sum-to-diff property is within the desired error range.

The valid range of its value is [1, 100], inclusive.

smoothGradConfig object (SmoothGradConfig)

Config for SmoothGrad approximation of gradients.

When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: https://arxiv.org/pdf/1706.03825.pdf

blurBaselineConfig object (BlurBaselineConfig)

Config for IG with blur baseline.

When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: https://arxiv.org/abs/2004.03383

JSON representation
{
  "stepCount": integer,
  "smoothGradConfig": {
    object (SmoothGradConfig)
  },
  "blurBaselineConfig": {
    object (BlurBaselineConfig)
  }
}
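
The role of stepCount and the sum-to-diff property can be illustrated with a small Integrated Gradients sketch that approximates the path integral with a midpoint Riemann sum. This is a generic illustration under stated assumptions, not the service's implementation: the function name is hypothetical, and the caller supplies an analytic gradient function.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    gradient: Callable[[List[float]], List[float]],
    instance: Sequence[float],
    baseline: Sequence[float],
    step_count: int,
) -> List[float]:
    """Approximate the path integral from baseline to instance in step_count steps."""
    if not 1 <= step_count <= 100:
        raise ValueError("stepCount must be in [1, 100]")
    sums = [0.0] * len(instance)
    for step in range(step_count):
        alpha = (step + 0.5) / step_count  # midpoint of each sub-interval
        point = [b + alpha * (x - b) for x, b in zip(instance, baseline)]
        for i, g in enumerate(gradient(point)):
            sums[i] += g
    return [(x - b) * s / step_count for x, b, s in zip(instance, baseline, sums)]


def model(x: Sequence[float]) -> float:
    return x[0] ** 3 + 2.0 * x[1]


def model_gradient(x: Sequence[float]) -> List[float]:
    return [3.0 * x[0] ** 2, 2.0]


instance, baseline = [1.0, 2.0], [0.0, 0.0]
attributions = integrated_gradients(model_gradient, instance, baseline, step_count=50)
# Sum-to-diff: attributions should add up to model(instance) - model(baseline).
sum_to_diff_error = abs(sum(attributions) - (model(instance) - model(baseline)))
```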

SmoothGradConfig

Config for SmoothGrad approximation of gradients.

When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: https://arxiv.org/pdf/1706.03825.pdf

Fields
noisySampleCount integer

The number of gradient samples to use for approximation. The higher this number, the more accurate the gradient is, but the runtime complexity increases by this factor as well. Valid range of its value is [1, 50]. Defaults to 3.

GradientNoiseSigma Union type
Represents the standard deviation of the gaussian kernel that will be used to add noise to the interpolated inputs prior to computing gradients. GradientNoiseSigma can be only one of the following:
noiseSigma number

This is a single float value and will be used to add noise to all the features. Use this field when all features are normalized to have the same distribution: scale to range [0, 1], [-1, 1] or z-scoring, where features are normalized to have 0-mean and 1-variance. Learn more about normalization.

For best results the recommended value is about 10% - 20% of the standard deviation of the input feature. Refer to section 3.2 of the SmoothGrad paper: https://arxiv.org/pdf/1706.03825.pdf. Defaults to 0.1.

If the distribution is different per feature, set featureNoiseSigma instead for each feature.

featureNoiseSigma object (FeatureNoiseSigma)

This is similar to noiseSigma, but provides additional flexibility. A separate noise sigma can be provided for each feature, which is useful if their distributions are different. No noise is added to features that are not set. If this field is unset, noiseSigma will be used for all features.

JSON representation
{
  "noisySampleCount": integer,

  // GradientNoiseSigma
  "noiseSigma": number,
  "featureNoiseSigma": {
    object (FeatureNoiseSigma)
  }
  // Union type
}
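
A minimal SmoothGrad sketch showing how noisySampleCount and noiseSigma interact: the gradient is averaged over several copies of the input perturbed with Gaussian noise. The function name and the gradient callable are assumptions, and the noise is added directly to the input here rather than to interpolated inputs, which simplifies the description above.

```python
import random
from typing import Callable, List, Sequence


def smooth_grad(
    gradient: Callable[[List[float]], List[float]],
    instance: Sequence[float],
    noise_sigma: float = 0.1,
    noisy_sample_count: int = 3,
    seed: int = 0,
) -> List[float]:
    """Average gradients over noisy copies of the input."""
    if not 1 <= noisy_sample_count <= 50:
        raise ValueError("noisySampleCount must be in [1, 50]")
    rng = random.Random(seed)
    totals = [0.0] * len(instance)
    for _ in range(noisy_sample_count):
        # Perturb every feature with zero-mean Gaussian noise of the same sigma.
        noisy = [x + rng.gauss(0.0, noise_sigma) for x in instance]
        for i, g in enumerate(gradient(noisy)):
            totals[i] += g
    return [total / noisy_sample_count for total in totals]


# With a constant gradient the average is unaffected by the noise.
averaged = smooth_grad(lambda point: [2.0, -1.0], [0.5, 0.25])
```

The runtime grows linearly with noisySampleCount because the gradient is evaluated once per noisy sample.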

FeatureNoiseSigma

Noise sigma by features. Noise sigma represents the standard deviation of the gaussian kernel that will be used to add noise to interpolated inputs prior to computing gradients.

Fields
noiseSigma[] object (NoiseSigmaForFeature)

Noise sigma per feature. No noise is added to features that are not set.

JSON representation
{
  "noiseSigma": [
    {
      object (NoiseSigmaForFeature)
    }
  ]
}

NoiseSigmaForFeature

Noise sigma for a single feature.

Fields
name string

The name of the input feature for which noise sigma is provided. The features are defined in explanation metadata inputs.

sigma number

This represents the standard deviation of the Gaussian kernel that will be used to add noise to the feature prior to computing gradients. Similar to noiseSigma but represents the noise added to the current feature. Defaults to 0.1.

JSON representation
{
  "name": string,
  "sigma": number
}

BlurBaselineConfig

Config for blur baseline.

When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: https://arxiv.org/abs/2004.03383

Fields
maxBlurSigma number

The standard deviation of the blur kernel for the blurred baseline. The same blurring parameter is used for both the height and the width dimension. If not set, the method defaults to the zero (i.e. black for images) baseline.

JSON representation
{
  "maxBlurSigma": number
}

XraiAttribution

An explanation method that redistributes Integrated Gradients attributions to segmented regions, taking advantage of the model's fully differentiable structure. Refer to this paper for more details: https://arxiv.org/abs/1906.02825

Supported only by image Models.

Fields
stepCount integer

Required. The number of steps for approximating the path integral. A good starting value is 50; gradually increase it until the sum-to-diff property is met within the desired error range.

The valid range of its value is [1, 100], inclusive.

smoothGradConfig object (SmoothGradConfig)

Config for SmoothGrad approximation of gradients.

When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: https://arxiv.org/pdf/1706.03825.pdf

blurBaselineConfig object (BlurBaselineConfig)

Config for XRAI with blur baseline.

When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: https://arxiv.org/abs/2004.03383

JSON representation
{
  "stepCount": integer,
  "smoothGradConfig": {
    object (SmoothGradConfig)
  },
  "blurBaselineConfig": {
    object (BlurBaselineConfig)
  }
}

Examples

Example-based explainability that returns the nearest neighbors from the provided dataset.

Fields
gcsSource object (GcsSource)

The Cloud Storage locations that contain the instances to be indexed for approximate nearest neighbor search.

neighborCount integer

The number of neighbors to return when querying for examples.

source Union type
source can be only one of the following:
exampleGcsSource object (ExampleGcsSource)

The Cloud Storage input instances.

config Union type
config can be only one of the following:
nearestNeighborSearchConfig value (Value format)

The full configuration for the generated index. The semantics are the same as metadata and should match NearestNeighborSearchConfig.

presets object (Presets)

Simplified preset configuration, which automatically sets configuration values based on the desired query speed-precision trade-off and modality.

JSON representation
{
  "gcsSource": {
    object (GcsSource)
  },
  "neighborCount": integer,

  // source
  "exampleGcsSource": {
    object (ExampleGcsSource)
  }
  // Union type

  // config
  "nearestNeighborSearchConfig": value,
  "presets": {
    object (Presets)
  }
  // Union type
}

ExampleGcsSource

The Cloud Storage input instances.

Fields
dataFormat enum (DataFormat)

The format in which instances are given. If not specified, JSONL format is assumed. Currently only JSONL format is supported.

gcsSource object (GcsSource)

The Cloud Storage location for the input instances.

JSON representation
{
  "dataFormat": enum (DataFormat),
  "gcsSource": {
    object (GcsSource)
  }
}

DataFormat

The format of the input example instances.

Enums
DATA_FORMAT_UNSPECIFIED Format unspecified, used when unset.
JSONL Examples are stored in JSONL files.

Presets

Preset configuration for example-based explanations.

Fields
modality enum (Modality)

The modality of the uploaded model, which automatically configures the distance measurement and feature normalization for the underlying example index and queries. If your model does not precisely fit one of these types, it is okay to choose the closest type.

query enum (Query)

Preset option controlling parameters for speed-precision trade-off when querying for examples. If omitted, defaults to PRECISE.

JSON representation
{
  "modality": enum (Modality),
  "query": enum (Query)
}

Query

Preset option controlling parameters for query speed-precision trade-off.

Enums
PRECISE More precise neighbors as a trade-off against slower response.
FAST Faster response as a trade-off against less precise neighbors.

Modality

Preset option controlling parameters for different modalities.

Enums
MODALITY_UNSPECIFIED Should not be set. Added as a recommended best practice for enums.
IMAGE IMAGE modality
TEXT TEXT modality
TABULAR TABULAR modality

ExplanationMetadata

Metadata describing the Model's input and output for explanation.

Fields
inputs map (key: string, value: object (InputMetadata))

Required. Map from feature names to feature input metadata. Keys are the name of the features. Values are the specification of the feature.

An empty InputMetadata is valid. It describes a text feature which has the name specified as the key in ExplanationMetadata.inputs. The baseline of the empty feature is chosen by Vertex AI.

For Vertex AI-provided Tensorflow images, the key can be any friendly name of the feature. Once specified, featureAttributions are keyed by this key (if not grouped with another feature).

For custom images, the key must match with the key in instance.

outputs map (key: string, value: object (OutputMetadata))

Required. Map from output names to output metadata.

For Vertex AI-provided Tensorflow images, keys can be any user defined string that consists of any UTF-8 characters.

For custom images, keys are the name of the output field in the prediction to be explained.

Currently only one key is allowed.

featureAttributionsSchemaUri string

Points to a YAML file stored on Google Cloud Storage describing the format of the feature attributions. The schema is defined as an OpenAPI 3.0.2 Schema Object. AutoML tabular Models always have this field populated by Vertex AI. Note: The URI given on output may be different, including the URI scheme, than the one given on input. The output URI will point to a location where the user has read access only.

latentSpaceSource string

Name of the source to generate embeddings for example-based explanations.

JSON representation
{
  "inputs": {
    string: {
      object (InputMetadata)
    },
    ...
  },
  "outputs": {
    string: {
      object (OutputMetadata)
    },
    ...
  },
  "featureAttributionsSchemaUri": string,
  "latentSpaceSource": string
}

InputMetadata

Metadata of the input of a feature.

Fields other than InputMetadata.input_baselines are applicable only for Models that are using Vertex AI-provided images for Tensorflow.

Fields
inputBaselines[] value (Value format)

Baseline inputs for this feature.

If no baseline is specified, Vertex AI chooses the baseline for this feature. If multiple baselines are specified, Vertex AI returns the average attributions across them in Attribution.feature_attributions.

For Vertex AI-provided Tensorflow images (both 1.x and 2.x), the shape of each baseline must match the shape of the input tensor. If a scalar is provided, Vertex AI broadcasts it to the same shape as the input tensor.

For custom images, the element of the baselines must be in the same format as the feature's input in the instance[]. The schema of any single instance may be specified via Endpoint's DeployedModels' Model's PredictSchemata's instanceSchemaUri.

inputTensorName string

Name of the input tensor for this feature. Required, and applicable only to Vertex AI-provided images for Tensorflow.

encoding enum (Encoding)

Defines how the feature is encoded into the input tensor. Defaults to IDENTITY.

modality string

Modality of the feature. Valid values are: numeric, image. Defaults to numeric.

featureValueDomain object (FeatureValueDomain)

The domain details of the input feature value, such as the min/max and, if the feature is normalized, the original mean or standard deviation.

indicesTensorName string

Specifies the index of the values of the input tensor. Required when the input tensor is a sparse representation. Refer to Tensorflow documentation for more details: https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor.

denseShapeTensorName string

Specifies the shape of the values of the input if the input is a sparse representation. Refer to Tensorflow documentation for more details: https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor.

indexFeatureMapping[] string

A list of feature names for each index in the input tensor. Required when InputMetadata.encoding is BAG_OF_FEATURES, BAG_OF_FEATURES_SPARSE, or INDICATOR.

encodedTensorName string

Encoded tensor is a transformation of the input tensor. Must be provided if choosing Integrated Gradients attribution or XRAI attribution and the input tensor is not differentiable.

An encoded tensor is generated if the input tensor is encoded by a lookup table.

encodedBaselines[] value (Value format)

A list of baselines for the encoded tensor.

The shape of each baseline should match the shape of the encoded tensor. If a scalar is provided, Vertex AI broadcasts to the same shape as the encoded tensor.

visualization object (Visualization)

Visualization configurations for image explanation.

groupName string

Name of the group that the input belongs to. Features with the same group name will be treated as one feature when computing attributions. Features grouped together can have different shapes in value. If provided, there will be one single attribution generated in Attribution.feature_attributions, keyed by the group name.

JSON representation
{
  "inputBaselines": [
    value
  ],
  "inputTensorName": string,
  "encoding": enum (Encoding),
  "modality": string,
  "featureValueDomain": {
    object (FeatureValueDomain)
  },
  "indicesTensorName": string,
  "denseShapeTensorName": string,
  "indexFeatureMapping": [
    string
  ],
  "encodedTensorName": string,
  "encodedBaselines": [
    value
  ],
  "visualization": {
    object (Visualization)
  },
  "groupName": string
}

Encoding

Defines how a feature is encoded. Defaults to IDENTITY.

Enums
ENCODING_UNSPECIFIED Default value. This is the same as IDENTITY.
IDENTITY The tensor represents one feature.
BAG_OF_FEATURES

The tensor represents a bag of features where each index maps to a feature. InputMetadata.index_feature_mapping must be provided for this encoding. For example:

input = [27, 6.0, 150]
indexFeatureMapping = ["age", "height", "weight"]
BAG_OF_FEATURES_SPARSE

The tensor represents a bag of features where each index maps to a feature. A zero value in the tensor indicates that the feature does not exist. InputMetadata.index_feature_mapping must be provided for this encoding. For example:

input = [2, 0, 5, 0, 1]
indexFeatureMapping = ["a", "b", "c", "d", "e"]
INDICATOR

The tensor is a list of binaries representing whether a feature exists or not (1 indicates existence). InputMetadata.index_feature_mapping must be provided for this encoding. For example:

input = [1, 0, 1, 0, 1]
indexFeatureMapping = ["a", "b", "c", "d", "e"]
COMBINED_EMBEDDING

The tensor is encoded into a 1-dimensional array represented by an encoded tensor. InputMetadata.encoded_tensor_name must be provided for this encoding. For example:

input = ["This", "is", "a", "test", "."]
encoded = [0.1, 0.2, 0.3, 0.4, 0.5]
CONCAT_EMBEDDING

Select this encoding when the input tensor is encoded into a 2-dimensional array represented by an encoded tensor. InputMetadata.encoded_tensor_name must be provided for this encoding. The first dimension of the encoded tensor's shape is the same as the input tensor's shape. For example:

input = ["This", "is", "a", "test", "."]
encoded = [[0.1, 0.2, 0.3, 0.4, 0.5],
           [0.2, 0.1, 0.4, 0.3, 0.5],
           [0.5, 0.1, 0.3, 0.5, 0.4],
           [0.5, 0.3, 0.1, 0.2, 0.4],
           [0.4, 0.3, 0.2, 0.5, 0.1]]
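
The index-based encodings above can be decoded back into named features with a short sketch. The function name is hypothetical; the inputs are the example tensors and mappings given for BAG_OF_FEATURES, BAG_OF_FEATURES_SPARSE, and INDICATOR.

```python
from typing import List, Sequence, Union


def features_from_tensor(
    encoding: str,
    values: Sequence[float],
    index_feature_mapping: Sequence[str],
) -> Union[dict, List[str]]:
    """Map tensor positions back to feature names for the index-based encodings."""
    if len(values) != len(index_feature_mapping):
        raise ValueError("indexFeatureMapping must name every index in the tensor")
    pairs = list(zip(index_feature_mapping, values))
    if encoding == "BAG_OF_FEATURES":
        return dict(pairs)
    if encoding == "BAG_OF_FEATURES_SPARSE":
        # Zero means the feature does not exist, so it is dropped.
        return {name: value for name, value in pairs if value != 0}
    if encoding == "INDICATOR":
        # 1 marks a feature that is present.
        return [name for name, value in pairs if value == 1]
    raise ValueError(f"encoding {encoding} does not use indexFeatureMapping")


bag = features_from_tensor("BAG_OF_FEATURES", [27, 6.0, 150], ["age", "height", "weight"])
sparse = features_from_tensor("BAG_OF_FEATURES_SPARSE", [2, 0, 5, 0, 1], ["a", "b", "c", "d", "e"])
present = features_from_tensor("INDICATOR", [1, 0, 1, 0, 1], ["a", "b", "c", "d", "e"])
```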

FeatureValueDomain

Domain details of the input feature value. Provides numeric information about the feature, such as its range (min, max). If the feature has been pre-processed, for example with z-scoring, then it provides information about how to recover the original feature. For example, if the input feature is an image and it has been pre-processed to obtain 0-mean and stddev = 1 values, then originalMean and originalStddev refer to the mean and stddev of the original feature (e.g. image tensor) from which the input feature (with mean = 0 and stddev = 1) was obtained.

Fields
minValue number

The minimum permissible value for this feature.

maxValue number

The maximum permissible value for this feature.

originalMean number

If this input feature has been normalized to a mean value of 0, the originalMean specifies the mean value of the domain prior to normalization.

originalStddev number

If this input feature has been normalized to a standard deviation of 1.0, the originalStddev specifies the standard deviation of the domain prior to normalization.

JSON representation
{
  "minValue": number,
  "maxValue": number,
  "originalMean": number,
  "originalStddev": number
}
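
For a z-scored feature, originalMean and originalStddev are what is needed to recover the original value. The sketch below shows the standard z-score arithmetic with hypothetical function names and made-up numbers.

```python
def normalize(value: float, original_mean: float, original_stddev: float) -> float:
    """Z-score a raw feature value."""
    return (value - original_mean) / original_stddev


def denormalize(value: float, original_mean: float, original_stddev: float) -> float:
    """Recover the raw feature value from its z-scored form."""
    return value * original_stddev + original_mean


# Illustrative domain: a feature with mean 120 and standard deviation 30.
domain = {"originalMean": 120.0, "originalStddev": 30.0}
z = normalize(150.0, domain["originalMean"], domain["originalStddev"])
raw = denormalize(z, domain["originalMean"], domain["originalStddev"])
```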

Visualization

Visualization configurations for image explanation.

Fields
type enum (Type)

Type of the image visualization. Only applicable to Integrated Gradients attribution. OUTLINES shows regions of attribution, while PIXELS shows per-pixel attribution. Defaults to OUTLINES.

polarity enum (Polarity)

Whether to only highlight pixels with positive contributions, negative or both. Defaults to POSITIVE.

colorMap enum (ColorMap)

The color scheme used for the highlighted areas.

Defaults to PINK_GREEN for Integrated Gradients attribution, which shows positive attributions in green and negative in pink.

Defaults to VIRIDIS for XRAI attribution, which highlights the most influential regions in yellow and the least influential in blue.

clipPercentUpperbound number

Excludes attributions above the specified percentile from the highlighted areas. Using the clipPercentUpperbound and clipPercentLowerbound together can be useful for filtering out noise and making it easier to see areas of strong attribution. Defaults to 99.9.

clipPercentLowerbound number

Excludes attributions below the specified percentile from the highlighted areas. Defaults to 62.

overlayType enum (OverlayType)

How the original image is displayed in the visualization. Adjusting the overlay can help increase visual clarity if the original image makes it difficult to view the visualization. Defaults to NONE.

JSON representation
{
  "type": enum (Type),
  "polarity": enum (Polarity),
  "colorMap": enum (ColorMap),
  "clipPercentUpperbound": number,
  "clipPercentLowerbound": number,
  "overlayType": enum (OverlayType)
}
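
The effect of clipPercentLowerbound and clipPercentUpperbound can be sketched as a percentile filter over attribution values. This is an illustration only: the function names are hypothetical, and the nearest-rank percentile definition is an assumption, because this reference does not specify how percentiles are computed.

```python
import math
from typing import List, Optional, Sequence


def nearest_rank_percentile(sorted_values: Sequence[float], percent: float) -> float:
    """Nearest-rank percentile of an ascending sequence (assumed definition)."""
    rank = max(1, math.ceil(percent / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def clip_attributions(
    attributions: Sequence[float],
    lower_percent: float = 62.0,
    upper_percent: float = 99.9,
) -> List[Optional[float]]:
    """Keep attributions between the two percentiles; replace the rest with None."""
    ordered = sorted(attributions)
    low = nearest_rank_percentile(ordered, lower_percent)
    high = nearest_rank_percentile(ordered, upper_percent)
    return [a if low <= a <= high else None for a in attributions]


highlighted = clip_attributions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```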

Type

Type of the image visualization. Only applicable to Integrated Gradients attribution.

Enums
TYPE_UNSPECIFIED Should not be used.
PIXELS Shows which pixel contributed to the image prediction.
OUTLINES Shows which region contributed to the image prediction by outlining the region.

Polarity

Whether to only highlight pixels with positive contributions, negative or both. Defaults to POSITIVE.

Enums
POLARITY_UNSPECIFIED Default value. This is the same as POSITIVE.
POSITIVE Highlights the pixels/outlines that were most influential to the model's prediction.
NEGATIVE Setting polarity to negative highlights areas that do not lead to the model's current prediction.
BOTH Shows both positive and negative attributions.

ColorMap

The color scheme used for highlighting areas.

Enums
COLOR_MAP_UNSPECIFIED Should not be used.
PINK_GREEN Positive: green. Negative: pink.
VIRIDIS Viridis color map: A perceptually uniform color mapping which is easier to see by those with colorblindness and progresses from yellow to green to blue. Positive: yellow. Negative: blue.
RED Positive: red. Negative: red.
GREEN Positive: green. Negative: green.
RED_GREEN Positive: green. Negative: red.
PINK_WHITE_GREEN PiYG palette.

OverlayType

How the original image is displayed in the visualization.

Enums
OVERLAY_TYPE_UNSPECIFIED Default value. This is the same as NONE.
NONE No overlay.
ORIGINAL The attributions are shown on top of the original image.
GRAYSCALE The attributions are shown on top of a grayscaled version of the original image.
MASK_BLACK The attributions are used as a mask to reveal predictive parts of the image and hide the un-predictive parts.

OutputMetadata

Metadata of the prediction output to be explained.

Fields
outputTensorName string

Name of the output tensor. Required, and only applicable to Vertex AI-provided images for TensorFlow.

display_name_mapping Union type

Defines how to map Attribution.output_index to Attribution.output_display_name.

If neither field is specified, Attribution.output_display_name will not be populated. display_name_mapping can be only one of the following:

indexDisplayNameMapping value (Value format)

Static mapping between the index and display name.

Use this if the outputs are a deterministic n-dimensional array, e.g. a list of scores of all the classes in a pre-defined order for a multi-classification Model. It is not suitable if the outputs are non-deterministic, e.g. the Model produces top-k classes or sorts the outputs by their values.

The shape of the value must be an n-dimensional array of strings. The number of dimensions must match that of the outputs to be explained. The Attribution.output_display_name is populated by looking up Attribution.output_index in the mapping.

displayNameMappingKey string

Specify a field name in the prediction to look for the display name.

Use this if the prediction contains the display names for the outputs.

The display names in the prediction must have the same shape as the outputs, so that the display name for a specific output can be located by Attribution.output_index.

JSON representation
{
  "outputTensorName": string,

  // display_name_mapping
  "indexDisplayNameMapping": value,
  "displayNameMappingKey": string
  // Union type
}
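The lookup described for indexDisplayNameMapping can be sketched as a walk through a nested list. This is a minimal illustration that assumes Attribution.output_index is a list of integers, one per dimension; the function name lookup_display_name and the sample class names are hypothetical.

```python
def lookup_display_name(mapping, output_index):
    """Follow output_index through an n-dimensional nested list of strings."""
    node = mapping
    for position in output_index:
        node = node[position]
    if not isinstance(node, str):
        # The index stopped before reaching a leaf string.
        raise ValueError("output_index has fewer dimensions than the mapping")
    return node


# 1-D mapping: class scores returned in a fixed, pre-defined order.
class_names = ["cat", "dog", "bird"]
print(lookup_display_name(class_names, [1]))  # dog

# 2-D mapping: one display name per (row, column) output.
grid = [["a0", "a1"], ["b0", "b1"]]
print(lookup_display_name(grid, [1, 0]))  # b0
```

A static mapping like this only works when position i always means the same class, which is why it does not suit top-k or sorted outputs.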

OutputInfo

Further describes this job's output. Supplements outputConfig.

Fields
bigqueryOutputTable string

Output only. The name of the BigQuery table created, in predictions_<timestamp> format, into which the prediction output is written. Can be used by UI to generate the BigQuery output path, for example.

output_location Union type
The output location into which prediction output is written. output_location can be only one of the following:
gcsOutputDirectory string

Output only. The full path of the Cloud Storage directory created, into which the prediction output is written.

bigqueryOutputDataset string

Output only. The path of the BigQuery dataset created, in bq://projectId.bqDatasetId format, into which the prediction output is written.

JSON representation
{
  "bigqueryOutputTable": string,

  // output_location
  "gcsOutputDirectory": string,
  "bigqueryOutputDataset": string
  // Union type
}
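One way a client might combine bigqueryOutputDataset and bigqueryOutputTable into a single table reference is sketched below. The helper name bigquery_table_path, the dotted output form, and the sample values are assumptions for illustration only.

```python
def bigquery_table_path(output_info):
    """Join dataset path and table name; returns None if either is absent."""
    dataset = output_info.get("bigqueryOutputDataset")
    table = output_info.get("bigqueryOutputTable")
    if not dataset or not table:
        return None
    prefix = "bq://"
    if not dataset.startswith(prefix):
        raise ValueError("expected a dataset path in bq://projectId.bqDatasetId format")
    # Drop the scheme and append the table name.
    return f"{dataset[len(prefix):]}.{table}"


# Hypothetical OutputInfo as it might appear on a finished job.
info = {
    "bigqueryOutputDataset": "bq://my-project.my_dataset",
    "bigqueryOutputTable": "predictions_20240101",
}
print(bigquery_table_path(info))  # my-project.my_dataset.predictions_20240101
```

For jobs that write to Cloud Storage, only gcsOutputDirectory is set and the helper returns None.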

ResourcesConsumed

Statistics information about resource consumption.

Fields
replicaHours number

Output only. The number of replica hours used. Note that many replicas may run in parallel, and additionally any given work may be queued for some time. Therefore this value is not strictly related to wall time.

JSON representation
{
  "replicaHours": number
}

CompletionStats

Success and error statistics of processing multiple entities (for example, DataItems or structured data rows) in batch.

Fields
successfulCount string (int64 format)

Output only. The number of entities that have been processed successfully.

failedCount string (int64 format)

Output only. The number of entities for which any error was encountered.

incompleteCount string (int64 format)

Output only. When enough errors are encountered, a job, pipeline, or operation may fail as a whole. This field is the number of entities for which processing had not finished (in either a successful or a failed state). Set to -1 if the number is unknown (for example, the operation failed before the total entity number could be collected).

successfulForecastPointCount string (int64 format)

Output only. The number of the successful forecast points that are generated by the forecasting model. This is ONLY used by the forecasting batch prediction.

JSON representation
{
  "successfulCount": string,
  "failedCount": string,
  "incompleteCount": string,
  "successfulForecastPointCount": string
}
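Because the int64 counters arrive as JSON strings, a client has to convert them before doing arithmetic. The sketch below shows one way to do that and to handle the -1 sentinel for incompleteCount; the function name and the treatment of absent fields as zero are assumptions.

```python
def summarize_completion_stats(stats):
    """Convert string counters to ints and derive a total when it is known."""
    successful = int(stats.get("successfulCount", "0"))
    failed = int(stats.get("failedCount", "0"))
    incomplete = int(stats.get("incompleteCount", "0"))
    if incomplete == -1:
        # -1 means the number of unfinished entities is unknown.
        return {"successful": successful, "failed": failed,
                "incomplete": None, "total": None}
    return {"successful": successful, "failed": failed,
            "incomplete": incomplete,
            "total": successful + failed + incomplete}


print(summarize_completion_stats(
    {"successfulCount": "980", "failedCount": "15", "incompleteCount": "5"}
))
```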

ModelMonitoringConfig

The model monitoring configuration used for Batch Prediction Job.

Fields
objectiveConfigs[] object (ModelMonitoringObjectiveConfig)

Model monitoring objective config.

alertConfig object (ModelMonitoringAlertConfig)

Model monitoring alert config.

analysisInstanceSchemaUri string

YAML schema file URI in Cloud Storage describing the format of a single instance that you want TensorFlow Data Validation (TFDV) to analyze.

If there are any data type differences between the predict instance and the TFDV instance, this field can be used to override the schema. For models trained with Vertex AI, this field must be set, with all the fields of the predict instance formatted as strings.

statsAnomaliesBaseDirectory object (GcsDestination)

A Google Cloud Storage location where batch prediction model monitoring writes statistics and anomalies. If not provided, a folder will be created in the customer project to hold statistics and anomalies.

JSON representation
{
  "objectiveConfigs": [
    {
      object (ModelMonitoringObjectiveConfig)
    }
  ],
  "alertConfig": {
    object (ModelMonitoringAlertConfig)
  },
  "analysisInstanceSchemaUri": string,
  "statsAnomaliesBaseDirectory": {
    object (GcsDestination)
  }
}

ModelMonitoringObjectiveConfig

The objective configuration for model monitoring, including the information needed to detect anomalies for one particular model.

Fields
trainingDataset object (TrainingDataset)

Training dataset for models. This field has to be set only if TrainingPredictionSkewDetectionConfig is specified.

trainingPredictionSkewDetectionConfig object (TrainingPredictionSkewDetectionConfig)

The config for skew between training data and prediction data.

predictionDriftDetectionConfig object (PredictionDriftDetectionConfig)

The config for drift of prediction data.

explanationConfig object (ExplanationConfig)

The config for integrating with Vertex Explainable AI.

JSON representation
{
  "trainingDataset": {
    object (TrainingDataset)
  },
  "trainingPredictionSkewDetectionConfig": {
    object (TrainingPredictionSkewDetectionConfig)
  },
  "predictionDriftDetectionConfig": {
    object (PredictionDriftDetectionConfig)
  },
  "explanationConfig": {
    object (ExplanationConfig)
  }
}

TrainingDataset

Training Dataset information.

Fields
dataFormat string

Data format of the dataset. Only applicable if the input is from Google Cloud Storage. The possible formats are:

"tf-record" The source file is a TFRecord file.

"csv" The source file is a CSV file.

"jsonl" The source file is a JSONL file.

targetField string

The target field name the model is to predict. This field will be excluded when doing Predict and (or) Explain for the training data.

loggingSamplingStrategy object (SamplingStrategy)

Strategy to sample data from Training Dataset. If not set, we process the whole dataset.

data_source Union type
data_source can be only one of the following:
dataset string

The resource name of the Dataset used to train this Model.

gcsSource object (GcsSource)

The Google Cloud Storage uri of the unmanaged Dataset used to train this Model.

bigquerySource object (BigQuerySource)

The BigQuery table of the unmanaged Dataset used to train this Model.

JSON representation
{
  "dataFormat": string,
  "targetField": string,
  "loggingSamplingStrategy": {
    object (SamplingStrategy)
  },

  // data_source
  "dataset": string,
  "gcsSource": {
    object (GcsSource)
  },
  "bigquerySource": {
    object (BigQuerySource)
  }
  // Union type
}
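Since data_source is a union, at most one of dataset, gcsSource, and bigquerySource may appear in a TrainingDataset body. A client-side check might look like the sketch below; the function name and the sample bucket path are hypothetical.

```python
DATA_SOURCE_FIELDS = ("dataset", "gcsSource", "bigquerySource")


def check_data_source(training_dataset):
    """Return the data_source member that is set, or None if none is set."""
    present = [name for name in DATA_SOURCE_FIELDS if name in training_dataset]
    if len(present) > 1:
        raise ValueError(f"data_source accepts only one of {present}")
    return present[0] if present else None


# Hypothetical TrainingDataset reading CSV files from Cloud Storage.
training_dataset = {
    "dataFormat": "csv",
    "targetField": "label",
    "gcsSource": {"uris": ["gs://my-bucket/train.csv"]},
}
print(check_data_source(training_dataset))  # gcsSource
```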

SamplingStrategy

Sampling Strategy for logging, can be for both training and prediction dataset.

Fields
randomSampleConfig object (RandomSampleConfig)

Random sample config. Will support more sampling strategies later.

JSON representation
{
  "randomSampleConfig": {
    object (RandomSampleConfig)
  }
}

RandomSampleConfig

Requests are randomly selected.

Fields
sampleRate number

Sample rate, in the range (0, 1].

JSON representation
{
  "sampleRate": number
}
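The effect of sampleRate can be pictured with a small simulation. The service's actual sampling procedure is not described here, so the sketch below uses independent per-row (Bernoulli) selection as one plausible interpretation; the function name and the seed parameter are assumptions.

```python
import random


def sample_rows(rows, sample_rate, seed=None):
    """Keep each row independently with probability sample_rate."""
    if not (0 < sample_rate <= 1):
        raise ValueError("sampleRate must be in the range (0, 1]")
    rng = random.Random(seed)
    # rng.random() is in [0, 1), so a rate of 1 keeps every row.
    return [row for row in rows if rng.random() < sample_rate]


rows = list(range(1000))
kept = sample_rows(rows, 0.1, seed=7)
print(len(kept))  # roughly 100; the exact count depends on the seed
```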

TrainingPredictionSkewDetectionConfig

The config for Training & Prediction data skew detection. It specifies the training dataset sources and the skew detection parameters.

Fields
skewThresholds map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. If a feature needs to be monitored for skew, a value threshold must be configured for that feature. The threshold here is against feature distribution distance between the training and prediction feature.

attributionScoreSkewThresholds map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. The threshold here is against attribution score distance between the training and prediction feature.

defaultSkewThreshold object (ThresholdConfig)

Skew anomaly detection threshold used by all features. When the per-feature thresholds are not set, this field can be used to specify a threshold for all features.

JSON representation
{
  "skewThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "attributionScoreSkewThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "defaultSkewThreshold": {
    object (ThresholdConfig)
  }
}
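The relationship between skewThresholds and defaultSkewThreshold can be read as a per-feature lookup with a fallback. The sketch below encodes that reading (a per-feature entry wins, otherwise the default applies); this precedence, the function name, and the feature names are assumptions for illustration.

```python
def resolve_skew_threshold(config, feature):
    """Return the threshold value that applies to a feature, or None."""
    per_feature = config.get("skewThresholds", {})
    chosen = per_feature.get(feature) or config.get("defaultSkewThreshold")
    return chosen.get("value") if chosen else None


# Hypothetical config: one explicit threshold plus a default for the rest.
config = {
    "skewThresholds": {"age": {"value": 0.3}},
    "defaultSkewThreshold": {"value": 0.1},
}
print(resolve_skew_threshold(config, "age"))     # 0.3
print(resolve_skew_threshold(config, "income"))  # 0.1
```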

ThresholdConfig

The config for feature monitoring threshold.

Fields
threshold Union type
threshold can be only one of the following:
value number

Specify a threshold value that can trigger the alert. If this threshold config is for feature distribution distance:

1. For a categorical feature, the distribution distance is calculated by the L-infinity norm.
2. For a numerical feature, the distribution distance is calculated by the Jensen–Shannon divergence.

Each feature must have a non-zero threshold if it needs to be monitored. Otherwise no alert will be triggered for that feature.

JSON representation
{

  // threshold
  "value": number
  // Union type
}
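To make the two distance measures concrete, the sketch below computes an L-infinity distance between categorical frequency tables and a Jensen–Shannon divergence between numerical histograms, then compares the result with a ThresholdConfig. The logarithm base (2, which bounds the divergence to [0, 1]), the strict greater-than comparison, and all function names are assumptions; how the service bins values and compares them is not specified here.

```python
import math


def l_infinity_distance(p, q):
    """Largest absolute difference in frequency across all categories."""
    categories = set(p) | set(q)
    return max((abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories),
               default=0.0)


def jensen_shannon_divergence(p, q):
    """JSD of two equal-length probability lists, using log base 2."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


def exceeds_threshold(distance, threshold_config):
    """A zero or missing threshold never triggers an alert."""
    value = threshold_config.get("value", 0)
    return value > 0 and distance > value


training = {"red": 0.5, "blue": 0.5}
serving = {"red": 0.8, "blue": 0.2}
distance = l_infinity_distance(training, serving)
print(exceeds_threshold(distance, {"value": 0.25}))  # True
print(jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0]))  # 1.0
```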

PredictionDriftDetectionConfig

The config for Prediction data drift detection.

Fields
driftThresholds map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. If a feature needs to be monitored for drift, a value threshold must be configured for that feature. The threshold here is against feature distribution distance between different time windows.

attributionScoreDriftThresholds map (key: string, value: object (ThresholdConfig))

Key is the feature name and value is the threshold. The threshold here is against attribution score distance between different time windows.

defaultDriftThreshold object (ThresholdConfig)

Drift anomaly detection threshold used by all features. When the per-feature thresholds are not set, this field can be used to specify a threshold for all features.

JSON representation
{
  "driftThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "attributionScoreDriftThresholds": {
    string: {
      object (ThresholdConfig)
    },
    ...
  },
  "defaultDriftThreshold": {
    object (ThresholdConfig)
  }
}

ExplanationConfig

The config for integrating with Vertex Explainable AI. Only applicable if the Model has explanationSpec populated.

Fields
enableFeatureAttributes boolean

Whether to analyze the Vertex Explainable AI feature attribution scores. If set to true, Vertex AI will log the feature attributions from the explain response and perform skew/drift detection on them.

explanationBaseline object (ExplanationBaseline)

Predictions generated by the BatchPredictionJob using baseline dataset.

JSON representation
{
  "enableFeatureAttributes": boolean,
  "explanationBaseline": {
    object (ExplanationBaseline)
  }
}

ExplanationBaseline

Output from BatchPredictionJob for Model Monitoring baseline dataset, which can be used to generate baseline attribution scores.

Fields
predictionFormat enum (PredictionFormat)

The storage format of the predictions generated by the BatchPrediction job.

destination Union type
The configuration specifying the BatchExplain job output. This can be used to generate the baseline of feature attribution scores. destination can be only one of the following:
gcs object (GcsDestination)

Cloud Storage location for BatchExplain output.

bigquery object (BigQueryDestination)

BigQuery location for BatchExplain output.

JSON representation
{
  "predictionFormat": enum (PredictionFormat),

  // destination
  "gcs": {
    object (GcsDestination)
  },
  "bigquery": {
    object (BigQueryDestination)
  }
  // Union type
}

PredictionFormat

The storage format of the predictions generated by the BatchPrediction job.

Enums
PREDICTION_FORMAT_UNSPECIFIED Should not be set.
JSONL Predictions are in JSONL files.
BIGQUERY Predictions are in BigQuery.

ModelMonitoringAlertConfig

The alert config for model monitoring.

Fields
enableLogging boolean

Dump the anomalies to Cloud Logging. The anomalies are written as a JSON payload encoded from the ModelMonitoringStatsAnomalies proto. This can be further synced to Pub/Sub or any other service supported by Cloud Logging.

notificationChannels[] string

Resource names of the NotificationChannels to send alerts to. Must be of the format projects/<project_id_or_number>/notificationChannels/<channelId>

alert Union type
alert can be only one of the following:
emailAlertConfig object (EmailAlertConfig)

Email alert config.

JSON representation
{
  "enableLogging": boolean,
  "notificationChannels": [
    string
  ],

  // alert
  "emailAlertConfig": {
    object (EmailAlertConfig)
  }
  // Union type
}
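A client can catch malformed notificationChannels entries before submitting the job. The sketch below checks each entry against the documented resource-name format; the regular expression, the function name, and the sample values are assumptions for illustration.

```python
import re

# projects/<project_id_or_number>/notificationChannels/<channelId>
CHANNEL_NAME = re.compile(r"^projects/[^/]+/notificationChannels/[^/]+$")


def invalid_channels(alert_config):
    """Return the notificationChannels entries that do not match the format."""
    channels = alert_config.get("notificationChannels", [])
    return [name for name in channels if not CHANNEL_NAME.match(name)]


# Hypothetical alert config with one well-formed and one malformed entry.
alert_config = {
    "enableLogging": True,
    "notificationChannels": [
        "projects/my-project/notificationChannels/12345",
        "notificationChannels/12345",
    ],
}
print(invalid_channels(alert_config))  # ['notificationChannels/12345']
```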

EmailAlertConfig

The config for email alert.

Fields
userEmails[] string

The email addresses to send the alert to.

JSON representation
{
  "userEmails": [
    string
  ]
}

ModelMonitoringStatsAnomalies

Statistics and anomalies generated by Model Monitoring.

Fields

objective enum (ModelDeploymentMonitoringObjectiveType)

The Model Monitoring Objective that these stats and anomalies belong to.

deployedModelId string

Deployed Model id.

anomalyCount integer

Number of anomalies within all stats.

featureStats[] object (FeatureHistoricStatsAnomalies)

A list of historical Stats and Anomalies generated for all Features.

JSON representation
{
  "objective": enum (ModelDeploymentMonitoringObjectiveType),
  "deployedModelId": string,
  "anomalyCount": integer,
  "featureStats": [
    {
      object (FeatureHistoricStatsAnomalies)
    }
  ]
}

ModelDeploymentMonitoringObjectiveType

The Model Monitoring Objective types.

Enums
MODEL_DEPLOYMENT_MONITORING_OBJECTIVE_TYPE_UNSPECIFIED Default value, should not be set.
RAW_FEATURE_SKEW Raw feature values' stats to detect skew between Training-Prediction datasets.
RAW_FEATURE_DRIFT Raw feature values' stats to detect drift between Serving-Prediction datasets.
FEATURE_ATTRIBUTION_SKEW Feature attribution scores to detect skew between Training-Prediction datasets.
FEATURE_ATTRIBUTION_DRIFT Feature attribution scores to detect drift between Prediction datasets collected within different time windows.

FeatureHistoricStatsAnomalies

Historical Stats (and Anomalies) for a specific feature.

Fields
featureDisplayName string

Display name of the feature.

threshold object (ThresholdConfig)

Threshold for anomaly detection.

trainingStats object (FeatureStatsAnomaly)

Stats calculated for the Training Dataset.

predictionStats[] object (FeatureStatsAnomaly)

A list of historical stats generated from the Prediction Datasets of different time windows.

JSON representation
{
  "featureDisplayName": string,
  "threshold": {
    object (ThresholdConfig)
  },
  "trainingStats": {
    object (FeatureStatsAnomaly)
  },
  "predictionStats": [
    {
      object (FeatureStatsAnomaly)
    }
  ]
}

FeatureStatsAnomaly

Stats and anomalies generated at a specific timestamp for a specific feature. The startTime and endTime define the time range of the dataset that the current stats belong to, e.g. prediction traffic is bucketed into prediction datasets by time window. If the Dataset is not defined by a time window, startTime = endTime. The timestamp of the stats and anomalies always refers to endTime.

Raw stats and anomalies are stored at statsUri and anomalyUri in the TensorFlow-defined protos. The dataStats field contains almost the same information as the raw stats, in a Vertex AI-defined proto, for the UI to display.

Fields
score number

Feature importance score, only populated when cross-feature monitoring is enabled. For now only used to represent the feature attribution score within range [0, 1] for ModelDeploymentMonitoringObjectiveType.FEATURE_ATTRIBUTION_SKEW and ModelDeploymentMonitoringObjectiveType.FEATURE_ATTRIBUTION_DRIFT.

statsUri string

Path of the stats file for current feature values in the Cloud Storage bucket. Format: gs:////stats. Example: gs://monitoring_bucket/featureName/stats. Stats are stored in binary format as the Protobuf message tensorflow.metadata.v0.FeatureNameStatistics.

anomalyUri string

Path of the anomaly file for current feature values in the Cloud Storage bucket. Format: gs:////anomalies. Example: gs://monitoring_bucket/featureName/anomalies. Anomalies are stored in binary format as the Protobuf message tensorflow.metadata.v0.AnomalyInfo.

distributionDeviation number

Deviation of the current stats from the baseline stats. 1. For a categorical feature, the distribution distance is calculated by the L-infinity norm. 2. For a numerical feature, the distribution distance is calculated by the Jensen–Shannon divergence.

anomalyDetectionThreshold number

The threshold used when detecting anomalies. The threshold can be changed by the user, so this value might differ from ThresholdConfig.value.

startTime string (Timestamp format)

The start timestamp of the window in which stats were generated. For objectives where a time window doesn't make sense (e.g. Featurestore Snapshot Monitoring), startTime is only used to indicate the monitoring intervals, so it always equals (endTime - monitoringInterval).

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

endTime string (Timestamp format)

The end timestamp of the window in which stats were generated. For objectives where a time window doesn't make sense (e.g. Featurestore Snapshot Monitoring), endTime indicates the timestamp of the data used to generate stats (e.g. the timestamp at which snapshots of feature values are taken).

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

JSON representation
{
  "score": number,
  "statsUri": string,
  "anomalyUri": string,
  "distributionDeviation": number,
  "anomalyDetectionThreshold": number,
  "startTime": string,
  "endTime": string
}
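Reading startTime and endTime requires an RFC 3339 parser that accepts up to nine fractional digits and offsets other than "Z". The sketch below is one minimal way to do this with the Python standard library; the function name is hypothetical, leap seconds are not handled, and digits beyond microseconds are truncated because datetime cannot hold them.

```python
import re
from datetime import datetime, timedelta, timezone

_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(text):
    """Parse an RFC 3339 timestamp and normalize it to UTC."""
    match = _RFC3339.match(text)
    if not match:
        raise ValueError(f"not an RFC 3339 timestamp: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction = (match.group(7) or "").ljust(9, "0")
    microsecond = int(fraction[:6])  # nanosecond digits are dropped
    offset = match.group(8)
    if offset == "Z":
        tz = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        tz = timezone(sign * timedelta(hours=int(offset[1:3]),
                                       minutes=int(offset[4:6])))
    local = datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tz)
    return local.astimezone(timezone.utc)


# Hypothetical FeatureStatsAnomaly window: its length is endTime - startTime.
start = parse_timestamp("2014-10-02T14:01:23Z")
end = parse_timestamp("2014-10-02T15:01:23Z")
print(end - start)  # 1:00:00
```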

Methods

cancel

Cancels a BatchPredictionJob.

create

Creates a BatchPredictionJob.

delete

Deletes a BatchPredictionJob.

get

Gets a BatchPredictionJob.

list

Lists BatchPredictionJobs in a Location.