REST Resource: projects.locations.tuningJobs

Resource: TuningJob

Represents a TuningJob that runs with Google owned models.

Fields
name string

Output only. Identifier. Resource name of a TuningJob. Format: projects/{project}/locations/{location}/tuningJobs/{tuningJob}

tunedModelDisplayName string

Optional. The display name of the TunedModel. The name can be up to 128 characters long and can consist of any UTF-8 characters. For continuous tuning, tunedModelDisplayName will by default use the same display name as the pre-tuned model. If a new display name is provided, the tuning job will create a new model instead of a new version.

description string

Optional. The description of the TuningJob.

state enum (JobState)

Output only. The detailed state of the job.

createTime string (Timestamp format)

Output only. time when the TuningJob was created.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

startTime string (Timestamp format)

Output only. time when the TuningJob for the first time entered the JOB_STATE_RUNNING state.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

endTime string (Timestamp format)

Output only. time when the TuningJob entered any of the following JobStates: JOB_STATE_SUCCEEDED, JOB_STATE_FAILED, JOB_STATE_CANCELLED, JOB_STATE_EXPIRED.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

updateTime string (Timestamp format)

Output only. time when the TuningJob was most recently updated.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

error object (Status)

Output only. Only populated when job's state is JOB_STATE_FAILED or JOB_STATE_CANCELLED.

labels map (key: string, value: string)

Optional. The labels with user-defined metadata to organize TuningJob and generated resources such as Model and Endpoint.

label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed.

See https://goo.gl/xmQnxf for more information and examples of labels.

experiment string

Output only. The Experiment associated with this TuningJob.

tunedModel object (TunedModel)

Output only. The tuned model resources associated with this TuningJob.

tuningDataStats object (TuningDataStats)

Output only. The tuning data statistics associated with this TuningJob.

encryptionSpec object (EncryptionSpec)

Customer-managed encryption key options for a TuningJob. If this is set, then all resources created by the TuningJob will be encrypted with the provided encryption key.

serviceAccount string

The service account that the tuningJob workload runs as. If not specified, the Vertex AI Secure Fine-Tuned service Agent in the project will be used. See https://cloud.google.com/iam/docs/service-agents#vertex-ai-secure-fine-tuning-service-agent

Users starting the pipeline must have the iam.serviceAccounts.actAs permission on this service account.

evaluateDatasetRuns[] object (EvaluateDatasetRun)

Output only. Evaluation runs for the Tuning Job.

source_model Union type
source_model can be only one of the following:
baseModel string

The base model that is being tuned. See Supported models.

preTunedModel object (PreTunedModel)

The pre-tuned model for continuous tuning.

tuning_spec Union type
tuning_spec can be only one of the following:
supervisedTuningSpec object (SupervisedTuningSpec)

Tuning Spec for Supervised Fine Tuning.

preferenceOptimizationSpec object (PreferenceOptimizationSpec)

Tuning Spec for Preference Optimization.

JSON representation
{
  "name": string,
  "tunedModelDisplayName": string,
  "description": string,
  "state": enum (JobState),
  "createTime": string,
  "startTime": string,
  "endTime": string,
  "updateTime": string,
  "error": {
    object (Status)
  },
  "labels": {
    string: string,
    ...
  },
  "experiment": string,
  "tunedModel": {
    object (TunedModel)
  },
  "tuningDataStats": {
    object (TuningDataStats)
  },
  "encryptionSpec": {
    object (EncryptionSpec)
  },
  "serviceAccount": string,
  "evaluateDatasetRuns": [
    {
      object (EvaluateDatasetRun)
    }
  ],

  // source_model
  "baseModel": string,
  "preTunedModel": {
    object (PreTunedModel)
  }
  // Union type

  // tuning_spec
  "supervisedTuningSpec": {
    object (SupervisedTuningSpec)
  },
  "preferenceOptimizationSpec": {
    object (PreferenceOptimizationSpec)
  }
  // Union type
}

PreTunedModel

A pre-tuned model for continuous tuning.

Fields
tunedModelName string

The resource name of the Model. E.g., a model resource name with a specified version id or alias:

projects/{project}/locations/{location}/models/{model}@{versionId}

projects/{project}/locations/{location}/models/{model}@{alias}

Or, omit the version id to use the default version:

projects/{project}/locations/{location}/models/{model}

checkpointId string

Optional. The source checkpoint id. If not specified, the default checkpoint will be used.

baseModel string

Output only. The name of the base model this PreTunedModel was tuned from.

JSON representation
{
  "tunedModelName": string,
  "checkpointId": string,
  "baseModel": string
}

SupervisedTuningSpec

Tuning Spec for Supervised Tuning for first party models.

Fields
trainingDatasetUri string

Required. Training dataset used for tuning. The dataset can be specified as either a Cloud Storage path to a JSONL file or as the resource name of a Vertex Multimodal Dataset.

validationDatasetUri string

Optional. Validation dataset used for tuning. The dataset can be specified as either a Cloud Storage path to a JSONL file or as the resource name of a Vertex Multimodal Dataset.

hyperParameters object (SupervisedHyperParameters)

Optional. Hyperparameters for SFT.

exportLastCheckpointOnly boolean

Optional. If set to true, disable intermediate checkpoints for SFT and only the last checkpoint will be exported. Otherwise, enable intermediate checkpoints for SFT. Default is false.

evaluationConfig object (EvaluationConfig)

Optional. Evaluation Config for Tuning Job.

JSON representation
{
  "trainingDatasetUri": string,
  "validationDatasetUri": string,
  "hyperParameters": {
    object (SupervisedHyperParameters)
  },
  "exportLastCheckpointOnly": boolean,
  "evaluationConfig": {
    object (EvaluationConfig)
  }
}

SupervisedHyperParameters

Hyperparameters for SFT.

Fields
epochCount string (int64 format)

Optional. Number of complete passes the model makes over the entire training dataset during training.

learningRateMultiplier number

Optional. Multiplier for adjusting the default learning rate. Mutually exclusive with learningRate. This feature is only available for 1P models.

adapterSize enum (AdapterSize)

Optional. Adapter size for tuning.

JSON representation
{
  "epochCount": string,
  "learningRateMultiplier": number,
  "adapterSize": enum (AdapterSize)
}

AdapterSize

Supported adapter sizes for tuning.

Enums
ADAPTER_SIZE_UNSPECIFIED Adapter size is unspecified.
ADAPTER_SIZE_ONE Adapter size 1.
ADAPTER_SIZE_TWO Adapter size 2.
ADAPTER_SIZE_FOUR Adapter size 4.
ADAPTER_SIZE_EIGHT Adapter size 8.
ADAPTER_SIZE_SIXTEEN Adapter size 16.
ADAPTER_SIZE_THIRTY_TWO Adapter size 32.

EvaluationConfig

Evaluation Config for Tuning Job.

Fields
metrics[] object (Metric)

Required. The metrics used for evaluation.

outputConfig object (OutputConfig)

Required. Config for evaluation output.

autoraterConfig object (AutoraterConfig)

Optional. Autorater config for evaluation.

JSON representation
{
  "metrics": [
    {
      object (Metric)
    }
  ],
  "outputConfig": {
    object (OutputConfig)
  },
  "autoraterConfig": {
    object (AutoraterConfig)
  }
}

OutputConfig

Config for evaluation output.

Fields
destination Union type
The destination for evaluation output. destination can be only one of the following:
gcsDestination object (GcsDestination)

Cloud storage destination for evaluation output.

JSON representation
{

  // destination
  "gcsDestination": {
    object (GcsDestination)
  }
  // Union type
}

PreferenceOptimizationSpec

Tuning Spec for Preference Optimization.

Fields
trainingDatasetUri string

Required. Cloud Storage path to file containing training dataset for preference optimization tuning. The dataset must be formatted as a JSONL file.

hyperParameters object (PreferenceOptimizationHyperParameters)

Optional. Hyperparameters for Preference Optimization.

exportLastCheckpointOnly boolean

Optional. If set to true, disable intermediate checkpoints for Preference Optimization and only the last checkpoint will be exported. Otherwise, enable intermediate checkpoints for Preference Optimization. Default is false.

validationDatasetUri string

Optional. Cloud Storage path to file containing validation dataset for preference optimization tuning. The dataset must be formatted as a JSONL file.

JSON representation
{
  "trainingDatasetUri": string,
  "hyperParameters": {
    object (PreferenceOptimizationHyperParameters)
  },
  "exportLastCheckpointOnly": boolean,
  "validationDatasetUri": string
}

PreferenceOptimizationHyperParameters

Hyperparameters for Preference Optimization.

Fields
adapterSize enum (AdapterSize)

Optional. Adapter size for preference optimization.

epochCount string (int64 format)

Optional. Number of complete passes the model makes over the entire training dataset during training.

learningRateMultiplier number

Optional. Multiplier for adjusting the default learning rate.

beta number

Optional. weight for KL Divergence regularization.

JSON representation
{
  "adapterSize": enum (AdapterSize),
  "epochCount": string,
  "learningRateMultiplier": number,
  "beta": number
}

TunedModel

The Model Registry Model and Online Prediction Endpoint associated with this TuningJob.

Fields
model string

Output only. The resource name of the TunedModel. Format:

projects/{project}/locations/{location}/models/{model}@{versionId}

When tuning from a base model, the version id will be 1.

For continuous tuning, if the provided tunedModelDisplayName is set and different from parent model's display name, the tuned model will have a new parent model with version 1. Otherwise the version id will be incremented by 1 from the last version id in the parent model. E.g.,

projects/{project}/locations/{location}/models/{model}@{last_version_id + 1}

endpoint string

Output only. A resource name of an Endpoint. Format: projects/{project}/locations/{location}/endpoints/{endpoint}.

checkpoints[] object (TunedModelCheckpoint)

Output only. The checkpoints associated with this TunedModel. This field is only populated for tuning jobs that enable intermediate checkpoints.

JSON representation
{
  "model": string,
  "endpoint": string,
  "checkpoints": [
    {
      object (TunedModelCheckpoint)
    }
  ]
}

TunedModelCheckpoint

TunedModelCheckpoint for the Tuned Model of a Tuning Job.

Fields
checkpointId string

The id of the checkpoint.

epoch string (int64 format)

The epoch of the checkpoint.

step string (int64 format)

The step of the checkpoint.

endpoint string

The Endpoint resource name that the checkpoint is deployed to. Format: projects/{project}/locations/{location}/endpoints/{endpoint}.

JSON representation
{
  "checkpointId": string,
  "epoch": string,
  "step": string,
  "endpoint": string
}

TuningDataStats

The tuning data statistic values for TuningJob.

Fields
tuning_data_stats Union type
tuning_data_stats can be only one of the following:
supervisedTuningDataStats object (SupervisedTuningDataStats)

The SFT Tuning data stats.

preferenceOptimizationDataStats object (PreferenceOptimizationDataStats)

Output only. Statistics for preference optimization.

JSON representation
{

  // tuning_data_stats
  "supervisedTuningDataStats": {
    object (SupervisedTuningDataStats)
  },
  "preferenceOptimizationDataStats": {
    object (PreferenceOptimizationDataStats)
  }
  // Union type
}

SupervisedTuningDataStats

Tuning data statistics for Supervised Tuning.

Fields
tuningDatasetExampleCount string (int64 format)

Output only. Number of examples in the tuning dataset.

totalTuningCharacterCount string (int64 format)

Output only. Number of tuning characters in the tuning dataset.

totalBillableCharacterCount
(deprecated)
string (int64 format)

Output only. Number of billable characters in the tuning dataset.

totalBillableTokenCount string (int64 format)

Output only. Number of billable tokens in the tuning dataset.

tuningStepCount string (int64 format)

Output only. Number of tuning steps for this Tuning Job.

userInputTokenDistribution object (SupervisedTuningDatasetDistribution)

Output only. Dataset distributions for the user input tokens.

userOutputTokenDistribution object (SupervisedTuningDatasetDistribution)

Output only. Dataset distributions for the user output tokens.

userMessagePerExampleDistribution object (SupervisedTuningDatasetDistribution)

Output only. Dataset distributions for the messages per example.

userDatasetExamples[] object (Content)

Output only. Sample user messages in the training dataset uri.

totalTruncatedExampleCount string (int64 format)

Output only. The number of examples in the dataset that have been dropped. An example can be dropped for reasons including: too many tokens, contains an invalid image, contains too many images, etc.

truncatedExampleIndices[] string (int64 format)

Output only. A partial sample of the indices (starting from 1) of the dropped examples.

droppedExampleReasons[] string

Output only. For each index in truncatedExampleIndices, the user-facing reason why the example was dropped.

JSON representation
{
  "tuningDatasetExampleCount": string,
  "totalTuningCharacterCount": string,
  "totalBillableCharacterCount": string,
  "totalBillableTokenCount": string,
  "tuningStepCount": string,
  "userInputTokenDistribution": {
    object (SupervisedTuningDatasetDistribution)
  },
  "userOutputTokenDistribution": {
    object (SupervisedTuningDatasetDistribution)
  },
  "userMessagePerExampleDistribution": {
    object (SupervisedTuningDatasetDistribution)
  },
  "userDatasetExamples": [
    {
      object (Content)
    }
  ],
  "totalTruncatedExampleCount": string,
  "truncatedExampleIndices": [
    string
  ],
  "droppedExampleReasons": [
    string
  ]
}

SupervisedTuningDatasetDistribution

Dataset distribution for Supervised Tuning.

Fields
sum string (int64 format)

Output only. Sum of a given population of values.

billableSum string (int64 format)

Output only. Sum of a given population of values that are billable.

min number

Output only. The minimum of the population values.

max number

Output only. The maximum of the population values.

mean number

Output only. The arithmetic mean of the values in the population.

median number

Output only. The median of the values in the population.

p5 number

Output only. The 5th percentile of the values in the population.

p95 number

Output only. The 95th percentile of the values in the population.

buckets[] object (DatasetBucket)

Output only. Defines the histogram bucket.

JSON representation
{
  "sum": string,
  "billableSum": string,
  "min": number,
  "max": number,
  "mean": number,
  "median": number,
  "p5": number,
  "p95": number,
  "buckets": [
    {
      object (DatasetBucket)
    }
  ]
}

DatasetBucket

Dataset bucket used to create a histogram for the distribution given a population of values.

Fields
count number

Output only. Number of values in the bucket.

left number

Output only. left bound of the bucket.

right number

Output only. Right bound of the bucket.

JSON representation
{
  "count": number,
  "left": number,
  "right": number
}

PreferenceOptimizationDataStats

Statistics computed for datasets used for preference optimization.

Fields
tuningDatasetExampleCount string (int64 format)

Output only. Number of examples in the tuning dataset.

totalBillableTokenCount string (int64 format)

Output only. Number of billable tokens in the tuning dataset.

tuningStepCount string (int64 format)

Output only. Number of tuning steps for this Tuning Job.

userInputTokenDistribution object (DatasetDistribution)

Output only. Dataset distributions for the user input tokens.

userOutputTokenDistribution object (DatasetDistribution)

Output only. Dataset distributions for the user output tokens.

scoreVariancePerExampleDistribution object (DatasetDistribution)

Output only. Dataset distributions for scores variance per example.

scoresDistribution object (DatasetDistribution)

Output only. Dataset distributions for scores.

userDatasetExamples[] object (GeminiPreferenceExample)

Output only. Sample user examples in the training dataset.

droppedExampleIndices[] string (int64 format)

Output only. A partial sample of the indices (starting from 1) of the dropped examples.

droppedExampleReasons[] string

Output only. For each index in droppedExampleIndices, the user-facing reason why the example was dropped.

JSON representation
{
  "tuningDatasetExampleCount": string,
  "totalBillableTokenCount": string,
  "tuningStepCount": string,
  "userInputTokenDistribution": {
    object (DatasetDistribution)
  },
  "userOutputTokenDistribution": {
    object (DatasetDistribution)
  },
  "scoreVariancePerExampleDistribution": {
    object (DatasetDistribution)
  },
  "scoresDistribution": {
    object (DatasetDistribution)
  },
  "userDatasetExamples": [
    {
      object (GeminiPreferenceExample)
    }
  ],
  "droppedExampleIndices": [
    string
  ],
  "droppedExampleReasons": [
    string
  ]
}

DatasetDistribution

Distribution computed over a tuning dataset.

Fields
sum number

Output only. Sum of a given population of values.

min number

Output only. The minimum of the population values.

max number

Output only. The maximum of the population values.

mean number

Output only. The arithmetic mean of the values in the population.

median number

Output only. The median of the values in the population.

p5 number

Output only. The 5th percentile of the values in the population.

p95 number

Output only. The 95th percentile of the values in the population.

buckets[] object (DistributionBucket)

Output only. Defines the histogram bucket.

JSON representation
{
  "sum": number,
  "min": number,
  "max": number,
  "mean": number,
  "median": number,
  "p5": number,
  "p95": number,
  "buckets": [
    {
      object (DistributionBucket)
    }
  ]
}

DistributionBucket

Dataset bucket used to create a histogram for the distribution given a population of values.

Fields
count string (int64 format)

Output only. Number of values in the bucket.

left number

Output only. left bound of the bucket.

right number

Output only. Right bound of the bucket.

JSON representation
{
  "count": string,
  "left": number,
  "right": number
}

GeminiPreferenceExample

Input example for preference optimization.

Fields
contents[] object (Content)

Multi-turn contents that represents the Prompt.

completions[] object (Completion)

List of completions for a given prompt.

JSON representation
{
  "contents": [
    {
      object (Content)
    }
  ],
  "completions": [
    {
      object (Completion)
    }
  ]
}

Completion

Completion and its preference score.

Fields
completion object (Content)

Single turn completion for the given prompt.

score number

The score for the given completion.

JSON representation
{
  "completion": {
    object (Content)
  },
  "score": number
}

EvaluateDatasetRun

Evaluate Dataset Run result for Tuning Job.

Fields
operationName string

Output only. The operation id of the evaluation run. Format: projects/{project}/locations/{location}/operations/{operationId}.

checkpointId string

Output only. The checkpoint id used in the evaluation run. Only populated when evaluating checkpoints.

evaluateDatasetResponse object (EvaluateDatasetResponse)

Output only. Results for EvaluationService.

error object (Status)

Output only. The error of the evaluation run if any.

JSON representation
{
  "operationName": string,
  "checkpointId": string,
  "evaluateDatasetResponse": {
    object (EvaluateDatasetResponse)
  },
  "error": {
    object (Status)
  }
}

EvaluateDatasetResponse

The results from an evaluation run performed by the EvaluationService.

Fields
aggregationOutput object (AggregationOutput)

Output only. Aggregation statistics derived from results of EvaluationService.

outputInfo object (OutputInfo)

Output only. Output info for EvaluationService.

JSON representation
{
  "aggregationOutput": {
    object (AggregationOutput)
  },
  "outputInfo": {
    object (OutputInfo)
  }
}

AggregationOutput

The aggregation result for the entire dataset and all metrics.

Fields
dataset object (EvaluationDataset)

The dataset used for evaluation & aggregation.

aggregationResults[] object (AggregationResult)

One AggregationResult per metric.

JSON representation
{
  "dataset": {
    object (EvaluationDataset)
  },
  "aggregationResults": [
    {
      object (AggregationResult)
    }
  ]
}

EvaluationDataset

The dataset used for evaluation.

Fields
source Union type
The source of the dataset. source can be only one of the following:
gcsSource object (GcsSource)

Cloud storage source holds the dataset. Currently only one Cloud Storage file path is supported.

bigquerySource object (BigQuerySource)

BigQuery source holds the dataset.

JSON representation
{

  // source
  "gcsSource": {
    object (GcsSource)
  },
  "bigquerySource": {
    object (BigQuerySource)
  }
  // Union type
}

AggregationResult

The aggregation result for a single metric.

Fields
aggregationMetric enum (AggregationMetric)

Aggregation metric.

aggregation_result Union type
The aggregation result. aggregation_result can be only one of the following:
pointwiseMetricResult object (PointwiseMetricResult)

result for pointwise metric.

pairwiseMetricResult object (PairwiseMetricResult)

result for pairwise metric.

exactMatchMetricValue object (ExactMatchMetricValue)

Results for exact match metric.

bleuMetricValue object (BleuMetricValue)

Results for bleu metric.

rougeMetricValue object (RougeMetricValue)

Results for rouge metric.

customCodeExecutionResult object (CustomCodeExecutionResult)

result for code execution metric.

JSON representation
{
  "aggregationMetric": enum (AggregationMetric),

  // aggregation_result
  "pointwiseMetricResult": {
    object (PointwiseMetricResult)
  },
  "pairwiseMetricResult": {
    object (PairwiseMetricResult)
  },
  "exactMatchMetricValue": {
    object (ExactMatchMetricValue)
  },
  "bleuMetricValue": {
    object (BleuMetricValue)
  },
  "rougeMetricValue": {
    object (RougeMetricValue)
  },
  "customCodeExecutionResult": {
    object (CustomCodeExecutionResult)
  }
  // Union type
}

PointwiseMetricResult

Spec for pointwise metric result.

Fields
explanation string

Output only. Explanation for pointwise metric score.

customOutput object (CustomOutput)

Output only. Spec for custom output.

score number

Output only. Pointwise metric score.

JSON representation
{
  "explanation": string,
  "customOutput": {
    object (CustomOutput)
  },
  "score": number
}

CustomOutput

Spec for custom output.

Fields
custom_output Union type
Custom output. custom_output can be only one of the following:
rawOutputs object (RawOutput)

Output only. List of raw output strings.

JSON representation
{

  // custom_output
  "rawOutputs": {
    object (RawOutput)
  }
  // Union type
}

RawOutput

Raw output.

Fields
rawOutput[] string

Output only. Raw output string.

JSON representation
{
  "rawOutput": [
    string
  ]
}

PairwiseMetricResult

Spec for pairwise metric result.

Fields
pairwiseChoice enum (PairwiseChoice)

Output only. Pairwise metric choice.

explanation string

Output only. Explanation for pairwise metric score.

customOutput object (CustomOutput)

Output only. Spec for custom output.

JSON representation
{
  "pairwiseChoice": enum (PairwiseChoice),
  "explanation": string,
  "customOutput": {
    object (CustomOutput)
  }
}

PairwiseChoice

Pairwise prediction autorater preference.

Enums
PAIRWISE_CHOICE_UNSPECIFIED Unspecified prediction choice.
BASELINE baseline prediction wins
CANDIDATE Candidate prediction wins
TIE Winner cannot be determined

ExactMatchMetricValue

Exact match metric value for an instance.

Fields
score number

Output only. Exact match score.

JSON representation
{
  "score": number
}

BleuMetricValue

Bleu metric value for an instance.

Fields
score number

Output only. Bleu score.

JSON representation
{
  "score": number
}

RougeMetricValue

Rouge metric value for an instance.

Fields
score number

Output only. Rouge score.

JSON representation
{
  "score": number
}

CustomCodeExecutionResult

result for custom code execution metric.

Fields
score number

Output only. Custom code execution score.

JSON representation
{
  "score": number
}

OutputInfo

Describes the info for output of EvaluationService.

Fields
output_location Union type
The output location into which evaluation output is written. output_location can be only one of the following:
gcsOutputDirectory string

Output only. The full path of the Cloud Storage directory created, into which the evaluation results and aggregation results are written.

JSON representation
{

  // output_location
  "gcsOutputDirectory": string
  // Union type
}

Methods

cancel

Cancels a tuning job.

create

Creates a tuning job.

get

Gets a tuning job.

list

Lists tuning jobs in a location.

rebaseTunedModel

Rebase a tuned model.