The ML.TRAINING_INFO function
This document describes the ML.TRAINING_INFO function, which lets you see
information about the training iterations of a model.
You can run ML.TRAINING_INFO while the CREATE MODEL
statement for the target model is running, or you can wait until after the
CREATE MODEL statement completes. If you run ML.TRAINING_INFO before the
first training iteration of the CREATE MODEL statement completes, the query
returns a Not found error.
Syntax
ML.TRAINING_INFO(MODEL `PROJECT_ID.DATASET.MODEL_NAME`)
Arguments
ML.TRAINING_INFO takes the following arguments:
- PROJECT_ID: your project ID.
- DATASET: the BigQuery dataset that contains the model.
- MODEL_NAME: the name of the model.
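As a sketch of how the three placeholders combine into one fully qualified model reference, the following query assumes a hypothetical project named myproject that contains the dataset mydataset and the model mymodel. All three names are illustrative only.

```sql
-- Hypothetical names: replace myproject, mydataset, and mymodel with your own.
SELECT
  *
FROM
  ML.TRAINING_INFO(MODEL `myproject.mydataset.mymodel`);
```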
Output
ML.TRAINING_INFO returns the following columns:
- training_run: an INT64 value that contains the training run identifier for the model. The value in this column is 0 for a newly created model. If you retrain the model using the warm_start argument of the CREATE MODEL statement, this value is incremented.
- iteration: an INT64 value that contains the iteration number of the training run. The value for the first iteration is 0. This value is incremented for each additional training run.
- loss: a FLOAT64 value that contains the loss metric calculated after an iteration on the training data:
  - For logistic regression models, this is log loss.
  - For linear regression models, this is mean squared error.
  - For multiclass logistic regressions, this is cross-entropy log loss.
  - For explicit matrix factorization models, this is mean squared error calculated over the seen input ratings.
  - For implicit matrix factorization models, the loss is calculated using the following formula:

    $$ \text{Loss} = \sum_{u, i} c_{ui}\left(p_{ui} - x_u^T y_i\right)^2 + \lambda\left(\sum_u \|x_u\|^2 + \sum_i \|y_i\|^2\right) $$

    For more information about what the variables mean, see Feedback types.
- eval_loss: a FLOAT64 value that contains the loss metric calculated on the holdout data. For k-means models, ML.TRAINING_INFO doesn't return an eval_loss column. If the DATA_SPLIT_METHOD argument is NO_SPLIT, then all entries in the eval_loss column are NULL.
- learning_rate: a FLOAT64 value that contains the learning rate in this iteration.
- duration_ms: an INT64 value that contains how long the iteration took, in milliseconds.
- cluster_info: an ARRAY<STRUCT> value that contains the fields centroid_id, cluster_radius, and cluster_size. ML.TRAINING_INFO computes cluster_radius and cluster_size with standardized features. Only returned for k-means models.
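The columns described above can be selected individually rather than with SELECT *. The following sketch shows two possible queries. The first lists the loss columns in training order for a model that is not a k-means or time series model. The second flattens cluster_info for a k-means model. The model names mydataset.mymodel and mydataset.my_kmeans_model are placeholders assumed for illustration.

```sql
-- Per-iteration loss for a model that returns loss and eval_loss
-- (for example, a linear or logistic regression model).
SELECT
  training_run,
  iteration,
  loss,
  eval_loss,
  learning_rate,
  duration_ms
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
ORDER BY
  training_run,
  iteration;

-- One row per centroid per iteration for a k-means model.
-- cluster_info is an ARRAY<STRUCT>, so UNNEST expands it into rows.
SELECT
  training_run,
  iteration,
  c.centroid_id,
  c.cluster_radius,
  c.cluster_size
FROM
  ML.TRAINING_INFO(MODEL `mydataset.my_kmeans_model`),
  UNNEST(cluster_info) AS c
ORDER BY
  training_run,
  iteration,
  c.centroid_id;
```

Ordering by training_run and then iteration keeps warm-start retraining runs grouped together in the output.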
Permissions
You must have the bigquery.models.create and bigquery.models.getData
Identity and Access Management (IAM) permissions
in order to run ML.TRAINING_INFO.
Limitations
ML.TRAINING_INFO is subject to the following limitations:
- ML.TRAINING_INFO doesn't support imported TensorFlow models.
- For time series models, ML.TRAINING_INFO only returns three columns: training_run, iteration, and duration_ms. It doesn't expose the training information per iteration, or per time series if multiple time series are forecasted at once. The duration_ms value is the total time cost for the entire process.
Example
The following example retrieves training information from the model
mydataset.mymodel in your default project:
SELECT * FROM ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
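As a possible follow-up, the query below sketches one way to find the iteration with the lowest holdout loss for the same placeholder model. It assumes the model type returns an eval_loss column and that the model was trained with a data split. With NO_SPLIT every eval_loss value is NULL, so the filter removes all rows and the query returns nothing.

```sql
-- Find the single iteration with the lowest loss on the holdout data.
SELECT
  training_run,
  iteration,
  loss,
  eval_loss
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
WHERE
  eval_loss IS NOT NULL
ORDER BY
  eval_loss ASC
LIMIT 1;
```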
What's next
- For more information about model evaluation, see BigQuery ML model evaluation overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.