The ML.FEATURE_IMPORTANCE function
This document describes the ML.FEATURE_IMPORTANCE function, which lets you
see feature importance scores. These scores indicate how useful or valuable
each feature was in the construction of a boosted tree or random forest model
during training. For more information, see the
feature_importances property
in the XGBoost library.
Syntax
ML.FEATURE_IMPORTANCE( MODEL `PROJECT_ID.DATASET.MODEL` )
Arguments
ML.FEATURE_IMPORTANCE takes the following arguments:
- PROJECT_ID: your project ID.
- DATASET: the BigQuery dataset that contains the model.
- MODEL: the name of the model.
Output
ML.FEATURE_IMPORTANCE returns the following columns:
- feature: a STRING value that contains the name of the feature column in the input training data.
- importance_weight: a FLOAT64 value that contains the number of times a feature is used to split the data across all trees.
- importance_gain: a FLOAT64 value that contains the average gain across all splits the feature is used in.
- importance_cover: a FLOAT64 value that contains the average coverage across all splits the feature is used in.
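As an illustration, the following query ranks a model's features by their gain-based importance. The model name mydataset.mymodel is a placeholder, not a real model:

```sql
SELECT
  feature,
  importance_gain
FROM
  ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
ORDER BY
  importance_gain DESC;
```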
If the TRANSFORM clause
was used in the CREATE MODEL statement that created the model,
ML.FEATURE_IMPORTANCE returns information for the pre-transform columns
from the query_statement clause of the CREATE MODEL statement.
Permissions
You must have the bigquery.models.create and bigquery.models.getData
Identity and Access Management (IAM) permissions
to run ML.FEATURE_IMPORTANCE.
Limitations
ML.FEATURE_IMPORTANCE is only supported with
boosted tree models
and
random forest models.
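For context, a model that ML.FEATURE_IMPORTANCE can inspect might be created with a CREATE MODEL statement like the following sketch; the table mydataset.mytable and the label column label are placeholders:

```sql
CREATE OR REPLACE MODEL `mydataset.mymodel`
  OPTIONS (
    model_type = 'BOOSTED_TREE_CLASSIFIER',
    input_label_cols = ['label']
  ) AS
SELECT
  *
FROM
  `mydataset.mytable`;
```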
Example
This example retrieves feature importance from mymodel in
mydataset. The dataset is in your default project.
SELECT * FROM ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
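If the model lives outside your default project, you can qualify the model path with the project ID, as in this sketch where PROJECT_ID is a placeholder:

```sql
SELECT * FROM ML.FEATURE_IMPORTANCE(MODEL `PROJECT_ID.mydataset.mymodel`);
```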
What's next
- For more information about Explainable AI, see BigQuery Explainable AI overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.