Supported input feature types
BigQuery ML supports different input feature types for different model types. Supported input feature types are listed in the following table:
| Model Category | Model Types | Numeric types (INT64, NUMERIC, BIGNUMERIC, FLOAT64) | Categorical types (BOOL, STRING, BYTES, DATE, DATETIME) | TIMESTAMP | STRUCT | GEOGRAPHY | ARRAY<Numeric types> | ARRAY<Categorical types> | ARRAY<STRUCT<INT64, Numeric types>> |
|---|---|---|---|---|---|---|---|---|---|
| Supervised Learning | Linear & Logistic Regression | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
| Supervised Learning | Deep Neural Networks | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
| Supervised Learning | Wide-and-Deep | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
| Supervised Learning | Boosted trees | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
| Supervised Learning | AutoML Tables | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
| Unsupervised Learning | K-means | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
| Unsupervised Learning | PCA | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
| Unsupervised Learning | Autoencoder | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
| Time Series Models | ARIMA_PLUS_XREG | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
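As a rough illustration of how numeric and categorical feature types might appear together in a training query, the sketch below trains a linear regression model. The dataset, model, table, and column names are all hypothetical and chosen only for this example.

```sql
-- Hypothetical names: mydataset, example_model, training_data, and every column.
CREATE OR REPLACE MODEL `mydataset.example_model`
OPTIONS (
  model_type = 'LINEAR_REG',
  input_label_cols = ['trip_cost']
) AS
SELECT
  distance_km,      -- FLOAT64: numeric feature
  passenger_count,  -- INT64: numeric feature
  is_weekend,       -- BOOL: categorical feature
  pickup_zone,      -- STRING: categorical feature
  trip_cost         -- FLOAT64: label column
FROM `mydataset.training_data`;
```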
Dense vector input
BigQuery ML supports ARRAY<numerical> as dense vector input
during model training. An embedding feature is a special type of dense vector.
See the ML.GENERATE_EMBEDDING function for more information.
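A minimal sketch of what dense vector input could look like as query output: each row carries one ARRAY<FLOAT64> column. The column names and values are made up for illustration, and keeping every vector the same length is an assumption of this example rather than something stated above.

```sql
-- Hypothetical rows: `embedding` is a dense vector (ARRAY<FLOAT64>).
-- The values and the `label` column are invented for this sketch.
SELECT [0.12, -0.48, 0.93] AS embedding, 'positive' AS label
UNION ALL
SELECT [0.51, 0.07, -0.33] AS embedding, 'negative' AS label;
```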
Sparse input
BigQuery ML supports ARRAY<STRUCT> as sparse input during
model training. Each struct contains an INT64 value that represents the
zero-based index of an element, and a numeric-type value that holds the
value at that index.
Below is an example of a sparse tensor input for the integer array
[0,1,0,0,0,0,1], which lists only the non-zero elements at indexes 1 and 6:
ARRAY<STRUCT<k INT64, v INT64>>[(1, 1), (6, 1)] AS f1
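To make the mapping between the two forms concrete, the following sketch derives the sparse (index, value) pairs from the dense array and then expands them back again. It is only one possible way to write this in GoogleSQL. The CTE and column names (`dense`, `sparse`, `d`, `dense_again`) are hypothetical, and treating unlisted indexes as zero is inferred from the example above.

```sql
-- Sketch: dense array -> sparse (k, v) pairs -> dense array again.
WITH dense AS (
  SELECT [0, 1, 0, 0, 0, 0, 1] AS d
),
sparse AS (
  SELECT
    d,
    ARRAY(
      SELECT AS STRUCT off AS k, val AS v  -- k: zero-based index, v: value
      FROM UNNEST(d) AS val WITH OFFSET AS off
      WHERE val != 0                       -- keep only the non-zero entries
      ORDER BY off
    ) AS f1
  FROM dense
)
SELECT
  f1,                                      -- sparse form: [(1, 1), (6, 1)]
  ARRAY(
    -- For each position i, look up a matching index in f1; default to 0.
    SELECT IFNULL((SELECT s.v FROM UNNEST(f1) AS s WHERE s.k = i), 0)
    FROM UNNEST(GENERATE_ARRAY(0, ARRAY_LENGTH(d) - 1)) AS i
    ORDER BY i
  ) AS dense_again                         -- dense form: [0, 1, 0, 0, 0, 0, 1]
FROM sparse;
```

The sparse form stores two structs instead of seven integers, which is the point of the representation when most entries are zero.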