The ML.RECOMMEND function
This document describes the ML.RECOMMEND function, which lets you generate
a predicted rating for every user-item row
combination for a
matrix factorization model.
Because the input data for a
matrix factorization model tends to be a sparse matrix with missing values,
ML.RECOMMEND can return the predictions for those missing values without
requiring specification of each entry.
Syntax
ML.RECOMMEND(
MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
[, { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) }]
[, STRUCT(TRIAL_ID AS trial_id)])
Arguments
ML.RECOMMEND takes the following arguments:
PROJECT_ID: the project that contains the resource.DATASET: the dataset that contains the resource.MODEL: the name of the model.TABLE: The name of the input table that contains the user and item data.If you specify input data by using either the
TABLEorQUERY_STATEMENTargument, the user and item columns must match the user and item columns in the model, and their types must be compatible according to BigQuery implicit coercion rules.If the input table does not contain both the user and item column, the input table can only contain one column. If the table contains both the user and item columns, then the non-user or item columns are passed through and available for use in the query.
QUERY_STATEMENT: The GoogleSQL query that is used to generate the evaluation data. For the supported SQL syntax for theQUERY_STATEMENTclause in GoogleSQL, see Query syntax.If you specify input data by using either the
TABLEorQUERY_STATEMENTargument, the user and item columns must match the user and item columns in the model, and their types must be compatible according to BigQuery implicit coercion rules.TRIAL_ID: anINT64value that identifies the hyperparameter tuning trial that you want the function to evaluate. The function uses the optimal trial by default. Only specify this argument if you ran hyperparameter tuning when creating the model.
Output
ML.RECOMMEND outputs at least 3 columns for all cases; the user column, the
item column and a column for predicted recommendations.
The output of ML.RECOMMEND is computed as follows:
- If both the user and item columns are in the input data,
then
ML.RECOMMENDreturns a rating for each user-item pair. - If only the user or only the item is specified (for example, if the table
identified by the
tableargument only contains the user column), then all the item ratings for every user in the table are outputted. - If either the user or item feature was not in the training dataset, the rating
that is returned is the intercept of the feature that was provided, either
item or user, added to the
global__intercept__offset. For example,global__intercept__ + __intercept__['user_a']. - If input data is specified but does not provide
either the user or item column,
ML.RECOMMENDreturns an error. - If no input data is specified,
ML.RECOMMENDoutputs the ratings for every user and item combination seen during training.
If the model was trained with feedback_type=EXPLICIT, a user column called
user, and an item column called item, then ML.RECOMMEND returns the
following columns:
user: aSTRINGvalue containing the user data.item: aSTRINGvalue containing the item datapredicted_<rating_col_name>: aFLOAT64value that contains the rating for each user-item pair. Because the input ratings from training are assumed to be explicit feedback, the predicted ratings are approximately in the range of the original input, although ratings outside the range are also normal.
If the model was trained with feedback_type=IMPLICIT, a user column called
user, and an item column called item, then ML.RECOMMEND returns the
following columns:
user: aSTRINGvalue containing the user data.item: aSTRINGvalue containing the item datapredicted_<rating_col_name>_confidence: aFLOAT64value that contains the relative confidence for each user-item pair. The input ratings from training are assumed to be a proxy for user confidence. Therefore, if the model has converged, the predicted confidences lie between approximately 0 and 1 (but can lie just outside that range). If the model hasn't converged, the predicted confidences can be any value. If your model isn't converging and your ratings are very large, try decreasing theWALS_ALPHAvalue that's specified in theCREATE MODELstatement for the model. If your model isn't converging and your ratings are very small, try increasing theWALS_ALPHAvalue.
Examples
The following examples show how to use the ML.RECOMMEND function.
No input data
The following example generates predicted ratings for every
user-item pair in the inputs of mymodel because there is no input data
specified.
SELECT * FROM ML.RECOMMEND(MODEL `mydataset.mymodel`)
With input data
The following example generates predicted ratings for each user-item row in
mydataset.mytable assuming that mydataset.mymodel was trained using the user
column user and item column item.
SELECT * FROM ML.RECOMMEND(MODEL `mydataset.mymodel`, ( SELECT user, item FROM `mydataset.mytable`))
What's next
- For more information about model inference, see Model inference overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.