The ML.GENERATE_EMBEDDING function
This document describes the ML.GENERATE_EMBEDDING function, which
lets you create embeddings that describe an entity—for example,
a piece of text or an image.
You can create embeddings for the following types of data:
- Text data from standard tables.
- Visual data that is returned as ObjectRefRuntime values by the OBJ.GET_ACCESS_URL function. You can use ObjectRef values from standard tables as input to the OBJ.GET_ACCESS_URL function. (Preview)
- Visual data in object tables.
- Output data from PCA, autoencoder, or matrix factorization models.
Embeddings
Embeddings are high-dimensional numerical vectors that represent a given entity. Machine learning (ML) models use embeddings to encode semantics about entities to make it easier to reason about and compare them. If two entities are semantically similar, then their respective embeddings are located near each other in the embedding vector space.
Embeddings help you perform the following tasks:
- Semantic search: search entities ranked by semantic similarity.
- Recommendation: return entities with attributes similar to a given entity.
- Classification: return the class of entities whose attributes are similar to the given entity.
- Clustering: cluster entities whose attributes are similar to a given entity.
- Outlier detection: return entities whose attributes are least related to the given entity.
- Matrix factorization: return entities that represent the underlying weights that a model uses during prediction.
- Principal component analysis (PCA): return entities (principal components) that represent the input data in such a way that it is easier to identify patterns, clusters, and outliers.
- Autoencoding: return the latent space representations of the input data.
Function processing
Depending on the task, the ML.GENERATE_EMBEDDING function works in one of the
following ways:
- To generate embeddings from text or visual content, ML.GENERATE_EMBEDDING sends the request to a BigQuery ML remote model that represents a Vertex AI embedding model or a supported open model (Preview), and then returns the model's response. The ML.GENERATE_EMBEDDING function works with the Vertex AI model to perform the embedding tasks that the model supports; for more information about those tasks, see the Vertex AI documentation for the model. Typically, you want to use text embedding models for text-only use cases, and multimodal models for cross-modal search use cases, where embeddings for text and visual content are generated in the same semantic space.
- For PCA and autoencoding, ML.GENERATE_EMBEDDING processes the request by using a BigQuery ML PCA or autoencoder model and the ML.PREDICT function. ML.GENERATE_EMBEDDING gathers the ML.PREDICT output for the model into an array and outputs it as the ml_generate_embedding_result column. Having all of the embeddings in a single column lets you use the VECTOR_SEARCH function directly on the ML.GENERATE_EMBEDDING output.
- For matrix factorization, ML.GENERATE_EMBEDDING processes the request by using a BigQuery ML matrix factorization model and the ML.WEIGHTS function. ML.GENERATE_EMBEDDING gathers the factor_weights.weight and intercept values from the ML.WEIGHTS output for the model into an array and outputs it as the ml_generate_embedding_result column. Having all of the embeddings in a single column lets you use the VECTOR_SEARCH function directly on the ML.GENERATE_EMBEDDING output, as shown in the sketch after this list.
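As a minimal sketch of that pattern, the following query passes ML.GENERATE_EMBEDDING output straight into VECTOR_SEARCH. The mydataset.text_embedding remote model and the mydataset.doc_embeddings table of previously generated embeddings are hypothetical names used only for illustration:

# Hypothetical names: mydataset.text_embedding is a remote embedding model,
# mydataset.doc_embeddings holds stored ML.GENERATE_EMBEDDING output.
SELECT query.content AS query_text, base.content AS matched_text, distance
FROM VECTOR_SEARCH(
  TABLE `mydataset.doc_embeddings`, 'ml_generate_embedding_result',
  (
    # Embed the search text and use it directly as the query side of the search.
    SELECT ml_generate_embedding_result, content
    FROM ML.GENERATE_EMBEDDING(
      MODEL `mydataset.text_embedding`,
      (SELECT 'find similar documents' AS content),
      STRUCT(TRUE AS flatten_json_output))
  ),
  top_k => 5);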
Syntax
ML.GENERATE_EMBEDDING syntax differs depending on the
BigQuery ML model you choose. If you use a remote model, the syntax also
differs depending on the Vertex AI model that your remote model
targets. Choose the option appropriate for your use case.
 multimodalembedding 
# Syntax for standard tables
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT(
    [FLATTEN_JSON_OUTPUT AS flatten_json_output]
    [, OUTPUT_DIMENSIONALITY AS output_dimensionality])
)

# Syntax for object tables
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT(
    [FLATTEN_JSON_OUTPUT AS flatten_json_output]
    [, START_SECOND AS start_second]
    [, END_SECOND AS end_second]
    [, INTERVAL_SECONDS AS interval_seconds]
    [, OUTPUT_DIMENSIONALITY AS output_dimensionality])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a remote model over a Vertex AI multimodalembedding@001 model. You can confirm which LLM the remote model uses by opening the Google Cloud console and looking at the Remote endpoint field on the model details page.
- TABLE_NAME: one of the following:
  - If you are creating embeddings for text in a standard table, the name of the BigQuery table that contains the content. The content must be in a STRING column named content. If your table does not have a content column, use the QUERY_STATEMENT argument instead and provide a SELECT statement that includes an alias for an existing table column. An error occurs if no content column is available.
  - If you are creating embeddings for visual content from an object table, the name of a BigQuery object table that contains the visual content.
- QUERY_STATEMENT: the GoogleSQL query that generates the input data for the function.
  - If you are creating embeddings from a standard table, the query must produce a column named content, which you can generate as follows:
    - For text embeddings, you can pull the value from a STRING column, or you can specify a string literal in the query.
    - For visual content embeddings, you can provide an ObjectRefRuntime value for the content column. You can generate ObjectRefRuntime values by using the OBJ.GET_ACCESS_URL function. The OBJ.GET_ACCESS_URL function takes an ObjectRef value as input, which you can provide either by specifying the name of a column that contains ObjectRef values, or by constructing an ObjectRef value. ObjectRefRuntime values must have the access_url.read_url and details.gcs_metadata.content_type elements of the JSON value populated.
  - If you are creating embeddings from an object table, the query doesn't have to return a content column. You can only specify WHERE, ORDER BY, and LIMIT clauses in the query.
 
- FLATTEN_JSON_OUTPUT: a BOOL value that determines whether the JSON content returned by the function is parsed into separate columns. The default is TRUE.
- START_SECOND: a FLOAT64 value that specifies the second in the video at which to start the embedding. The default value is 0. If you specify this argument, you must also specify the END_SECOND argument. This value must be positive and less than the END_SECOND value. This argument only applies to video content.
- END_SECOND: a FLOAT64 value that specifies the second in the video at which to end the embedding. The END_SECOND value can't be higher than 120. The default value is 120. If you specify this argument, you must also specify the START_SECOND argument. This value must be positive and greater than the START_SECOND value. This argument only applies to video content.
- INTERVAL_SECONDS: a FLOAT64 value that specifies the interval to use when creating embeddings. For example, if you set START_SECOND = 0, END_SECOND = 120, and INTERVAL_SECONDS = 10, then the video is split into twelve 10-second segments ([0, 10), [10, 20), [20, 30)...) and embeddings are generated for each segment. This value must be greater than or equal to 4 and less than 120. The default value is 16. This argument only applies to video content. For a usage sketch of the video arguments, see the example after this list.
- OUTPUT_DIMENSIONALITY: an INT64 value that specifies the number of dimensions to use when generating embeddings. Valid values are 128, 256, 512, and 1408. The default value is 1408. For example, if you specify 256 AS output_dimensionality, then the ml_generate_embedding_result output column contains a 256-dimensional embedding for each input value. You can only use this argument when creating text or image embeddings. If you use this argument when creating video embeddings, the function returns an error.
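The following is a minimal sketch of the video arguments, assuming that mydataset.multimodalembedding is a remote model over multimodalembedding@001 and that mydataset.video_objects is a hypothetical object table of videos. It embeds the first minute of each video in 10-second segments:

SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.multimodalembedding`,
  TABLE `mydataset.video_objects`,  # hypothetical object table of videos
  STRUCT(
    TRUE AS flatten_json_output,
    0.0 AS start_second,         # start at the beginning of each video
    60.0 AS end_second,          # stop after the first minute
    10.0 AS interval_seconds));  # one embedding per 10-second segment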
Details
The model and input table must be in the same region.
 gemini-embedding-001 
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT(
    [FLATTEN_JSON_OUTPUT AS flatten_json_output]
    [, TASK_TYPE AS task_type]
    [, OUTPUT_DIMENSIONALITY AS output_dimensionality])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a remote model over a Vertex AI gemini-embedding-001 model. You can confirm which LLM the remote model uses by opening the Google Cloud console and looking at the Remote endpoint field on the model details page.
- TABLE_NAME: the name of the BigQuery table that contains a STRING column to embed. The text in the column that's named content is sent to the model. If your table doesn't have a content column, use a SELECT statement for this argument to provide an alias for an existing table column. An error occurs if no content column exists.
- QUERY_STATEMENT: a query whose result contains a STRING column that's named content. For information about the supported SQL syntax of the QUERY_STATEMENT clause, see GoogleSQL query syntax.
- FLATTEN_JSON_OUTPUT: a BOOL value that determines whether the JSON content returned by the function is parsed into separate columns. The default is TRUE.
- TASK_TYPE: a STRING literal that specifies the intended downstream application, to help the model produce better quality embeddings. The TASK_TYPE argument accepts the following values:
  - RETRIEVAL_QUERY: specifies that the given text is a query in a search or retrieval setting.
  - RETRIEVAL_DOCUMENT: specifies that the given text is a document in a search or retrieval setting. When using this task type, it is helpful to include the document title in the query statement in order to improve embedding quality. The document title must be in a column either named title or aliased as title, for example:

    SELECT *
    FROM ML.GENERATE_EMBEDDING(
      MODEL `mydataset.embedding_model`,
      (
        SELECT abstract AS content, header AS title, publication_number
        FROM `mydataset.publications`
      ),
      STRUCT(TRUE AS flatten_json_output, 'RETRIEVAL_DOCUMENT' AS task_type)
    );

    Specifying the title column in the input query populates the title field of the request body sent to the model. If you specify a title value when using any other task type, that input is ignored and has no effect on the embedding results.
  - SEMANTIC_SIMILARITY: specifies that the given text will be used for Semantic Textual Similarity (STS).
  - CLASSIFICATION: specifies that the embeddings will be used for classification.
  - CLUSTERING: specifies that the embeddings will be used for clustering.
  - QUESTION_ANSWERING: specifies that the embeddings will be used for question answering.
  - FACT_VERIFICATION: specifies that the embeddings will be used for fact verification.
  - CODE_RETRIEVAL_QUERY: specifies that the embeddings will be used for code retrieval.
- OUTPUT_DIMENSIONALITY: an INT64 value in the range [1, 3072] that specifies the number of dimensions to use when generating embeddings. For example, if you specify 256 AS output_dimensionality, then the ml_generate_embedding_result output column contains a 256-dimensional embedding for each input value. The default value is 3072.
Details
The model and input table must be in the same region.
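As a minimal sketch, assuming that mydataset.gemini_embedding is a remote model that targets gemini-embedding-001 and that mydataset.reviews is a hypothetical table, the following query combines task_type and output_dimensionality:

SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.gemini_embedding`,  # hypothetical remote model over gemini-embedding-001
  (SELECT review_text AS content FROM `mydataset.reviews`),  # hypothetical table and column
  STRUCT(
    TRUE AS flatten_json_output,
    'CLASSIFICATION' AS task_type,   # embeddings will be used for classification
    768 AS output_dimensionality));  # request 768-dimensional embeddings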
 text-embedding 
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT(
    [FLATTEN_JSON_OUTPUT AS flatten_json_output]
    [, TASK_TYPE AS task_type]
    [, OUTPUT_DIMENSIONALITY AS output_dimensionality])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a remote model over a Vertex AI text embedding model. You can confirm which LLM the remote model uses by opening the Google Cloud console and looking at the Remote endpoint field on the model details page.
- TABLE_NAME: the name of the BigQuery table that contains a STRING column to embed. The text in the column that's named content is sent to the model. If your table doesn't have a content column, use a SELECT statement for this argument to provide an alias for an existing table column. An error occurs if no content column exists.
- QUERY_STATEMENT: a query whose result contains a STRING column that's named content. For information about the supported SQL syntax of the QUERY_STATEMENT clause, see GoogleSQL query syntax.
- FLATTEN_JSON_OUTPUT: a BOOL value that determines whether the JSON content returned by the function is parsed into separate columns. The default is TRUE.
- TASK_TYPE: a STRING literal that specifies the intended downstream application, to help the model produce better quality embeddings. The TASK_TYPE argument accepts the following values:
  - RETRIEVAL_QUERY: specifies that the given text is a query in a search or retrieval setting.
  - RETRIEVAL_DOCUMENT: specifies that the given text is a document in a search or retrieval setting. When using this task type, it is helpful to include the document title in the query statement in order to improve embedding quality. The document title must be in a column either named title or aliased as title, for example:

    SELECT *
    FROM ML.GENERATE_EMBEDDING(
      MODEL `mydataset.embedding_model`,
      (
        SELECT abstract AS content, header AS title, publication_number
        FROM `mydataset.publications`
      ),
      STRUCT(TRUE AS flatten_json_output, 'RETRIEVAL_DOCUMENT' AS task_type)
    );

    Specifying the title column in the input query populates the title field of the request body sent to the model. If you specify a title value when using any other task type, that input is ignored and has no effect on the embedding results.
  - SEMANTIC_SIMILARITY: specifies that the given text will be used for Semantic Textual Similarity (STS).
  - CLASSIFICATION: specifies that the embeddings will be used for classification.
  - CLUSTERING: specifies that the embeddings will be used for clustering.
  - QUESTION_ANSWERING: specifies that the embeddings will be used for question answering.
  - FACT_VERIFICATION: specifies that the embeddings will be used for fact verification.
  - CODE_RETRIEVAL_QUERY: specifies that the embeddings will be used for code retrieval.
- OUTPUT_DIMENSIONALITY: an INT64 value in the range [1, 768] that specifies the number of dimensions to use when generating embeddings. For example, if you specify 256 AS output_dimensionality, then the ml_generate_embedding_result output column contains a 256-dimensional embedding for each input value. The default value is 768. You can only use this argument if the remote model that you specify in the model argument uses one of the following models as an endpoint:
  - text-embedding-004 or later
  - text-multilingual-embedding-002 or later
 
Details
The model and input table must be in the same region.
 text-multilingual-embedding 
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT(
    [FLATTEN_JSON_OUTPUT AS flatten_json_output]
    [, TASK_TYPE AS task_type]
    [, OUTPUT_DIMENSIONALITY AS output_dimensionality])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a remote model over a Vertex AI text embedding model. You can confirm which LLM the remote model uses by opening the Google Cloud console and looking at the Remote endpoint field on the model details page.
- TABLE_NAME: the name of the BigQuery table that contains a STRING column to embed. The text in the column that's named content is sent to the model. If your table doesn't have a content column, use a SELECT statement for this argument to provide an alias for an existing table column. An error occurs if no content column exists.
- QUERY_STATEMENT: a query whose result contains a STRING column that's named content. For information about the supported SQL syntax of the QUERY_STATEMENT clause, see GoogleSQL query syntax.
- FLATTEN_JSON_OUTPUT: a BOOL value that determines whether the JSON content returned by the function is parsed into separate columns. The default is TRUE.
- TASK_TYPE: a STRING literal that specifies the intended downstream application, to help the model produce better quality embeddings. The TASK_TYPE argument accepts the following values:
  - RETRIEVAL_QUERY: specifies that the given text is a query in a search or retrieval setting.
  - RETRIEVAL_DOCUMENT: specifies that the given text is a document in a search or retrieval setting. When using this task type, it is helpful to include the document title in the query statement in order to improve embedding quality. The document title must be in a column either named title or aliased as title, for example:

    SELECT *
    FROM ML.GENERATE_EMBEDDING(
      MODEL `mydataset.embedding_model`,
      (
        SELECT abstract AS content, header AS title, publication_number
        FROM `mydataset.publications`
      ),
      STRUCT(TRUE AS flatten_json_output, 'RETRIEVAL_DOCUMENT' AS task_type)
    );

    Specifying the title column in the input query populates the title field of the request body sent to the model. If you specify a title value when using any other task type, that input is ignored and has no effect on the embedding results.
  - SEMANTIC_SIMILARITY: specifies that the given text will be used for Semantic Textual Similarity (STS).
  - CLASSIFICATION: specifies that the embeddings will be used for classification.
  - CLUSTERING: specifies that the embeddings will be used for clustering.
  - QUESTION_ANSWERING: specifies that the embeddings will be used for question answering.
  - FACT_VERIFICATION: specifies that the embeddings will be used for fact verification.
  - CODE_RETRIEVAL_QUERY: specifies that the embeddings will be used for code retrieval.
- OUTPUT_DIMENSIONALITY: an INT64 value in the range [1, 768] that specifies the number of dimensions to use when generating embeddings. For example, if you specify 256 AS output_dimensionality, then the ml_generate_embedding_result output column contains a 256-dimensional embedding for each input value. The default value is 768. You can only use this argument if the remote model that you specify in the model argument uses one of the following models as an endpoint:
  - text-embedding-004 or later
  - text-multilingual-embedding-002 or later
 
Details
The model and input table must be in the same region.
Open models
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT([FLATTEN_JSON_OUTPUT AS flatten_json_output])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a remote model over a supported open model. You can confirm the type of model by opening the Google Cloud console and looking at the Model type field on the model details page.
- TABLE_NAME: the name of the BigQuery table that contains a STRING column to embed. The text in the column that's named content is sent to the model. If your table doesn't have a content column, use a SELECT statement for this argument to provide an alias for an existing table column. An error occurs if no content column exists.
- QUERY_STATEMENT: a query whose result contains a STRING column that's named content. For information about the supported SQL syntax of the QUERY_STATEMENT clause, see GoogleSQL query syntax.
- FLATTEN_JSON_OUTPUT: a BOOL value that determines whether the JSON content returned by the function is parsed into separate columns. The default is TRUE.
Details
The model and input table must be in the same region.
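As a minimal sketch, assuming that mydataset.open_embedding is an existing remote model over a supported open model, you can generate embeddings as follows:

SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.open_embedding`,  # hypothetical remote model over an open model
  (SELECT 'Example text to embed' AS content),
  STRUCT(TRUE AS flatten_json_output));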
PCA
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) }
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a PCA model. You can confirm the type of model by opening the Google Cloud console and looking at the Model type field on the model details page.
- TABLE_NAME: the name of the BigQuery table that contains the input data for the PCA model.
- QUERY_STATEMENT: a query whose result contains the input data for the PCA model.
Details
The model and input table must be in the same region.
Autoencoder
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) },
  STRUCT([TRIAL_ID AS trial_id])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of an autoencoder model. You can confirm the type of model by opening the Google Cloud console and looking at the Model type field on the model details page.
- TABLE_NAME: the name of the BigQuery table that contains the input data for the autoencoder model.
- QUERY_STATEMENT: a query whose result contains the input data for the autoencoder model.
- TRIAL_ID: an INT64 value that identifies the hyperparameter tuning trial that you want the function to evaluate. The function uses the optimal trial by default. Only specify this argument if you ran hyperparameter tuning when creating the model. For a usage sketch, see the example after the Details section.
Details
The model and input table must be in the same region.
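For example, if you created the autoencoder model with hyperparameter tuning and want embeddings from a specific trial instead of the optimal one, a query like the following minimal sketch works; the model and table names are hypothetical:

SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.my_tuned_autoencoder`,  # hypothetical hyperparameter-tuned model
  TABLE `mydataset.input_data`,            # hypothetical input table
  STRUCT(2 AS trial_id));                  # use trial 2 instead of the optimal trial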
Matrix factorization
ML.GENERATE_EMBEDDING(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  STRUCT([TRIAL_ID AS trial_id])
)
Arguments
ML.GENERATE_EMBEDDING takes the following arguments:
- PROJECT_ID: the project that contains the resource.
- DATASET: the BigQuery dataset that contains the resource.
- MODEL_NAME: the name of a matrix factorization model. You can confirm the type of model by opening the Google Cloud console and looking at the Model type field on the model details page.
- TRIAL_ID: an INT64 value that identifies the hyperparameter tuning trial that you want the function to evaluate. The function uses the optimal trial by default. Only specify this argument if you ran hyperparameter tuning when creating the model.
Output
 multimodalembedding 
ML.GENERATE_EMBEDDING returns the input table and the following columns:
- ml_generate_embedding_result:
  - If flatten_json_output is FALSE, this is the JSON response from the projects.locations.endpoints.predict call to the model. The generated embeddings are in the textEmbedding, imageEmbedding, or videoEmbeddings element, depending on the type of input data you used.
  - If flatten_json_output is TRUE, this is an ARRAY<FLOAT64> value that contains the generated embeddings.
- ml_generate_embedding_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
- ml_generate_embedding_start_sec: for video content, an INT64 value that contains the starting second of the portion of the video that the embedding represents. For image content, the value is NULL. This column isn't returned for text content.
- ml_generate_embedding_end_sec: for video content, an INT64 value that contains the ending second of the portion of the video that the embedding represents. For image content, the value is NULL. This column isn't returned for text content.
 gemini-embedding-001 
ML.GENERATE_EMBEDDING returns the input table and the following columns:
- ml_generate_embedding_result:
  - If flatten_json_output is FALSE, this is the JSON response from the projects.locations.endpoints.predict call to the model. The generated embeddings are in the values element.
  - If flatten_json_output is TRUE, this is an ARRAY<FLOAT64> value that contains the generated embeddings.
- ml_generate_embedding_statistics: a JSON value that contains a token_count field with the number of tokens in the content, and a truncated field that indicates whether the content was truncated. This column is returned when flatten_json_output is TRUE. For a sketch of reading these fields, see the example after this list.
- ml_generate_embedding_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
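As a minimal sketch of pulling the statistics fields out of that JSON column, the following query assumes a hypothetical mydataset.embeddings table that stores ML.GENERATE_EMBEDDING output:

SELECT
  content,
  INT64(ml_generate_embedding_statistics.token_count) AS token_count,  # tokens in the input
  BOOL(ml_generate_embedding_statistics.truncated) AS truncated        # whether the input was cut off
FROM `mydataset.embeddings`;  # hypothetical table of stored ML.GENERATE_EMBEDDING output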
 text-embedding 
ML.GENERATE_EMBEDDING returns the input table and the following columns:
- ml_generate_embedding_result:
  - If flatten_json_output is FALSE, this is the JSON response from the projects.locations.endpoints.predict call to the model. The generated embeddings are in the values element.
  - If flatten_json_output is TRUE, this is an ARRAY<FLOAT64> value that contains the generated embeddings.
- ml_generate_embedding_statistics: a JSON value that contains a token_count field with the number of tokens in the content, and a truncated field that indicates whether the content was truncated. This column is returned when flatten_json_output is TRUE.
- ml_generate_embedding_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
 text-multilingual-embedding 
ML.GENERATE_EMBEDDING returns the input table and the following columns:
- ml_generate_embedding_result:
  - If flatten_json_output is FALSE, this is the JSON response from the projects.locations.endpoints.predict call to the model. The generated embeddings are in the values element.
  - If flatten_json_output is TRUE, this is an ARRAY<FLOAT64> value that contains the generated embeddings.
- ml_generate_embedding_statistics: a JSON value that contains a token_count field with the number of tokens in the content, and a truncated field that indicates whether the content was truncated. This column is returned when flatten_json_output is TRUE.
- ml_generate_embedding_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
Open models
ML.GENERATE_EMBEDDING returns the input table and the following columns:
- ml_generate_embedding_result:
  - If flatten_json_output is FALSE, this is the JSON response from the projects.locations.endpoints.predict call to the model. The generated embeddings are in the first element of the predictions array.
  - If flatten_json_output is TRUE, this is an ARRAY<FLOAT64> value that contains the generated embeddings.
- ml_generate_embedding_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
PCA
ML.GENERATE_EMBEDDING returns the input table and the following column:
- ml_generate_embedding_result: an ARRAY<FLOAT> value that contains the principal components for the input data. The number of array dimensions is equal to the PCA model's NUM_PRINCIPAL_COMPONENTS option value if that option is used when the model is created. If the PCA_EXPLAINED_VARIANCE_RATIO option is used instead, the array dimensions vary depending on the input table and the option ratio determined by BigQuery ML.
Autoencoder
ML.GENERATE_EMBEDDING returns the input table and the following column:
- trial_id: an INT64 value that identifies the hyperparameter tuning trial used by the function. This column is only returned if you ran hyperparameter tuning when creating the model.
- ml_generate_embedding_result: an ARRAY<FLOAT> value that contains the latent space dimensions for the input data. The number of array dimensions is equal to the number in the middle of the autoencoder model's HIDDEN_UNITS option array value.
Matrix factorization
ML.GENERATE_EMBEDDING returns the following columns:
- trial_id: an INT64 value that identifies the hyperparameter tuning trial used by the function. This column is only returned if you ran hyperparameter tuning when creating the model.
- ml_generate_embedding_result: an ARRAY<FLOAT> value that contains the weights of the feature, plus the intercept or bias term for the feature. The intercept value is the last value in the array. The number of array dimensions is equal to the matrix factorization model's NUM_FACTORS option value.
- processed_input: a STRING value that contains the name of the user or item column. The value of this column matches the name of the user or item column provided in the query_statement clause that was used when the matrix factorization model was trained.
- feature: a STRING value that contains the names of the specific users or items used during training.
Supported visual content
You can use the ML.GENERATE_EMBEDDING function to generate embeddings for
videos and images that meet the requirements described in
API limits.
There is no limit on the length of the video files that you can use
with this function. However, the function only processes the first two minutes
of a video. If a video is longer than two minutes, the
ML.GENERATE_EMBEDDING function returns embeddings for only the
first two minutes.
Known issues
Sometimes after a query job that uses this function finishes successfully, some returned rows contain the following error message:
A retryable error occurred: RESOURCE EXHAUSTED error from <remote endpoint>
This issue occurs because BigQuery query jobs finish successfully
even if the function fails for some of the rows. The function fails when the
volume of API calls to the remote endpoint exceeds the quota limits for that
service. This issue occurs most often when you are running multiple parallel
batch queries. BigQuery retries these calls, but if the retries
fail, the resource exhausted error message is returned.
To iterate through inference calls until all rows are successfully processed, you can use the BigQuery remote inference SQL scripts or the BigQuery remote inference pipeline Dataform package.
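As a minimal sketch of a manual retry, assuming the first pass wrote its output to a hypothetical mydataset.embeddings table, you can re-run the function on only the rows whose status column reports a failure:

SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.text_embedding`,  # hypothetical remote embedding model
  (
    # Retry only the rows that failed on the first pass; the
    # ml_generate_embedding_status column is empty for successful rows.
    SELECT content
    FROM `mydataset.embeddings`
    WHERE ml_generate_embedding_status <> ''
  ),
  STRUCT(TRUE AS flatten_json_output));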
Examples
 multimodalembedding 
This example shows how to generate embeddings from visual content by using a
remote model that references a multimodalembedding model.
Create the remote model:
CREATE OR REPLACE MODEL `mydataset.multimodalembedding`
  REMOTE WITH CONNECTION `us.test_connection`
  OPTIONS (ENDPOINT = 'multimodalembedding@001')
Use an ObjectRefRuntime value
Generate embeddings from visual content in an ObjectRef column
in a standard table:
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.multimodalembedding`,
  (
    SELECT OBJ.GET_ACCESS_URL(art_image, 'r') AS content
    FROM `mydataset.art`
  )
);
Use an object table
Generate embeddings from visual content in an object table:
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.multimodalembedding`,
  TABLE `mydataset.my_object_table`);
 text-embedding 
This example shows how to generate an embedding of a single piece of
sample text by using a remote model that references a
text-embedding model.
Create the remote model:
CREATE OR REPLACE MODEL `mydataset.text_embedding`
  REMOTE WITH CONNECTION `us.test_connection`
  OPTIONS (ENDPOINT = 'text-embedding-005')
Generate the embedding:
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.text_embedding`,
  (SELECT "Example text to embed" AS content),
  STRUCT(TRUE AS flatten_json_output)
);
 text-multilingual-embedding 
This example shows how to generate embeddings from a table and specify
a task type by using a remote model that references a
text-multilingual-embedding model.
Create the remote model:
CREATE OR REPLACE MODEL `mydataset.text_multi`
  REMOTE WITH CONNECTION `us.test_connection`
  OPTIONS (ENDPOINT = 'text-multilingual-embedding-002')
Generate the embeddings:
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.text_multi`,
  TABLE `mydataset.customer_feedback`,
  STRUCT(TRUE AS flatten_json_output, 'SEMANTIC_SIMILARITY' AS task_type)
);
PCA
This example shows how to generate embeddings that represent the principal components of a PCA model.
Create the PCA model:
CREATE OR REPLACE MODEL `mydataset.pca_nyc_trees`
  OPTIONS (
    MODEL_TYPE = 'PCA',
    PCA_EXPLAINED_VARIANCE_RATIO = 0.9) AS (
  SELECT
    tree_id,
    block_id,
    tree_dbh,
    stump_diam,
    curb_loc,
    status,
    health,
    spc_latin
  FROM `bigquery-public-data.new_york_trees.tree_census_2015`
);
Generate embeddings that represent principal components:
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.pca_nyc_trees`,
  (
    SELECT
      tree_id,
      block_id,
      tree_dbh,
      stump_diam,
      curb_loc,
      status,
      health,
      spc_latin
    FROM `bigquery-public-data.new_york_trees.tree_census_2015`
  ));
Autoencoder
This example shows how to generate embeddings that represent the latent space dimensions of an autoencoder model.
Create the autoencoder model:
CREATE OR REPLACE MODEL `mydataset.my_autoencoder_model`
  OPTIONS (
    model_type = 'autoencoder',
    activation_fn = 'relu',
    batch_size = 8,
    dropout = 0.2,
    hidden_units = [32, 16, 4, 16, 32],
    learn_rate = 0.001,
    l1_reg_activation = 0.0001,
    max_iterations = 10,
    optimizer = 'adam') AS
SELECT * EXCEPT (Time, Class)
FROM `bigquery-public-data.ml_datasets.ulb_fraud_detection`;
Generate embeddings that represent latent space dimensions:
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `mydataset.my_autoencoder_model`,
  TABLE `bigquery-public-data.ml_datasets.ulb_fraud_detection`);
Matrix factorization
This example shows how to generate embeddings that represent the underlying weights that the matrix factorization model uses during prediction.
Create the matrix factorization model:
CREATE OR REPLACE MODEL `mydataset.my_mf_model`
  OPTIONS (
    model_type = 'matrix_factorization',
    user_col = 'user_id',
    item_col = 'item_id',
    l2_reg = 9.83,
    num_factors = 34) AS
SELECT
  user_id,
  item_id,
  AVG(rating) AS rating
FROM movielens.movielens_1m
GROUP BY user_id, item_id;
Generate embeddings that represent model weights and intercepts:
SELECT *
FROM ML.GENERATE_EMBEDDING(MODEL `mydataset.my_mf_model`)
Locations
The ML.GENERATE_EMBEDDING function must run in the same
region or multi-region as the model that the
function references. For more information on supported regions for
embedding models, see
Google model endpoint locations.
Embedding models are also available in the US multi-region.
Quotas
Quotas apply when you use the ML.GENERATE_EMBEDDING function with remote
models. For more information, see Vertex AI and Cloud AI service
functions quotas and limits.
For the multimodalembedding model, the
default requests per minute (RPM) for non-EU regions is 600.
The default RPM for EU regions is 120. However, you can request a quota
increase in order to increase throughput.
To increase quota, first request more quota for the Vertex AI
multimodalembedding model by using the process described in
Manage your quota using the console.
When the model quota has been increased, send an email to
bqml-feedback@google.com and request a quota increase for the
ML.GENERATE_EMBEDDING function. Include information about the adjusted
multimodalembedding quota.
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- Try the Perform semantic search and retrieval-augmented generation tutorial to learn how to do the following tasks:
  - Generate text embeddings.
  - Create a vector index on the embeddings.
  - Perform a vector search with the embeddings to search for similar text.
  - Perform retrieval-augmented generation (RAG) by using vector search results to augment the prompt input and improve results.
- Try the Parse PDFs in a retrieval-augmented generation pipeline tutorial to learn how to create a RAG pipeline based on parsed PDF content. 
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview. 
- For more information about supported SQL statements and functions for generative AI models, see End-to-end user journeys for generative AI models.