Gemini Embedding 2 is Google's embedding generation model that's ideal for complex retrieval and analytics tasks.
Gemini Embedding 2 accepts multimodal inputs (text, images, documents, audio, and video) and generates 3072-dimensional vectors that it maps into a single, unified semantic space. Because all modalities share that space, you can perform cross-modal tasks such as searching for an image based on a text description.
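To make the text-to-image search idea concrete, the following minimal sketch ranks stored image vectors against a text query vector by cosine similarity. It makes no API calls: the 4-dimensional vectors are toy stand-ins for real 3072-dimensional embeddings, and the file names and helper functions (`cosine_similarity`, `rank_by_similarity`) are hypothetical names chosen for illustration. Cosine similarity is a common choice for comparing embeddings, but it is an assumption here, not a documented requirement of the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidates):
    """Return candidate names sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in candidates.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Toy vectors standing in for embeddings of different modalities.
# In practice each vector would come from the embedding model.
text_query = [0.9, 0.1, 0.0, 0.2]  # e.g. the text "a dog on a beach"
stored_embeddings = {
    "dog_beach.jpg": [0.8, 0.2, 0.1, 0.3],
    "city_night.jpg": [0.1, 0.9, 0.4, 0.0],
    "invoice.pdf": [0.0, 0.1, 0.9, 0.1],
}

ranking = rank_by_similarity(text_query, stored_embeddings)
print(ranking[0])  # the stored item closest to the text query
```

Because the query and the stored items live in the same vector space, the same ranking function works regardless of which modality produced each vector.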
Gemini Embedding 2 introduces several features to optimize embedding quality and flexibility:
- Custom task instructions: Specify a task instruction, for example `task:code retrieval` or `task:search result`, to optimize the embeddings for the intended relationships and retrieve more accurate results for your specific goal.
- Adjustable result size: By default, the model generates a 3072-dimensional float vector. You can request a smaller output by specifying the `output_dimensionality` parameter.
- Document OCR: The model applies OCR to read text from document inputs.
- Audio track extraction: The model extracts audio tracks from video inputs and interleaves them with the video frames.
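The adjustable result size trades retrieval quality for storage and compute. The sketch below illustrates that trade-off offline under several assumptions: that a shorter vector can be obtained by keeping the leading dimensions of the full vector (the usual idea behind MRL, commonly expanded as Matryoshka Representation Learning), that the shortened vector should be re-normalized to unit length before cosine or dot-product comparison, and that each float is stored in 4 bytes. The target size of 768 and the helper names are hypothetical. In practice you would normally request the smaller size directly through `output_dimensionality` rather than truncating yourself.

```python
import math

FULL_DIM = 3072  # default output size of the model


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


def storage_bytes(dim, bytes_per_value=4):
    """Storage for one vector, assuming 4-byte (float32) values."""
    return dim * bytes_per_value


# Placeholder data only: this is NOT a real embedding.
full_vector = [math.sin(i + 1) for i in range(FULL_DIM)]
small_vector = truncate_and_normalize(full_vector, 768)

print(len(small_vector))        # 768
print(storage_bytes(FULL_DIM))  # 12288 bytes per full-size vector
print(storage_bytes(768))       # 3072 bytes per reduced vector
```

Under these assumptions, a 768-dimensional index needs a quarter of the storage of a full 3072-dimensional one.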
| Property | Details |
|---|---|
| Model ID | `gemini-embedding-2-preview` |
| Supported inputs & outputs | Inputs: text, images, documents, audio, video. Outputs: embedding vectors. |
| Token limits: Maximum sequence length | 8,192 tokens |
| Token limits: Output dimensions | Up to 3,072 (with MRL support) |
| Consumption options | See Consumption options for more information. |
| Technical specifications: Images | |
| Technical specifications: Documents | |
| Technical specifications: Video | |
| Technical specifications: Audio | |
| Technical specifications: Parameter defaults | |
| Supported regions: Model availability | See Deployments and endpoints for more information. |
| Knowledge cutoff date | November 2025 |
| Versions | |
| Supported languages | See Supported languages. |
| Pricing | See Pricing. |
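Long text inputs have to fit within the 8,192-token maximum sequence length. As a rough sketch of one way to prepare such inputs, the code below splits text into pieces that stay under a token budget. It rests on a loud assumption: tokens are approximated by whitespace-separated words, which is not how the model's tokenizer counts them. Real token counts are usually higher than word counts, so a production version would use the actual tokenizer or a lower budget. The function names are hypothetical.

```python
MAX_TOKENS = 8192  # maximum sequence length listed in the table


def estimate_tokens(text):
    """Hypothetical stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def chunk_text(text, max_tokens=MAX_TOKENS):
    """Split text into consecutive chunks of at most `max_tokens` words."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


# Synthetic document of 20,000 words to exercise the splitter.
long_document = " ".join(f"word{i}" for i in range(20000))
chunks = chunk_text(long_document)
print(len(chunks))                          # 3
print([estimate_tokens(c) for c in chunks])  # [8192, 8192, 3616]
```

Each chunk can then be embedded separately. How to combine or index the resulting vectors is a design choice outside the scope of this sketch.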