Deployments and endpoints

Google and partner models and generative AI features on Vertex AI are exposed through regional endpoints and a global endpoint. The global endpoint covers the entire world and provides higher availability and reliability than a single region.

Global endpoint

Selecting the global endpoint for your requests can improve overall availability while reducing resource exhausted (429) errors. Don't use the global endpoint if you have requirements about where ML processing happens (for example, data residency), because you can't control or know which region processes a given request.

Supported models

The global endpoint is supported for specific Google models. For the list of models that support the global endpoint, see the Global tab in the Google model endpoint locations table.

For information about global endpoint availability for partner models, see the Global tab in the Google Cloud partner model endpoint locations table.

Use the global endpoint

To use the global endpoint, omit the location from the endpoint host name and set the location of the resource to global. For example, the following is a global endpoint URL:

https://aiplatform.googleapis.com/v1/projects/test-project/locations/global/publishers/google/models/gemini-2.0-flash-001:generateContent
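As a rough illustration of the URL pattern above, the following sketch builds a request URL for either the global endpoint or a regional one. The helper name build_generate_content_url is hypothetical and not part of any SDK. It assumes the regional host name form LOCATION-aiplatform.googleapis.com and a Google publisher model.

```python
def build_generate_content_url(project: str, model: str, location: str = "global") -> str:
    """Return a generateContent URL for a Google publisher model (illustrative helper)."""
    # The global endpoint uses the bare host name; a regional endpoint
    # (assumed form) prefixes the host name with the region.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


# Reproduces the global endpoint URL shown above.
print(build_generate_content_url("test-project", "gemini-2.0-flash-001"))
```

Passing a region such as us-central1 as the third argument yields the regional form instead, which keeps the choice of endpoint in one place.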

For the Google Gen AI SDK, create a client that uses the global location:

# google-genai >= 0.8.0 is required
from google import genai

client = genai.Client(
    vertexai=True, project='PROJECT_ID', location='global'
)

For the Vertex AI SDK for Python, initialize the SDK using the global location:

# google-cloud-aiplatform >= 1.79.0 is required
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project='PROJECT_ID', location='global')

Limitations

The following capabilities are not available when using the global endpoint:

  • Tuning
  • Batch prediction for Anthropic and OpenMaaS models
  • Retrieval-augmented generation (RAG) corpus (RAG requests are supported)
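One way to account for these limitations in application code is to pick the location per capability. The following is a minimal sketch under stated assumptions: the capability labels and the function name location_for are made up for this example, and the set simply mirrors the list above.

```python
# Capabilities listed above as unavailable on the global endpoint.
# The string labels are hypothetical names used only in this sketch.
UNSUPPORTED_ON_GLOBAL = frozenset({
    "tuning",
    "batch_prediction_anthropic",
    "batch_prediction_open_maas",
    "rag_corpus",
})


def location_for(capability: str, regional_fallback: str) -> str:
    """Return 'global' unless the capability requires a regional endpoint."""
    if capability in UNSUPPORTED_ON_GLOBAL:
        return regional_fallback
    return "global"


print(location_for("tuning", "us-central1"))       # regional fallback
print(location_for("rag_request", "us-central1"))  # RAG requests can use global
```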

Usage of the global endpoint with Provisioned Throughput is available only for the following models:


Model                           Latest supported model version
Gemini 3 Flash preview          gemini-3-flash-preview
Gemini 3 Pro preview            gemini-3-pro-preview
Gemini 3 Pro Image preview      gemini-3-pro-image-preview
Gemini 2.5 Pro                  gemini-2.5-pro
Gemini 2.5 Flash preview        gemini-2.5-flash-preview-09-2025
Gemini 2.5 Flash-Lite preview   gemini-2.5-flash-lite-preview-09-2025
Gemini 2.5 Flash Image          gemini-2.5-flash-image
Gemini 2.5 Flash                gemini-2.5-flash
Gemini 2.5 Flash-Lite           gemini-2.5-flash-lite
Gemini 2.0 Flash                gemini-2.0-flash-001
Gemini 2.0 Flash-Lite           gemini-2.0-flash-lite-001
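If you route Provisioned Throughput traffic programmatically, a simple lookup can guard against sending an unsupported model to the global endpoint. This sketch transcribes the model versions from the table above into a set. The function name is hypothetical, and the list is only as current as this page.

```python
# Model versions transcribed from the table above; update when the table changes.
GLOBAL_PT_MODELS = frozenset({
    "gemini-3-flash-preview",
    "gemini-3-pro-preview",
    "gemini-3-pro-image-preview",
    "gemini-2.5-pro",
    "gemini-2.5-flash-preview-09-2025",
    "gemini-2.5-flash-lite-preview-09-2025",
    "gemini-2.5-flash-image",
    "gemini-2.5-flash",
    "gemini-2.5-flash-lite",
    "gemini-2.0-flash-001",
    "gemini-2.0-flash-lite-001",
})


def supports_global_provisioned_throughput(model_id: str) -> bool:
    """Return True if the table above lists this model version (illustrative helper)."""
    return model_id in GLOBAL_PT_MODELS


print(supports_global_provisioned_throughput("gemini-2.5-pro"))
```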

Google model endpoint locations

Google models in Vertex AI are available for the following endpoints:

Global

Global (global)
Gemini models
Gemini 3 Flash preview (gemini-3-flash-preview)
Gemini 3 Pro preview (gemini-3-pro-preview)
Gemini 3 Pro Image preview (gemini-3-pro-image-preview)
Gemini 2.5 Pro (gemini-2.5-pro)
Gemini 2.5 Flash preview (gemini-2.5-flash-preview-09-2025)
Gemini 2.5 Flash-Lite preview (gemini-2.5-flash-lite-preview-09-2025)
Gemini 2.5 Flash Image (gemini-2.5-flash-image)
Gemini 2.5 Flash (gemini-2.5-flash)
Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)
Gemini 2.5 Flash with Gemini Live API native audio (gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash with Gemini Live API native audio preview (gemini-live-2.5-flash-preview-native-audio-09-2025)
Gemini 2.0 Flash with Gemini Live API preview (gemini-2.0-flash-live-preview-04-09)
Gemini 2.0 Flash (gemini-2.0-flash)
Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)
Embeddings models
Gemini Embeddings (gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen on Vertex AI models
Imagen 3 (imagen-3.0-generate-002)
Imagen 3 (imagen-3.0-generate-001)
Imagen 3 Fast (imagen-3.0-fast-generate-001)
Imagen 3 Controlled Customization (imagen-3.0-capability-001)
Imagen 4 (imagen-4.0-generate-001)
Imagen 4 Fast (imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate (imagen-4.0-ultra-generate-001)
Veo on Vertex AI models
Veo 2 Generate (veo-2.0-generate-001)
Veo 2 Generate preview (veo-2.0-generate-exp)
Veo 2 Generate preview (veo-2.0-generate-preview)
Veo 3 Generate preview (veo-3.0-generate-preview)
Veo 3 Fast Generate preview (veo-3.0-fast-generate-preview)
Veo 3 Generate (veo-3.0-generate-001)
Veo 3 Fast Generate (veo-3.0-fast-generate-001)
Veo 3.1 Generate preview (veo-3.1-generate-preview)
Veo 3.1 Fast Generate preview (veo-3.1-fast-generate-preview)
Veo 3.1 Generate (veo-3.1-generate-001)
Veo 3.1 Fast Generate (veo-3.1-fast-generate-001)
Speech-to-Text and Text-to-Speech models
Chirp 3: Transcription (chirp_3)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Pro TTS (gemini-2.5-pro-tts)
Gemini 2.5 Flash TTS (gemini-2.5-flash-tts)
Gemini 2.5 Flash-Lite TTS preview (gemini-2.5-flash-lite-preview-tts)

United States

Oregon (us-west1), Las Vegas (us-west4), Iowa (us-central1), South Carolina (us-east1), N. Virginia (us-east4), Columbus (us-east5), Dallas (us-south1)
Gemini models
Gemini 3 Flash preview (gemini-3-flash-preview)
Gemini 3 Pro preview (gemini-3-pro-preview)
Gemini 3 Pro Image preview (gemini-3-pro-image-preview)
Gemini 2.5 Pro (gemini-2.5-pro)
Gemini 2.5 Flash preview (gemini-2.5-flash-preview-09-2025)
Gemini 2.5 Flash-Lite preview (gemini-2.5-flash-lite-preview-09-2025)
Gemini 2.5 Flash Image (gemini-2.5-flash-image)
Gemini 2.5 Flash (gemini-2.5-flash)
Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)
Gemini 2.5 Flash with Gemini Live API native audio (gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash with Gemini Live API native audio preview (gemini-live-2.5-flash-preview-native-audio-09-2025)
Gemini 2.0 Flash with Gemini Live API preview (gemini-2.0-flash-live-preview-04-09)
Gemini 2.0 Flash (gemini-2.0-flash)
Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)
Embeddings models
Gemini Embeddings (gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen on Vertex AI models
Imagen 3 (imagen-3.0-generate-002)
Imagen 3 (imagen-3.0-generate-001)
Imagen 3 Fast (imagen-3.0-fast-generate-001)
Imagen 3 Controlled Customization (imagen-3.0-capability-001)
Imagen 4 (imagen-4.0-generate-001)
Imagen 4 Fast (imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate (imagen-4.0-ultra-generate-001)
Veo on Vertex AI models
Veo 2 Generate (veo-2.0-generate-001)
Veo 2 Generate preview (veo-2.0-generate-exp)
Veo 2 Generate preview (veo-2.0-generate-preview)
Veo 3 Generate preview (veo-3.0-generate-preview)
Veo 3 Fast Generate preview (veo-3.0-fast-generate-preview)
Veo 3 Generate (veo-3.0-generate-001)
Veo 3 Fast Generate (veo-3.0-fast-generate-001)
Veo 3.1 Generate preview (veo-3.1-generate-preview)
Veo 3.1 Fast Generate preview (veo-3.1-fast-generate-preview)
Veo 3.1 Generate (veo-3.1-generate-001)
Veo 3.1 Fast Generate (veo-3.1-fast-generate-001)
Speech-to-Text and Text-to-Speech models
Chirp 3: Transcription (chirp_3)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Pro TTS (gemini-2.5-pro-tts)
Gemini 2.5 Flash TTS (gemini-2.5-flash-tts)
Gemini 2.5 Flash-Lite TTS preview (gemini-2.5-flash-lite-preview-tts)

Americas

Montréal (northamerica-northeast1), São Paulo (southamerica-east1)
Gemini models
Gemini 3 Flash preview (gemini-3-flash-preview)
Gemini 3 Pro preview (gemini-3-pro-preview)
Gemini 3 Pro Image preview (gemini-3-pro-image-preview)
Gemini 2.5 Pro (gemini-2.5-pro)
Gemini 2.5 Flash preview (gemini-2.5-flash-preview-09-2025)
Gemini 2.5 Flash-Lite preview (gemini-2.5-flash-lite-preview-09-2025)
Gemini 2.5 Flash Image (gemini-2.5-flash-image)
Gemini 2.5 Flash (gemini-2.5-flash)
Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)
Gemini 2.5 Flash with Gemini Live API native audio (gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash with Gemini Live API native audio preview (gemini-live-2.5-flash-preview-native-audio-09-2025)
Gemini 2.0 Flash with Gemini Live API preview (gemini-2.0-flash-live-preview-04-09)
Gemini 2.0 Flash (gemini-2.0-flash)
Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)
Embeddings models
Gemini Embeddings (gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen on Vertex AI models
Imagen 3 (imagen-3.0-generate-002)
Imagen 3 (imagen-3.0-generate-001)
Imagen 3 Fast (imagen-3.0-fast-generate-001)
Imagen 3 Controlled Customization (imagen-3.0-capability-001)
Imagen 4 (imagen-4.0-generate-001)
Imagen 4 Fast (imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate (imagen-4.0-ultra-generate-001)
Veo on Vertex AI models
Veo 2 Generate (veo-2.0-generate-001)
Veo 2 Generate preview (veo-2.0-generate-exp)
Veo 2 Generate preview (veo-2.0-generate-preview)
Veo 3 Generate preview (veo-3.0-generate-preview)
Veo 3 Fast Generate preview (veo-3.0-fast-generate-preview)
Veo 3 Generate (veo-3.0-generate-001)
Veo 3 Fast Generate (veo-3.0-fast-generate-001)
Veo 3.1 Generate preview (veo-3.1-generate-preview)
Veo 3.1 Fast Generate preview (veo-3.1-fast-generate-preview)
Veo 3.1 Generate (veo-3.1-generate-001)
Veo 3.1 Fast Generate (veo-3.1-fast-generate-001)
Speech-to-Text and Text-to-Speech models
Chirp 3: Transcription (chirp_3)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Pro TTS (gemini-2.5-pro-tts)
Gemini 2.5 Flash TTS (gemini-2.5-flash-tts)
Gemini 2.5 Flash-Lite TTS preview (gemini-2.5-flash-lite-preview-tts)

Europe

London (europe-west2), Belgium (europe-west1), Netherlands (europe-west4), Zürich (europe-west6), Frankfurt (europe-west3), Finland (europe-north1), Warsaw (europe-central2), Milan (europe-west8), Madrid (europe-southwest1), Paris (europe-west9)
Gemini models
Gemini 3 Flash preview (gemini-3-flash-preview)
Gemini 3 Pro preview (gemini-3-pro-preview)
Gemini 3 Pro Image preview (gemini-3-pro-image-preview)
Gemini 2.5 Pro (gemini-2.5-pro)
Gemini 2.5 Flash preview (gemini-2.5-flash-preview-09-2025)
Gemini 2.5 Flash-Lite preview (gemini-2.5-flash-lite-preview-09-2025)
Gemini 2.5 Flash Image (gemini-2.5-flash-image)
Gemini 2.5 Flash (gemini-2.5-flash)
Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)
Gemini 2.5 Flash with Gemini Live API native audio (gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash with Gemini Live API native audio preview (gemini-live-2.5-flash-preview-native-audio-09-2025)
Gemini 2.0 Flash with Gemini Live API preview (gemini-2.0-flash-live-preview-04-09)
Gemini 2.0 Flash (gemini-2.0-flash)
Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)
Embeddings models
Gemini Embeddings (gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen on Vertex AI models
Imagen 3 (imagen-3.0-generate-002)
Imagen 3 (imagen-3.0-generate-001)
Imagen 3 Fast (imagen-3.0-fast-generate-001)
Imagen 3 Controlled Customization (imagen-3.0-capability-001)
Imagen 4 (imagen-4.0-generate-001)
Imagen 4 Fast (imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate (imagen-4.0-ultra-generate-001)
Veo on Vertex AI models
Veo 2 Generate (veo-2.0-generate-001)
Veo 2 Generate preview (veo-2.0-generate-exp)
Veo 2 Generate preview (veo-2.0-generate-preview)
Veo 3 Generate preview (veo-3.0-generate-preview)
Veo 3 Fast Generate preview (veo-3.0-fast-generate-preview)
Veo 3 Generate (veo-3.0-generate-001)
Veo 3 Fast Generate (veo-3.0-fast-generate-001)
Veo 3.1 Generate preview (veo-3.1-generate-preview)
Veo 3.1 Fast Generate preview (veo-3.1-fast-generate-preview)
Veo 3.1 Generate (veo-3.1-generate-001)
Veo 3.1 Fast Generate (veo-3.1-fast-generate-001)
Speech-to-Text and Text-to-Speech models
Chirp 3: Transcription (chirp_3)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Pro TTS (gemini-2.5-pro-tts)
Gemini 2.5 Flash TTS (gemini-2.5-flash-tts)
Gemini 2.5 Flash-Lite TTS preview (gemini-2.5-flash-lite-preview-tts)

Asia Pacific

Mumbai (asia-south1), Singapore (asia-southeast1), Hong Kong (asia-east2), Taiwan (asia-east1), Tokyo (asia-northeast1), Sydney (australia-southeast1), Seoul (asia-northeast3)
Gemini models
Gemini 3 Flash preview (gemini-3-flash-preview)
Gemini 3 Pro preview (gemini-3-pro-preview)
Gemini 3 Pro Image preview (gemini-3-pro-image-preview)
Gemini 2.5 Pro (gemini-2.5-pro)
Gemini 2.5 Flash preview (gemini-2.5-flash-preview-09-2025)
Gemini 2.5 Flash-Lite preview (gemini-2.5-flash-lite-preview-09-2025)
Gemini 2.5 Flash Image (gemini-2.5-flash-image)
Gemini 2.5 Flash (gemini-2.5-flash)
Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)
Gemini 2.5 Flash with Gemini Live API native audio (gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash with Gemini Live API native audio preview (gemini-live-2.5-flash-preview-native-audio-09-2025)
Gemini 2.0 Flash with Gemini Live API preview (gemini-2.0-flash-live-preview-04-09)
Gemini 2.0 Flash (gemini-2.0-flash)
Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)
Embeddings models
Gemini Embeddings (gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen on Vertex AI models
Imagen 3 (imagen-3.0-generate-002)
Imagen 3 (imagen-3.0-generate-001)
Imagen 3 Fast (imagen-3.0-fast-generate-001)
Imagen 3 Controlled Customization (imagen-3.0-capability-001)
Imagen 4 (imagen-4.0-generate-001)
Imagen 4 Fast (imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate (imagen-4.0-ultra-generate-001)
Veo on Vertex AI models
Veo 2 Generate (veo-2.0-generate-001)
Veo 2 Generate preview (veo-2.0-generate-exp)
Veo 2 Generate preview (veo-2.0-generate-preview)
Veo 3 Generate preview (veo-3.0-generate-preview)
Veo 3 Fast Generate preview (veo-3.0-fast-generate-preview)
Veo 3 Generate (veo-3.0-generate-001)
Veo 3 Fast Generate (veo-3.0-fast-generate-001)
Veo 3.1 Generate preview (veo-3.1-generate-preview)
Veo 3.1 Fast Generate preview (veo-3.1-fast-generate-preview)
Veo 3.1 Generate (veo-3.1-generate-001)
Veo 3.1 Fast Generate (veo-3.1-fast-generate-001)
Speech-to-Text and Text-to-Speech models
Chirp 3: Transcription (chirp_3)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Pro TTS (gemini-2.5-pro-tts)
Gemini 2.5 Flash TTS (gemini-2.5-flash-tts)
Gemini 2.5 Flash-Lite TTS preview (gemini-2.5-flash-lite-preview-tts)

Middle East

Tel Aviv (me-west1), Doha (me-central1), Dammam (me-central2)
Gemini models
Gemini 3 Flash preview (gemini-3-flash-preview)
Gemini 3 Pro preview (gemini-3-pro-preview)
Gemini 3 Pro Image preview (gemini-3-pro-image-preview)
Gemini 2.5 Pro (gemini-2.5-pro)
Gemini 2.5 Flash preview (gemini-2.5-flash-preview-09-2025)
Gemini 2.5 Flash-Lite preview (gemini-2.5-flash-lite-preview-09-2025)
Gemini 2.5 Flash Image (gemini-2.5-flash-image)
Gemini 2.5 Flash (gemini-2.5-flash)
Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)
Gemini 2.5 Flash with Gemini Live API native audio (gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash with Gemini Live API native audio preview (gemini-live-2.5-flash-preview-native-audio-09-2025)
Gemini 2.0 Flash with Gemini Live API preview (gemini-2.0-flash-live-preview-04-09)
Gemini 2.0 Flash (gemini-2.0-flash)
Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)
Embeddings models
Gemini Embeddings (gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen on Vertex AI models
Imagen 3 (imagen-3.0-generate-002)
Imagen 3 (imagen-3.0-generate-001)
Imagen 3 Fast (imagen-3.0-fast-generate-001)
Imagen 3 Controlled Customization (imagen-3.0-capability-001)
Imagen 4 (imagen-4.0-generate-001)
Imagen 4 Fast (imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate (imagen-4.0-ultra-generate-001)
Veo on Vertex AI models
Veo 2 Generate (veo-2.0-generate-001)
Veo 2 Generate preview (veo-2.0-generate-exp)
Veo 2 Generate preview (veo-2.0-generate-preview)
Veo 3 Generate preview (veo-3.0-generate-preview)
Veo 3 Fast Generate preview (veo-3.0-fast-generate-preview)
Veo 3 Generate (veo-3.0-generate-001)
Veo 3 Fast Generate (veo-3.0-fast-generate-001)
Veo 3.1 Generate preview (veo-3.1-generate-preview)
Veo 3.1 Fast Generate preview (veo-3.1-fast-generate-preview)
Veo 3.1 Generate (veo-3.1-generate-001)
Veo 3.1 Fast Generate (veo-3.1-fast-generate-001)
Speech-to-Text and Text-to-Speech models
Chirp 3: Transcription (chirp_3)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Pro TTS (gemini-2.5-pro-tts)
Gemini 2.5 Flash TTS (gemini-2.5-flash-tts)
Gemini 2.5 Flash-Lite TTS preview (gemini-2.5-flash-lite-preview-tts)

Google Cloud partner model endpoint locations

Google serves requests from the region that you specified. For some models, Google also offers a global endpoint to improve overall availability and reduce error rates. The global endpoint can have a separate set of quotas from the regional endpoint and doesn't support data residency requirements. For more information, see the "Regional and global endpoint" section in Vertex AI partner models for MaaS.

Partner model endpoints for Generative AI on Vertex AI are available in the following regions:

Global

Global (global)
Anthropic models
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral models
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

United States

Oregon (us-west1), Las Vegas (us-west4), Iowa (us-central1), South Carolina (us-east1), N. Virginia (us-east4), Columbus (us-east5), Dallas (us-south1)
Anthropic models
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral models
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Americas

Montréal (northamerica-northeast1), São Paulo (southamerica-east1)
Anthropic models
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral models
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Europe

London (europe-west2), Belgium (europe-west1), Netherlands (europe-west4), Zürich (europe-west6), Frankfurt (europe-west3), Finland (europe-north1), Warsaw (europe-central2), Milan (europe-west8), Madrid (europe-southwest1), Paris (europe-west9)
Anthropic models
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral models
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Asia Pacific

Mumbai (asia-south1), Singapore (asia-southeast1), Hong Kong (asia-east2), Taiwan (asia-east1), Tokyo (asia-northeast1), Sydney (australia-southeast1), Seoul (asia-northeast3)
Anthropic models
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral models
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Middle East

Tel Aviv (me-west1), Doha (me-central1), Dammam (me-central2)
Anthropic models
Claude Opus 4.6
Claude Opus 4.5
Claude Sonnet 4.5
Claude Opus 4.1
Claude Haiku 4.5
Claude Opus 4
Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral models
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Google Cloud open model endpoint locations

Google serves requests from the region that you specified. For some models, Google also offers a global endpoint to improve overall availability and reduce error rates. The global endpoint can have a separate set of quotas from the regional endpoint and doesn't support data residency requirements. For more information, see the "Regional and global endpoint" section in Vertex AI open models for MaaS.

Open model endpoints for Generative AI on Vertex AI are available in the following regions:

Global

Global (global)
DeepSeek models
DeepSeek-OCR (deepseek-ocr-maas)
DeepSeek-V3.2 (deepseek-v3.2-maas)
DeepSeek-V3.1 (deepseek-v3.1-maas)
DeepSeek R1 (0528) (deepseek-r1-0528-maas)
ZAI.org models
GLM 4.7 (glm-4.7-maas)
GLM 5 (glm-5-maas)
OpenAI models
gpt-oss 120B (gpt-oss-120b-maas)
gpt-oss 20B (gpt-oss-20b-maas)
Moonshot AI models
Kimi K2 Thinking (kimi-k2-thinking-maas)
Llama models
Llama 3.3 70B
Llama 4 Maverick 17B-128E
Llama 4 Scout 17B-16E
MiniMax models
MiniMax M2 (minimax-m2-maas)
Qwen models
Qwen3-Next-80B Thinking (qwen3-next-80b-a3b-thinking-maas)
Qwen3-Next-80B Instruct (qwen3-next-80b-a3b-instruct-maas)
Qwen3 Coder (qwen3-coder-480b-a35b-instruct-maas)
Qwen3 235B (qwen3-235b-a22b-instruct-2507-maas)
e5 models
Multilingual E5 Small (multilingual-e5-small-maas)
Multilingual E5 Large (multilingual-e5-large-instruct-maas)

United States

Oregon (us-west1), Las Vegas (us-west4), Iowa (us-central1), South Carolina (us-east1), N. Virginia (us-east4), Columbus (us-east5), Dallas (us-south1)
DeepSeek models
DeepSeek-OCR (deepseek-ocr-maas)
DeepSeek-V3.2 (deepseek-v3.2-maas)
DeepSeek-V3.1 (deepseek-v3.1-maas)
DeepSeek R1 (0528) (deepseek-r1-0528-maas)
ZAI.org models
GLM 4.7 (glm-4.7-maas)
GLM 5 (glm-5-maas)
OpenAI models
gpt-oss 120B (gpt-oss-120b-maas)
gpt-oss 20B (gpt-oss-20b-maas)
Moonshot AI models
Kimi K2 Thinking (kimi-k2-thinking-maas)
Llama models
Llama 3.3 70B
Llama 4 Maverick 17B-128E
Llama 4 Scout 17B-16E
MiniMax models
MiniMax M2 (minimax-m2-maas)
Qwen models
Qwen3-Next-80B Thinking (qwen3-next-80b-a3b-thinking-maas)
Qwen3-Next-80B Instruct (qwen3-next-80b-a3b-instruct-maas)
Qwen3 Coder (qwen3-coder-480b-a35b-instruct-maas)
Qwen3 235B (qwen3-235b-a22b-instruct-2507-maas)
e5 models
Multilingual E5 Small (multilingual-e5-small-maas)
Multilingual E5 Large (multilingual-e5-large-instruct-maas)

Americas

Montréal (northamerica-northeast1), São Paulo (southamerica-east1)
DeepSeek models
DeepSeek-OCR (deepseek-ocr-maas)
DeepSeek-V3.2 (deepseek-v3.2-maas)
DeepSeek-V3.1 (deepseek-v3.1-maas)
DeepSeek R1 (0528) (deepseek-r1-0528-maas)
ZAI.org models
GLM 4.7 (glm-4.7-maas)
GLM 5 (glm-5-maas)
OpenAI models
gpt-oss 120B (gpt-oss-120b-maas)
gpt-oss 20B (gpt-oss-20b-maas)
Moonshot AI models
Kimi K2 Thinking (kimi-k2-thinking-maas)
Llama models
Llama 3.3 70B
Llama 4 Maverick 17B-128E
Llama 4 Scout 17B-16E
MiniMax models
MiniMax M2 (minimax-m2-maas)
Qwen models
Qwen3-Next-80B Thinking (qwen3-next-80b-a3b-thinking-maas)
Qwen3-Next-80B Instruct (qwen3-next-80b-a3b-instruct-maas)
Qwen3 Coder (qwen3-coder-480b-a35b-instruct-maas)
Qwen3 235B (qwen3-235b-a22b-instruct-2507-maas)
e5 models
Multilingual E5 Small (multilingual-e5-small-maas)
Multilingual E5 Large (multilingual-e5-large-instruct-maas)

Europe

London (europe-west2), Belgium (europe-west1), Netherlands (europe-west4), Zürich (europe-west6), Frankfurt (europe-west3), Finland (europe-north1), Warsaw (europe-central2), Milan (europe-west8), Madrid (europe-southwest1), Paris (europe-west9)
DeepSeek models
DeepSeek-OCR (deepseek-ocr-maas)
DeepSeek-V3.2 (deepseek-v3.2-maas)
DeepSeek-V3.1 (deepseek-v3.1-maas)
DeepSeek R1 (0528) (deepseek-r1-0528-maas)
ZAI.org models
GLM 4.7 (glm-4.7-maas)
GLM 5 (glm-5-maas)
OpenAI models
gpt-oss 120B (gpt-oss-120b-maas)
gpt-oss 20B (gpt-oss-20b-maas)
Moonshot AI models
Kimi K2 Thinking (kimi-k2-thinking-maas)
Llama models
Llama 3.3 70B
Llama 4 Maverick 17B-128E
Llama 4 Scout 17B-16E
MiniMax models
MiniMax M2 (minimax-m2-maas)
Qwen models
Qwen3-Next-80B Thinking (qwen3-next-80b-a3b-thinking-maas)
Qwen3-Next-80B Instruct (qwen3-next-80b-a3b-instruct-maas)
Qwen3 Coder (qwen3-coder-480b-a35b-instruct-maas)
Qwen3 235B (qwen3-235b-a22b-instruct-2507-maas)
e5 models
Multilingual E5 Small (multilingual-e5-small-maas)
Multilingual E5 Large (multilingual-e5-large-instruct-maas)

Asia Pacific

Mumbai (asia-south1), Singapore (asia-southeast1), Hong Kong (asia-east2), Taiwan (asia-east1), Tokyo (asia-northeast1), Sydney (australia-southeast1), Seoul (asia-northeast3)
DeepSeek models
DeepSeek-OCR (deepseek-ocr-maas)
DeepSeek-V3.2 (deepseek-v3.2-maas)
DeepSeek-V3.1 (deepseek-v3.1-maas)
DeepSeek R1 (0528) (deepseek-r1-0528-maas)
ZAI.org models
GLM 4.7 (glm-4.7-maas)
GLM 5 (glm-5-maas)
OpenAI models
gpt-oss 120B (gpt-oss-120b-maas)
gpt-oss 20B (gpt-oss-20b-maas)
Moonshot AI models
Kimi K2 Thinking (kimi-k2-thinking-maas)
Llama models
Llama 3.3 70B
Llama 4 Maverick 17B-128E
Llama 4 Scout 17B-16E
MiniMax models
MiniMax M2 (minimax-m2-maas)
Qwen models
Qwen3-Next-80B Thinking (qwen3-next-80b-a3b-thinking-maas)
Qwen3-Next-80B Instruct (qwen3-next-80b-a3b-instruct-maas)
Qwen3 Coder (qwen3-coder-480b-a35b-instruct-maas)
Qwen3 235B (qwen3-235b-a22b-instruct-2507-maas)
e5 models
Multilingual E5 Small (multilingual-e5-small-maas)
Multilingual E5 Large (multilingual-e5-large-instruct-maas)

Middle East

Tel Aviv (me-west1), Doha (me-central1), Dammam (me-central2)
DeepSeek models
DeepSeek-OCR (deepseek-ocr-maas)
DeepSeek-V3.2 (deepseek-v3.2-maas)
DeepSeek-V3.1 (deepseek-v3.1-maas)
DeepSeek R1 (0528) (deepseek-r1-0528-maas)
ZAI.org models
GLM 4.7 (glm-4.7-maas)
GLM 5 (glm-5-maas)
OpenAI models
gpt-oss 120B (gpt-oss-120b-maas)
gpt-oss 20B (gpt-oss-20b-maas)
Moonshot AI models
Kimi K2 Thinking (kimi-k2-thinking-maas)
Llama models
Llama 3.3 70B
Llama 4 Maverick 17B-128E
Llama 4 Scout 17B-16E
MiniMax models
MiniMax M2 (minimax-m2-maas)
Qwen models
Qwen3-Next-80B Thinking (qwen3-next-80b-a3b-thinking-maas)
Qwen3-Next-80B Instruct (qwen3-next-80b-a3b-instruct-maas)
Qwen3 Coder (qwen3-coder-480b-a35b-instruct-maas)
Qwen3 235B (qwen3-235b-a22b-instruct-2507-maas)
e5 models
Multilingual E5 Small (multilingual-e5-small-maas)
Multilingual E5 Large (multilingual-e5-large-instruct-maas)

What's next