Speech models

Dialogflow CX voice agents use Speech-to-Text for speech recognition, which is included in Dialogflow CX pricing. Dialogflow CX automatically selects a speech recognition model for you, but you can optionally specify the model.

Available models

All available models are listed in the Speech-to-Text models documentation. Select the model that best suits your domain and supports your agent's language and speech features.

If a model is not explicitly specified, then Dialogflow CX auto-selects a model based on the audio configuration in API requests and agent settings.

If the enhanced speech model setting is enabled for the agent but no enhanced version of the specified model exists for the language, speech is recognized using the standard version of the specified model.

The following models typically have the best performance:

telephony_short (best for telephony Dialogflow CX)
telephony (best for Agent Assist)
phone_call (good for Agent Assist and telephony Dialogflow CX)
latest_short (best for non-telephony Dialogflow CX)
command_and_search (best for languages where other models are not available)
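The recommendations above can be sketched as a small lookup. This is only an illustration: the use-case keys, the `recommend_model` helper, and the `supported_models` argument (the set of models your agent language supports, which you would look up in the Speech-to-Text models list) are hypothetical names, not part of any Dialogflow CX API.

```python
from typing import Optional, Set

# Hypothetical use-case keys mapped to the models recommended in the list above.
RECOMMENDED_MODELS = {
    "telephony_cx": "telephony_short",
    "agent_assist": "telephony",
    "non_telephony_cx": "latest_short",
}

# Use cases for which phone_call is described as a good alternative.
PHONE_CALL_USE_CASES = {"telephony_cx", "agent_assist"}


def recommend_model(use_case: str, supported_models: Set[str]) -> Optional[str]:
    """Pick a model name for a use case, given the models the language supports.

    Returns None when nothing suitable is supported, in which case the model
    can be left unspecified so that Dialogflow CX auto-selects one.
    """
    preferred = RECOMMENDED_MODELS.get(use_case)
    if preferred in supported_models:
        return preferred
    if use_case in PHONE_CALL_USE_CASES and "phone_call" in supported_models:
        return "phone_call"
    if "command_and_search" in supported_models:
        return "command_and_search"
    return None


# Example: a language for which only two models are available.
print(recommend_model("telephony_cx", {"phone_call", "command_and_search"}))
```

Returning `None` rather than a default keeps the auto-selection behavior described earlier as the last resort.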

Specify a model

You can supply the model when calling the detectIntent or streamingDetectIntent methods on the Sessions type, or when configuring the ConversationProfile for Agent Assist.
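As a rough sketch of what supplying the model might look like, the snippet below builds a JSON request body for a detectIntent call using only the Python standard library. The field names (`queryInput.audio.config.model`, `audioEncoding`, `sampleRateHertz`, `languageCode`) are assumptions based on the Dialogflow CX v3 REST reference and should be checked against the current API documentation. The helper name, the mu-law encoding, and the 8000 Hz sample rate are illustrative choices only.

```python
import base64
import json


def build_detect_intent_body(audio: bytes, model: str,
                             language_code: str = "en-US",
                             sample_rate_hertz: int = 8000) -> dict:
    """Build a detectIntent request body that names a speech model explicitly.

    Field names are assumed from the v3 REST reference; verify before use.
    """
    return {
        "queryInput": {
            "audio": {
                "config": {
                    # Illustrative telephony-style audio settings.
                    "audioEncoding": "AUDIO_ENCODING_MULAW",
                    "sampleRateHertz": sample_rate_hertz,
                    # The speech model to use instead of auto-selection.
                    "model": model,
                },
                # Audio bytes are base64-encoded in JSON requests.
                "audio": base64.b64encode(audio).decode("ascii"),
            },
            "languageCode": language_code,
        }
    }


# Placeholder bytes stand in for real audio content.
body = build_detect_intent_body(b"\x00\x01", model="telephony_short")
print(json.dumps(body, indent=2))
```

Omitting the `model` key from the audio config would leave the choice to the auto-selection behavior described under Available models.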
