This document provides troubleshooting steps for issues that you might encounter when using Gemini Live API.
Connection drops unexpectedly
If your session connection drops unexpectedly, it might be due to token limits, session connection timeouts, or network issues.
Behavior
Session disconnected with error code 1000 or 1006.
Possible reasons
- Context window compression isn't enabled and context token exceeds the session's context token limit (up to 128k).
- No session resumption logic is implemented or resumption logic isn't correctly implemented.
- Unstable internet connection.
Reasons and solutions
Tokens exceed the session's context token limit: To prevent exceeding the session's context token limit, activate context compression. This might impact the quality of the conversation, as the model will intermittently discard earlier parts of the chat history.
Session connection expires after 10 minutes: Manage session resumption to allow for longer interactions. For further details, consult the best practices for session resumption.
Unstable internet connection: Verify the condition of your internet connection, as fluctuations in stability may result in connectivity problems.
Model failed to understand the user
If the model doesn't seem to understand your input, ensure your audio is formatted correctly and consider your microphone quality and background noise.
Behavior
The model responds with irrelevant information, responds with incorrect information, or asks the user to repeat.
Possible reasons
- Input audio format isn't correct.
- Microphone quality isn't good.
- The background noise is too high.
Reasons and solutions
Input audio format isn't correct: Verify that the input audio uses a little-endian, 16-bit PCM format with a 16 kHz sampling rate and a single mono channel.
Microphone quality isn't good: Test the microphone quality by recording a short audio and playing it back. If the microphone quality isn't good, try to use a microphone with better quality.
Background noise is too high: Test the background noise level by recording a short audio and playing it back. If the background noise level is too high, try to move the microphone closer to the user or use a microphone with better noise cancellation.
Model isn't responding
If you aren't receiving a response from the model, check your voice activity detection options and WebSocket connection.
Behavior
No response from the model.
Possible reasons
- VAD settings aren't correctly set.
- WebSocket connection got interrupted.
Reasons and solutions
VAD set incorrectly: VAD is disabled by the user. In this case, the model will keep waiting for the user's speech and won't respond to the user. Make sure to send
ActivityStartandActivityEndevents to the model if the VAD is disabled.WebSocket connection got interrupted: If the WebSocket connection is interrupted, there will be no communication between the client and the server. Check the WebSocket connection status and make sure it is properly established.
Cannot interrupt the model
If you're unable to interrupt the model while it's speaking, ensure you're handling the playback buffer and streaming audio correctly.
Behavior
The model continues to speak without interruption from the user.
Possible reasons
- Failed to flush playback buffer.
- Failed to stream audio to Gemini Live API.
- Customized VAD isn't correctly implemented.
Reasons and solutions
Failed to flush playback buffer: The client should flush the playback buffer immediately when receiving an interrupt signal from the model. Otherwise, the model will keep talking.
Failed to stream audio to Gemini Live API: The client should stream audio to Gemini Live API in chunks between 20 ms and 40 ms to minimize latency. If the client fails to stream audio to Gemini Live API, the model won't send an interrupt signal to the client.
Customized VAD isn't correctly implemented: If the customized VAD fails to recognize the start of speech or the client fails to send
ActivityStartsignal to the model, the model won't send an interrupt signal to the client.