Gemini Enterprise offers an API that lets you generate podcasts based on source documents. The output is very similar to the podcasts that end users can generate from within their notebooks.
Podcast generation through the API is well suited for batch jobs where you might have dozens or hundreds of books, articles, or courses, and you want to generate a podcast for each.
The Podcast API is a standalone API. That is, you don't need a NotebookLM Enterprise notebook, a Gemini Enterprise license, or a data store. All you need is an enabled Google Cloud project and the Podcast API User role.
Inputs
The input for the API is an array of context elements. This is the source
material that the podcast gets generated from. The input can be in the form of
text, images, audio, and video. The total content of the context array must be
less than 100,000 tokens.
For a list of supported types, see the technical specifications for images, documents, video, and audio on this page about Gemini 2.5 Flash.
Output
The output from the API is the podcast, in MP3 format.
Before you begin
Before you can generate a podcast using the API, you must have the following:
A Google Cloud project with the Discovery Engine API enabled. See Create a project and enable the API.
The Identity and Access Management (IAM) role of Podcast API User (roles/discoveryengine.podcastApiUser). For general information about granting roles, see Set up NotebookLM Enterprise.
Generate a podcast from context input
Use the following command to generate a podcast by calling the podcast method.
The input is an array of multimedia objects such as text, images, and audio and video clips.
REST
To generate and export a podcast, do the following:
Run the following curl command:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/podcasts" \
  -d '{
    "podcastConfig": {
      "focus": "FOCUS",
      "length": "LENGTH",
      "languageCode": "LANGUAGE_CODE"
    },
    "contexts": [
      {
        "text": "TEXT_CONTENT"
      },
      {
        "inlineData": {
          "mimeType": "MIME_TYPE",
          "data": "BASE64_ENCODED_DATA"
        }
      }
    ],
    "title": "PODCAST_TITLE",
    "description": "PODCAST_DESCRIPTION"
  }'
Replace the following:
PROJECT_ID: the ID of your project.
FOCUS: a prompt where you suggest the focus of the podcast.
LENGTH: there are two options:
  SHORT (typically 4-5 minutes)
  STANDARD (typically around 10 minutes, but it can be shorter with smaller data sets)
LANGUAGE_CODE: optional. Specify the language code for the podcast. Use language tags defined by BCP47. If the language code isn't provided, then the podcast is generated in English.
TEXT_CONTENT: the text content to be included.
inlineData: an object for non-text media.
MIME_TYPE: the MIME type of the blob data (for example, "image/png").
BASE64_ENCODED_DATA: the base64-encoded raw bytes of the media data.
PODCAST_TITLE: a title for the podcast. This can be for internal use, or you can choose to display it to your end users.
PODCAST_DESCRIPTION: a description of the podcast. This can be for internal use, or you can choose to display it to your end users.
It takes a few minutes to generate a podcast.
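For batch jobs, it can help to assemble the request body in code instead of editing the JSON by hand. The following is a minimal Python sketch, not an official client. The helper names `inline_data_context` and `build_podcast_request_body` are hypothetical; the field names are copied from the curl command above. The sketch only builds the JSON body and doesn't send the request.

```python
import base64
import json


def inline_data_context(raw_bytes: bytes, mime_type: str) -> dict:
    """Wrap raw media bytes as an inlineData context element (hypothetical helper)."""
    return {
        "inlineData": {
            "mimeType": mime_type,
            # The request carries the raw bytes as a base64-encoded string.
            "data": base64.b64encode(raw_bytes).decode("ascii"),
        }
    }


def build_podcast_request_body(contexts, title, description, focus,
                               length="STANDARD", language_code=None) -> dict:
    """Build the JSON body for the podcasts request (hypothetical helper)."""
    # The two length options documented above.
    if length not in ("SHORT", "STANDARD"):
        raise ValueError("length must be SHORT or STANDARD")
    config = {"focus": focus, "length": length}
    # languageCode is optional; when omitted, the podcast is generated in English.
    if language_code:
        config["languageCode"] = language_code
    return {
        "podcastConfig": config,
        "contexts": list(contexts),
        "title": title,
        "description": description,
    }


if __name__ == "__main__":
    body = build_podcast_request_body(
        contexts=[
            {"text": "Chapter 1 of the course material."},
            inline_data_context(b"example-bytes", "image/png"),
        ],
        title="Course overview",
        description="Internal preview episode",
        focus="Summarize the key ideas for new learners",
        length="SHORT",
    )
    # The printed JSON can be passed to curl with -d @- or written to a file.
    print(json.dumps(body, indent=2))
```

Keeping body construction separate from the HTTP call makes it easy to loop over many source documents and to inspect each body before sending it.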
Make note of the operation name returned in the response; you need it to download the podcast in step 4. An operation name has the following form: projects/123456/locations/global/operations/create-podcast-54321.
Optional. Poll the status of the podcast creation operation. See Get details about a long-running operation.
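The optional polling step can be sketched as a small loop. This sketch rests on an assumption that isn't stated on this page: that the operation resource follows the usual Google Cloud long-running operation shape, with a boolean `done` field and an `error` field on failure. Confirm this against Get details about a long-running operation. The names `wait_for_operation` and `fetch_operation` are hypothetical; `fetch_operation` stands in for whatever authenticated GET call you use to read the operation.

```python
import time


def wait_for_operation(fetch_operation, operation_name,
                       interval_seconds=15.0, timeout_seconds=900.0,
                       sleep=time.sleep):
    """Poll until the operation reports done, then return it (hypothetical helper).

    fetch_operation: callable that takes the operation name and returns the
    operation as a dict. ASSUMPTION: the dict has a boolean "done" field and,
    on failure, an "error" field.
    """
    waited = 0.0
    while True:
        operation = fetch_operation(operation_name)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Podcast generation failed: {operation['error']}")
            return operation
        if waited >= timeout_seconds:
            raise TimeoutError(f"{operation_name} not done after {waited:.0f} seconds")
        # Generation takes a few minutes, so pause between checks.
        sleep(interval_seconds)
        waited += interval_seconds
```

Passing `fetch_operation` and `sleep` as parameters keeps the loop independent of any particular HTTP library and lets you exercise it with stubbed responses.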
After the operation has finished, run the following curl command to download the podcast:
curl -v \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://discoveryengine.googleapis.com/v1/OPERATION_NAME:download?alt=media" \
  --output FILENAME.mp3 -L
Replace the following:
OPERATION_NAME: the name of the operation that you noted down in step 2.
FILENAME: a filename for the podcast.
This command downloads the podcast to an MP3 file in your local directory.
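The same download can be scripted, which is convenient when you collect many operation names from a batch run. The following Python sketch mirrors the curl command above using only the standard library. The function names `build_download_request` and `download_podcast` are hypothetical, and the access token is assumed to come from elsewhere, for example the output of `gcloud auth print-access-token`.

```python
import shutil
import urllib.request

BASE_URL = "https://discoveryengine.googleapis.com/v1"


def build_download_request(operation_name: str, access_token: str) -> urllib.request.Request:
    """Build the GET request for the podcast download (hypothetical helper)."""
    # Same URL shape as the curl command: OPERATION_NAME:download?alt=media
    url = f"{BASE_URL}/{operation_name.strip('/')}:download?alt=media"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


def download_podcast(operation_name: str, access_token: str, filename: str) -> None:
    """Stream the MP3 to a local file (hypothetical helper)."""
    request = build_download_request(operation_name, access_token)
    # urlopen follows HTTP redirects by default, similar to curl's -L flag.
    with urllib.request.urlopen(request) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, `download_podcast(operation_name, token, "episode.mp3")` writes the podcast to `episode.mp3` in the current directory once the operation has finished.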
Compliance
The Podcast API isn't compliant with customer-managed encryption keys (CMEK) for Gemini Enterprise.