Transcribe speech to text by using the gcloud CLI
This page shows you how to send a speech recognition request to Cloud Speech-to-Text by using the gcloud tool from the command line.
Cloud Speech-to-Text enables easy integration of Google speech recognition technologies into developer applications. You can send audio data to the Cloud Speech-to-Text API, which then returns a text transcription of that audio file. For more information about the service, see Cloud STT basics.
Before you begin
Before you can send a request to the Cloud Speech-to-Text API, you must complete the following actions.
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- Install the Google Cloud CLI.
- If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- To initialize the gcloud CLI, run the following command:
  gcloud init
- Create or select a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role. You can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
  - Create a Google Cloud project:
    gcloud projects create PROJECT_ID
    Replace PROJECT_ID with a name for the Google Cloud project you are creating.
  - Select the Google Cloud project that you created:
    gcloud config set project PROJECT_ID
    Replace PROJECT_ID with your Google Cloud project name.
- If you're using an existing project for this guide, verify that you have the permissions required to complete this guide. If you created a new project, then you already have the required permissions.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Cloud Speech-to-Text API:
  Roles required to enable APIs
  To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
  gcloud services enable speech.googleapis.com
- Optional: Create a new Cloud Storage bucket to store your audio data. For more information, see Create a Cloud Storage bucket.
For more information about enabling the API, see Set up Cloud Speech-to-Text for your Google Cloud project.
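The project setup commands above can be chained into one shell function. The following is a minimal sketch, not an official script: the function name setup_speech_project is hypothetical, and it assumes that you have already run gcloud init and that you verify billing separately, because those steps are interactive or manual.

```shell
# Hypothetical helper (not part of the gcloud CLI): run the project setup
# commands from this page in order, stopping at the first one that fails.
#   $1: the ID of the project to create (the PROJECT_ID placeholder)
setup_speech_project() {
  project_id="$1"
  if [ -z "$project_id" ]; then
    echo "usage: setup_speech_project PROJECT_ID" >&2
    return 2
  fi
  # Create the project, make it the active project, then enable the API.
  gcloud projects create "$project_id" &&
    gcloud config set project "$project_id" &&
    gcloud services enable speech.googleapis.com
}

# Usage (requires an installed and initialized gcloud CLI):
#   setup_speech_project PROJECT_ID
```

Chaining with && means that a failed project creation does not change your active project or enable the API in a different project.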
Required roles
To ensure that Cloud Composer Service Agent has the necessary permissions to run Cloud Speech-to-Text, ask your administrator to grant Cloud Composer Service Agent the Service Account Token Creator (roles/iam.serviceAccountTokenCreator) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
Your administrator might also be able to give Cloud Composer Service Agent the required permissions through custom roles or other predefined roles.
To get the permissions that you need to store audio in Cloud Storage, ask your administrator to grant you the Storage Object Viewer (roles/storage.objectViewer) IAM role on the Cloud Storage bucket.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
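If your administrator works from the command line, granting the Storage Object Viewer role on a bucket might look like the following sketch. The function name grant_object_viewer and both of its arguments are hypothetical placeholders; the sketch assumes that the administrator is allowed to change the bucket's IAM policy.

```shell
# Hypothetical helper (not part of the gcloud CLI): grant the Storage Object
# Viewer role on one bucket to one principal.
#   $1: bucket name without the gs:// prefix (placeholder: BUCKET_NAME)
#   $2: principal to grant the role to, for example user:EMAIL_ADDRESS
grant_object_viewer() {
  bucket_name="$1"
  member="$2"
  if [ -z "$bucket_name" ] || [ -z "$member" ]; then
    echo "usage: grant_object_viewer BUCKET_NAME MEMBER" >&2
    return 2
  fi
  gcloud storage buckets add-iam-policy-binding "gs://${bucket_name}" \
    --member="$member" \
    --role="roles/storage.objectViewer"
}

# Usage (run by an administrator):
#   grant_object_viewer BUCKET_NAME user:EMAIL_ADDRESS
```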
Make an audio transcription request
Use Cloud STT to transcribe an audio file to text. Use the following code sample to send a recognize request to the Cloud Speech-to-Text API.
Open the command line shell and run the following command.
gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac \
  --language-code=en-US
This command requests that Cloud STT transcribe the audio contained in a FLAC file hosted at a publicly accessible Cloud Storage location.
If the request is successful, the server returns a response in JSON format:
{
"results": [
{
"alternatives": [
{
"confidence": 0.9840146,
"transcript": "how old is the Brooklyn Bridge"
}
]
}
]
}
Congratulations! You sent your first request to Cloud STT.
If you receive an error or an empty response from Cloud STT, take a look at the troubleshooting and error mitigation steps.
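If you only need the transcribed text, you can filter the JSON response in the shell. The following is a rough sketch: extract_transcripts is a hypothetical helper, not a gcloud feature, and it assumes that transcripts contain no escaped double quotes. For anything beyond a quick check, a real JSON parser is a safer choice.

```shell
# Hypothetical helper (not part of the gcloud CLI): read a recognize response
# in JSON from stdin and print each "transcript" value on its own line.
# Assumption: transcript strings contain no escaped double quotes.
extract_transcripts() {
  grep -o '"transcript"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed -e 's/^"transcript"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}

# Usage (requires the gcloud CLI and the setup steps on this page):
#   gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac \
#     --language-code=en-US --format=json | extract_transcripts
```

The --format=json flag in the usage line is there only to make the output format explicit for the filter.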
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, delete the Google Cloud project with the resources.
- Use the Google Cloud console to delete your project if you don't need it.
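As an alternative to the console, the gcloud CLI can also delete a project. The wrapper below is a minimal sketch; the name delete_project is hypothetical, and it assumes that you have permission to delete the project. Deleting a project removes the resources in it, so check the project ID before you run it.

```shell
# Hypothetical wrapper (not part of the gcloud CLI): delete a project by ID,
# refusing to run when no ID is given.
#   $1: the ID of the project to delete (the PROJECT_ID placeholder)
delete_project() {
  project_id="$1"
  if [ -z "$project_id" ]; then
    echo "usage: delete_project PROJECT_ID" >&2
    return 2
  fi
  gcloud projects delete "$project_id"
}

# Usage:
#   delete_project PROJECT_ID
```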
What's next
- Practice transcribing short audio files.
- Learn how to batch long audio files for speech recognition.
- Learn how to transcribe streaming audio, such as audio from a microphone.
- Get started with Cloud STT in your language of choice by using a Cloud STT client library.
- Work through the sample applications.
- For best performance, accuracy, and other tips, see the best practices documentation.