這個頁面說明如何使用同步語音辨識功能,將短音訊檔案轉錄為文字內容。
同步語音辨識會針對短音訊 (少於 60 秒) 傳回辨識出的文字。
您可以從本機檔案將音訊內容直接傳送至 Cloud Speech-to-Text,Cloud Speech-to-Text 也可以處理儲存在 Cloud Storage 值區中的音訊內容。如要瞭解同步語音辨識要求的限制,請參閱配額與限制頁面。
事前準備
- 登入 Google Cloud 帳戶。如果您是 Google Cloud新手,歡迎 建立帳戶,親自評估產品在實際工作環境中的成效。新客戶還能獲得價值 $300 美元的免費抵免額,可用於執行、測試及部署工作負載。
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
-
Create a project: To create a project, you need the Project Creator role
(
roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission. Learn how to grant roles.
-
Verify that billing is enabled for your Google Cloud project.
Enable the Speech-to-Text APIs.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (
roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enablepermission. Learn how to grant roles.-
Make sure that you have the following role or roles on the project: Cloud Speech Administrator
Check for the roles
-
In the Google Cloud console, go to the IAM page.
Go to IAM - Select the project.
-
In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
- For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
Grant the roles
-
In the Google Cloud console, go to the IAM page.
Go to IAM - Select the project.
- Click Grant access.
-
In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
- Click Select a role, then search for the role.
- To grant additional roles, click Add another role and add each additional role.
- Click Save.
-
-
安裝 Google Cloud CLI。
-
若您採用的是外部識別資訊提供者 (IdP),請先使用聯合身分登入 gcloud CLI。
-
執行下列指令,初始化 gcloud CLI:
gcloud init -
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
-
Create a project: To create a project, you need the Project Creator role
(
roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission. Learn how to grant roles.
-
Verify that billing is enabled for your Google Cloud project.
Enable the Speech-to-Text APIs.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (
roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enablepermission. Learn how to grant roles.-
Make sure that you have the following role or roles on the project: Cloud Speech Administrator
Check for the roles
-
In the Google Cloud console, go to the IAM page.
Go to IAM - Select the project.
-
In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
- For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
Grant the roles
-
In the Google Cloud console, go to the IAM page.
Go to IAM - Select the project.
- Click Grant access.
-
In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
- Click Select a role, then search for the role.
- To grant additional roles, click Add another role and add each additional role.
- Click Save.
-
-
安裝 Google Cloud CLI。
-
若您採用的是外部識別資訊提供者 (IdP),請先使用聯合身分登入 gcloud CLI。
-
執行下列指令,初始化 gcloud CLI:
gcloud init -
如果您使用本機殼層,請為使用者帳戶建立本機驗證憑證:
gcloud auth application-default login
如果您使用 Cloud Shell,則不需要執行這項操作。
如果系統傳回驗證錯誤,且您使用外部識別資訊提供者 (IdP),請確認您已 使用聯合身分登入 gcloud CLI。
用戶端程式庫可以使用應用程式預設憑證,輕鬆向 Google API 進行驗證,然後傳送要求給這些 API。有了應用程式預設憑證,您就能在本機測試應用程式並部署,不必變更基礎程式碼。詳情請參閱「 進行驗證以使用用戶端程式庫」一文。
此外,請務必安裝用戶端程式庫。
對本機檔案執行同步語音辨識
以下是對本機音訊檔案執行同步語音辨識的範例:
Python
from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech
# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
# Instantiates a client
client = SpeechClient()
# Reads a file as bytes
with open("resources/audio.wav", "rb") as f:
audio_content = f.read()
config = cloud_speech.RecognitionConfig(
auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
language_codes=["en-US"],
model="chirp_3",
)
request = cloud_speech.RecognizeRequest(
recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/_",
config=config,
content=audio_content,
)
# Transcribes the audio into text
response = client.recognize(request=request)
for result in response.results:
print(f"Transcript: {result.alternatives[0].transcript}")
對遠端檔案執行同步語音辨識
為方便起見,Speech-to-Text API 可以對位於 Cloud Storage 中的音訊檔案直接執行同步語音辨識,您無需在要求內容中傳送音訊檔案的內容。
Speech-to-Text 會使用服務帳戶存取 Cloud Storage 中的檔案。依預設,服務帳戶可存取同一個專案中的 Cloud Storage 檔案。
服務帳戶電子郵件地址如下:
service-PROJECT_NUMBER@gcp-sa-speech.iam.gserviceaccount.com
如要轉錄其他專案中的 Cloud Storage 檔案,請在其他專案中將 [Speech-to-Text 服務代理程式][speech-service-agent] 角色授予這個服務帳戶:
gcloud projects add-iam-policy-binding PROJECT_ID \
--member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-speech.iam.gserviceaccount.com \
--role=roles/speech.serviceAgent如要進一步瞭解專案 IAM 政策,請參閱 [管理專案、資料夾和機構的存取權][manage-access]。
您也可以授予服務帳戶特定 Cloud Storage bucket 的權限,提供更精細的存取權:
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
--member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-speech.iam.gserviceaccount.com \
--role=roles/storage.admin如要進一步瞭解如何管理 Cloud Storage 存取權,請參閱 Cloud Storage 說明文件中的 [建立及管理存取權控管清單][buckets-manage-acl]。
以下是對位於 Cloud Storage 中的檔案執行同步語音辨識的範例:
Python
from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech
# Instantiates a client
client = SpeechClient()
# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
config = cloud_speech.RecognitionConfig(
auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
language_codes=["en-US"],
model="chirp_3",
)
request = cloud_speech.RecognizeRequest(
recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/_",
config=config,
uri="gs://cloud-samples-data/speech/audio.flac", # URI of the audio file in Google Cloud Storage
)
# Transcribes the audio into text
response = client.recognize(request=request)
for result in response.results:
print(f"Transcript: {result.alternatives[0].transcript}")
清除所用資源
為避免系統向您的 Google Cloud 帳戶收取本頁面所用資源的費用,請按照下列步驟操作。
-
選用:撤銷您建立的驗證憑證,並刪除本機憑證檔案。
gcloud auth application-default revoke
-
選用:從 gcloud CLI 撤銷憑證。
gcloud auth revoke
控制台
gcloud
刪除 Google Cloud 專案:
gcloud projects delete PROJECT_ID