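Improve transcription results with model adaptation

Overview

You can use model adaptation to help Speech-to-Text recognize specific words or phrases more often than the other options it might otherwise suggest. For example, suppose that your audio data often includes the word "weather". When Speech-to-Text encounters the word "weather", you want it to transcribe the word as "weather" more often than as "whether". In this case, you can use model adaptation to bias Speech-to-Text toward recognizing "weather".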

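Model adaptation is particularly helpful in the following use cases:

Improving the accuracy of words and phrases that occur frequently in your audio data. For example, you can alert the recognition model to voice commands that your users typically speak.
Expanding the vocabulary of words that Speech-to-Text recognizes. Speech-to-Text includes a very large vocabulary. However, if your audio data often contains words that are rare in general language use (such as proper names or domain-specific words), you can add them by using model adaptation.
Improving the accuracy of speech transcription when the supplied audio contains noise or is not very clear.

To see whether model adaptation is available for your language, see the language support page.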

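Improve recognition of words and phrases

To increase the probability that Speech-to-Text recognizes the word "weather" when it transcribes your audio data, you can pass the single word "weather" in the PhraseSet object of a SpeechAdaptation resource.

When you provide a multi-word phrase, Speech-to-Text is more likely to recognize those words in sequence. Providing a phrase also increases the probability of recognizing portions of the phrase, including individual words. See the content limits page for limits on the number and size of these phrases.

Optionally, you can fine-tune the strength of model adaptation by using the model adaptation boost feature.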
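
As a rough illustration, the following Python sketch builds a recognize request body that passes "weather" as an inline phrase. The field names (adaptation, phraseSets, phrases, value) mirror the JSON requests shown later on this page. The helper name and the audio URI are hypothetical, and nothing is sent over the network.

```python
import json


def build_request(phrases, audio_uri, language_code="en-US"):
    """Return a recognize request body that biases toward the given phrases."""
    return {
        "config": {
            "languageCode": language_code,
            "adaptation": {
                "phraseSets": [
                    {"phrases": [{"value": phrase} for phrase in phrases]}
                ]
            },
        },
        "audio": {"uri": audio_uri},
    }


# Hypothetical bucket and file name, used only for illustration.
request_body = build_request(["weather"], "gs://my-bucket/forecast.wav")
print(json.dumps(request_body, indent=2))
```

Passing a multi-word phrase works the same way: each string in the list becomes one entry in the phrases array.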

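Improve recognition using classes

Classes represent common concepts that occur in natural language, such as monetary units and calendar dates. A class lets you improve transcription accuracy for large groups of words that map to a common concept but that don't always include identical words or phrases.

For example, suppose that your audio data includes recordings of people saying their street address. You might have a recording of someone saying "My house is 123 Main Street, the fourth house on the left." In this case, you want Speech-to-Text to recognize the first sequence of numerals ("123") as an address rather than as an ordinal ("one-hundred twenty-third"). However, not everyone lives at "123 Main Street", and it is impractical to list every possible street address in a PhraseSet resource. Instead, you can use a class to indicate that a street number should be recognized as such, whatever the actual number is. In this example, Speech-to-Text can then more accurately transcribe phrases such as "123 Main Street" and "987 Grand Boulevard", because both are recognized as address numbers.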

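Class tokens

To use a class in model adaptation, include a class token in the phrases field of a PhraseSet resource. Refer to the list of supported class tokens to see which tokens are available for your language. For example, to improve the transcription of address numbers in your source audio, provide the value $ADDRESSNUM in your SpeechContext object.

You can include a class in the phrases array as a stand-alone item, or you can embed one or more class tokens in a longer multi-word phrase. For example, you can indicate an address number in a longer phrase by including the class token in a string: ["my address is $ADDRESSNUM"]. However, this phrase does not help when the audio contains a similar but non-identical phrase, such as "I am at 123 Main Street". To aid recognition of similar phrases, it is important to also include the class token by itself: ["my address is $ADDRESSNUM", "$ADDRESSNUM"]. If you use an invalid or malformed class token, Speech-to-Text ignores the token without triggering an error, but still uses the rest of the phrase for context.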
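
The advice to list each class token on its own can be automated. The following sketch is one possible way to do it: it scans phrases for class tokens and appends any token that is not already present as a stand-alone item. The helper name and the regular expression for token names (a dollar sign followed by uppercase letters, digits, or underscores) are assumptions made for this example, not part of the API.

```python
import re

# Assumed token shape: "$" followed by an uppercase identifier, e.g. $ADDRESSNUM.
CLASS_TOKEN = re.compile(r"\$[A-Z][A-Z0-9_]*")


def with_standalone_tokens(phrases):
    """Return the phrases plus every embedded class token as its own item."""
    result = list(phrases)
    for phrase in phrases:
        for token in CLASS_TOKEN.findall(phrase):
            if token not in result:
                result.append(token)
    return result


phrases = with_standalone_tokens(["my address is $ADDRESSNUM"])
print(phrases)  # ['my address is $ADDRESSNUM', '$ADDRESSNUM']
```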

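Custom classes

You can also create your own CustomClass, a class composed of your own list of related items or values. For example, suppose that you want to transcribe audio data that is likely to include the name of any one of several hundred regional restaurants. Restaurant names are relatively rare in general speech, so the recognition model is less likely to choose them as the "correct" answer. With a custom class, you can bias the recognition model toward correctly identifying these names when they appear in your audio.

To use a custom class, create a CustomClass resource that includes each restaurant name as a ClassItem. Custom classes function in the same way as the prebuilt class tokens. A phrase can include both prebuilt class tokens and custom classes.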
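
The sketch below shows what such a custom class and a phrase that refers to it might look like as JSON payloads. It assumes the ${...} reference syntax and the items field used in the requests later on this page. The restaurant names, the class ID "regional-restaurants", and the helper name are hypothetical.

```python
import json


def custom_class_reference(project_id, class_id, location="global"):
    """Wrap a CustomClass resource name in ${...} so a phrase can refer to it."""
    name = f"projects/{project_id}/locations/{location}/customClasses/{class_id}"
    return "${" + name + "}"


# Hypothetical restaurant names; each one becomes an item of the custom class.
restaurants = ["Sushi Ichiban", "Casa Lupita", "The Copper Kettle"]
class_items = {"items": [{"value": name} for name in restaurants]}

# A phrase that combines the custom class with a prebuilt class token.
reference = custom_class_reference("project_id", "regional-restaurants")
phrase = {"value": f"deliver from {reference} to $ADDRESSNUM"}

print(json.dumps(class_items, indent=2))
print(json.dumps(phrase))
```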

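ABNF grammars

You can also use grammars in augmented Backus-Naur form (ABNF) to specify patterns of words. Including an ABNF grammar in your request's model adaptation increases the probability that Speech-to-Text recognizes all words that match the specified grammar.

To use this feature, include an ABNF grammar object in your request's SpeechAdaptation field. ABNF grammars can also include references to CustomClass and PhraseSet resources. To learn more about the syntax for this field, see the Speech Recognition Grammar Specification and the code sample later on this page.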

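Fine-tune transcription results using boost

By default, model adaptation should already provide a sufficient effect in most cases. The model adaptation boost feature lets you increase the recognition model bias by assigning more weight to some phrases than to others. We recommend that you implement boost only if 1) you have already implemented model adaptation, and 2) you want to further adjust the strength of model adaptation effects on your transcription results.

For example, suppose that you have many recordings of people asking about the "fare to get into the county fair", with the word "fair" occurring more frequently than "fare". In this case, you can use model adaptation to increase the probability that the model recognizes both "fair" and "fare" by adding them as phrases in a PhraseSet resource. Speech-to-Text then recognizes "fair" and "fare" more often than, for example, "hare" or "lair".

However, "fair" should be recognized more often than "fare", because it appears more frequently in the audio. You might have already transcribed your audio with the Speech-to-Text API and found a high number of errors in recognizing the correct word ("fair"). In this case, you might also want to use boost to assign a higher boost value to "fair" than to "fare". The higher weight assigned to "fair" biases the Speech-to-Text API toward picking "fair" more often than "fare". Without boost values, the recognition model recognizes "fair" and "fare" with equal probability.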
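
To make the "fair" and "fare" scenario concrete, here is a small sketch of an adaptation block in which "fair" carries more weight than "fare". The boost values 15 and 5 are illustrative assumptions, not recommended settings, and the helper name is hypothetical.

```python
import json


def phrases_with_boost(weights):
    """Turn a {phrase: boost} mapping into a list of phrase entries."""
    return [{"value": value, "boost": boost} for value, boost in weights.items()]


# Illustrative values only: "fair" is weighted above "fare" because it
# occurs more often in the recordings described above.
adaptation = {"phraseSets": [{"phrases": phrases_with_boost({"fair": 15, "fare": 5})}]}
print(json.dumps({"adaptation": adaptation}, indent=2))
```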

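Boost basics

When you use boost, you assign a weighted value to phrase items in a PhraseSet resource. Speech-to-Text refers to this weighted value when it selects a possible transcription for words in your audio data. The higher the boost value, the more likely Speech-to-Text is to choose that word or phrase from the possible alternatives.

For example, suppose that you want to assign a boost value to the phrase "My favorite exhibit at the American Museum of Natural History is the blue whale." If you add that phrase to a phrase object and assign a boost value, the recognition model is more likely to recognize the phrase in its entirety, word for word.

If boosting a multi-word phrase does not give you the results you are looking for, we suggest that you add all bigrams (2 words, in order) that make up the phrase as additional phrase items and assign a boost value to each. Continuing the example above, you could try adding further bigrams and N-grams (more than 2 words), such as "my favorite", "my favorite exhibit", "favorite exhibit", "my favorite exhibit at the American Museum of Natural History", "American Museum of Natural History", "blue whale", and so on. The Speech-to-Text recognition model is then more likely to recognize related phrases in your audio that contain parts of the original boosted phrase but don't match it word for word.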
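
Listing every bigram by hand is tedious, so it can help to generate them. The following sketch splits a phrase on whitespace and emits the full phrase plus each bigram as a phrase item. The shortened example phrase, the shared boost value of 10, and the function names are assumptions for illustration.

```python
def ngrams(phrase, n):
    """Return every run of n consecutive words in the phrase, in order."""
    words = phrase.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def bigram_phrases(phrase, boost):
    """Return the full phrase plus each of its bigrams, all with one boost."""
    values = [phrase] + ngrams(phrase, 2)
    return [{"value": value, "boost": boost} for value in values]


phrase = "my favorite exhibit is the blue whale"
for item in bigram_phrases(phrase, boost=10):
    print(item)
```

The same ngrams helper can produce longer N-grams by passing a larger n, for example ngrams(phrase, 3) for trigrams.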

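Setting boost values

Boost values must be float values greater than 0. The practical maximum for boost values is 20. For best results, experiment by adjusting your boost values up or down until you get accurate transcription results.

Higher boost values can result in fewer false negatives, which are cases where a word or phrase occurred in the audio but Speech-to-Text did not recognize it correctly. However, boost can also increase the likelihood of false positives, which are cases where a word or phrase appears in the transcription even though it did not occur in the audio.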
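
A small client-side check can catch boost values outside the range described above before a request is sent. This is a sketch under the stated limits (greater than 0, practical maximum of 20). The function name and the choice to return a warning rather than reject values above 20 are assumptions of this example.

```python
PRACTICAL_MAX_BOOST = 20.0  # practical upper limit stated above


def check_boost(value):
    """Validate a boost value and return (boost, warning or None)."""
    boost = float(value)
    if boost <= 0:
        raise ValueError(f"boost must be greater than 0, got {boost}")
    if boost > PRACTICAL_MAX_BOOST:
        return boost, f"boost {boost} exceeds the practical maximum of {PRACTICAL_MAX_BOOST}"
    return boost, None


print(check_boost(10))
print(check_boost(25))
```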

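Receive timeout notifications

Speech-to-Text responses include a SpeechAdaptationInfo field, which provides information about model adaptation behavior during recognition. If a timeout related to model adaptation occurred, adaptationTimeout is true and timeoutMessage specifies which adaptation configuration caused the timeout. When a timeout occurs, model adaptation has no effect on the returned transcript.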
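
The following sketch shows one way a client might check a parsed JSON response for an adaptation timeout. It assumes that the field appears under the key "speechAdaptationInfo" in the REST response. The sample response is written by hand for illustration and is not real service output.

```python
import json


def adaptation_timeout_message(response):
    """Return the timeout message if model adaptation timed out, else None."""
    info = response.get("speechAdaptationInfo", {})
    if info.get("adaptationTimeout"):
        return info.get("timeoutMessage", "")
    return None


# Hypothetical response fragment, written by hand for illustration.
sample = json.loads("""
{
  "results": [],
  "speechAdaptationInfo": {
    "adaptationTimeout": true,
    "timeoutMessage": "example timeout message"
  }
}
""")

message = adaptation_timeout_message(sample)
if message is not None:
    print("Model adaptation was not applied:", message)
```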

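Example use case using model adaptation

The following example walks you through using model adaptation to transcribe an audio recording of someone saying "call me fionity and oh my gosh what do we have here ionity". In this case, it is important that the model identifies "fionity" and "ionity" correctly.

The following command performs recognition on the audio without model adaptation. The resulting transcription is incorrect: "call me Fiona tea and oh my gosh what do we have here I own a day".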

   curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
     -d '{"config": {"languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Example request:

     {
        "config":{
           "languageCode":"en-US"
        },
        "audio":{
           "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
        }
     }
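
For readers who prefer Python to curl, the following sketch builds the same POST request with the standard library. It only constructs the request object and does not send it. The access token is a placeholder that you would replace with the output of the gcloud command above, and the function name is hypothetical.

```python
import json
import urllib.request

RECOGNIZE_URL = "https://speech.googleapis.com/v1p1beta1/speech:recognize"


def build_recognize_request(access_token, body):
    """Build (but do not send) the POST request that the curl command issues."""
    return urllib.request.Request(
        RECOGNIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


body = {
    "config": {"languageCode": "en-US"},
    "audio": {"uri": "gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"},
}
request = build_recognize_request("ACCESS_TOKEN_PLACEHOLDER", body)
print(request.get_method(), request.full_url)
# To send it, call urllib.request.urlopen(request) with a real access token.
```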

Improve transcription using a `PhraseSet`

Create a PhraseSet:

curl -X POST -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets" \
  -d '{"phraseSetId": "test-phrase-set-1"}'

Example request:

{
   "phraseSetId":"test-phrase-set-1"
}

Get the PhraseSet:

curl -X GET -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1"

Add the phrases "fionity" and "ionity" to the PhraseSet and assign a boost value of 10 to each:

curl -X PATCH -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1?updateMask=phrases" \
  -d '{"phrases": [{"value": "ionity", "boost": 10}, {"value": "fionity", "boost": 10}]}'

The PhraseSet is now updated to:

{
   "phrases":[
      {
         "value":"ionity",
         "boost":10
      },
      {
         "value":"fionity",
         "boost":10
      }
   ]
}
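
The PATCH call above has two moving parts: the resource URL with its updateMask query parameter, and the body that lists the phrases. The following sketch assembles both from a mapping of phrases to boost values. The helper name is hypothetical, and the code only builds the values without contacting the service.

```python
import json
from urllib.parse import urlencode

BASE = "https://speech.googleapis.com/v1p1beta1"


def phrase_set_patch(project_id, phrase_set_id, boosts, location="global"):
    """Return the (url, body) pair for replacing the phrases of a PhraseSet."""
    name = f"projects/{project_id}/locations/{location}/phraseSets/{phrase_set_id}"
    url = f"{BASE}/{name}?{urlencode({'updateMask': 'phrases'})}"
    body = {"phrases": [{"value": v, "boost": b} for v, b in boosts.items()]}
    return url, body


url, body = phrase_set_patch(
    "project_id", "test-phrase-set-1", {"ionity": 10, "fionity": 10}
)
print(url)
print(json.dumps(body))
```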

Recognize the audio again, this time using model adaptation and the PhraseSet that you created previously. The transcription is now correct: "call me fionity and oh my gosh what do we have here ionity".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phrase_set_references": ["projects/project_id/locations/global/phraseSets/test-phrase-set-1"]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Example request:

{
   "config":{
      "adaptation":{
         "phrase_set_references":[
            "projects/project_id/locations/global/phraseSets/test-phrase-set-1"
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}
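
The only difference between this request and the first, unadapted one is the adaptation block. The sketch below makes that explicit by deriving the adapted body from the baseline body. The helper names are hypothetical, and the phrase_set_references field is taken from the request above.

```python
def phrase_set_name(project_id, phrase_set_id, location="global"):
    """Return the full resource name of a stored PhraseSet."""
    return f"projects/{project_id}/locations/{location}/phraseSets/{phrase_set_id}"


def with_phrase_sets(body, names):
    """Return a copy of a recognize body that references stored PhraseSets."""
    config = dict(body["config"], adaptation={"phrase_set_references": list(names)})
    return dict(body, config=config)


baseline = {
    "config": {"languageCode": "en-US"},
    "audio": {"uri": "gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"},
}
adapted = with_phrase_sets(baseline, [phrase_set_name("project_id", "test-phrase-set-1")])
print(adapted["config"]["adaptation"])
```

Because the helper copies the config instead of mutating it, the baseline body stays available for comparing transcripts with and without adaptation.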

Improve transcription using a `CustomClass`

Create a CustomClass:

curl -X POST -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses" \
  -d '{"customClassId": "test-custom-class-1"}'

Example request:

{
   "customClassId": "test-custom-class-1"
}

Get the CustomClass:

curl -X GET -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses/test-custom-class-1"

Recognize the test audio clip. The CustomClass is empty, so the returned transcript is still incorrect: "call me Fiona tea and oh my gosh what do we have here I own a day":

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phraseSets": [{"phrases": [{"value": "${projects/project_id/locations/global/customClasses/test-custom-class-1}", "boost": "10"}]}]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Example request:

{
   "config":{
      "adaptation":{
         "phraseSets":[
            {
               "phrases":[
                  {
                     "value":"${projects/project_id/locations/global/customClasses/test-custom-class-1}",
                     "boost":"10"
                  }
               ]
            }
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}

Add the phrases "fionity" and "ionity" to the custom class:

curl -X PATCH -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses/test-custom-class-1?updateMask=items" \
  -d '{"items": [{"value": "ionity"}, {"value": "fionity"}]}'

This updates the custom class to the following:

{
   "items":[
      {
         "value":"ionity"
      },
      {
         "value":"fionity"
      }
   ]
}

Recognize the sample audio again, this time with "fionity" and "ionity" in the CustomClass. The transcript is now correct: "call me fionity and oh my gosh what do we have here ionity".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phraseSets": [{"phrases": [{"value": "${projects/project_id/locations/global/customClasses/test-custom-class-1}", "boost": "10"}]}]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

Example request:

{
   "config":{
      "adaptation":{
         "phraseSets":[
            {
               "phrases":[
                  {
                     "value":"${projects/project_id/locations/global/customClasses/test-custom-class-1}",
                     "boost":"10"
                  }
               ]
            }
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}

Refer to a `CustomClass` in a `PhraseSet`

Update the previously created PhraseSet resource to refer to the CustomClass:

curl -X PATCH -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1?updateMask=phrases" \
  -d '{"phrases": [{"value": "${projects/project_id/locations/global/customClasses/test-custom-class-1}", "boost": 10}]}'

Example request:

{
   "phrases":[
      {
         "value":"${projects/project_id/locations/global/customClasses/test-custom-class-1}",
         "boost":10
      }
   ]
}

Recognize the audio using the PhraseSet resource (which refers to the CustomClass). The transcript is correct: "call me fionity and oh my gosh what do we have here ionity".

curl -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/speech:recognize" \
  -d '{"config": {"adaptation": {"phrase_set_references": ["projects/project_id/locations/global/phraseSets/test-phrase-set-1"]}, "languageCode": "en-US"}, "audio": {"uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"}}'

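Example request:

{
   "config":{
      "adaptation":{
         "phrase_set_references":[
            "projects/project_id/locations/global/phraseSets/test-phrase-set-1"
         ]
      },
      "languageCode":"en-US"
   },
   "audio":{
      "uri":"gs://biasing-resources-test-audio/call_me_fionity_and_ionity.wav"
   }
}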

Delete the `CustomClass` and `PhraseSet`

Delete the PhraseSet:

curl -X DELETE -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/phraseSets/test-phrase-set-1"

Delete the CustomClass:

curl -X DELETE -H "Authorization: Bearer $(gcloud auth --impersonate-service-account=$SA_EMAIL print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://speech.googleapis.com/v1p1beta1/projects/project_id/locations/global/customClasses/test-custom-class-1"

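Improve transcription using an `ABNF Grammar`

Recognize the audio using an abnf_grammar. This example refers to a CustomClass resource: projects/project_id/locations/global/customClasses/test-custom-class-1, an inline CustomClass: test-custom-class-2, a class token: ADDRESSNUM, and a PhraseSet resource: projects/project_id/locations/global/phraseSets/test-phrase-set-1. The first rule in the strings (after the external declarations) is treated as the root.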

Example request:

{
   "config":{
      "adaptation":{
         "abnf_grammar":{
            "abnf_strings": [
              "external ${projects/project_id/locations/global/phraseSets/test-phrase-set-1}" ,
              "external ${projects/project_id/locations/global/customClasses/test-custom-class-1}" ,
              "external ${test-custom-class-2}" ,
              "external $ADDRESSNUM" ,
              "$root = $test-phrase-set-1 $name lives in $ADDRESSNUM;" ,
              "$name = $title $test-custom-class-1 $test-custom-class-2;" ,
              "$title = Mr | Mrs | Miss | Dr | Prof ;"
            ]
         }
      }
   }
}
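
Because rule order matters (the first rule after the external declarations is the root), it can help to assemble abnf_strings programmatically. The following sketch is one way to do that. It assumes that every rule ends with a semicolon, as the $root and $title rules in the example do. The helper name is hypothetical, and the grammar is a trimmed version of the one above.

```python
import json


def abnf_strings(externals, rules):
    """List external declarations first, then rules; the first rule is the root."""
    declarations = [f"external {ref}" for ref in externals]
    statements = [f"{name} = {expansion};" for name, expansion in rules]
    return declarations + statements


parent = "projects/project_id/locations/global"
strings = abnf_strings(
    externals=[
        "${" + parent + "/phraseSets/test-phrase-set-1}",
        "${" + parent + "/customClasses/test-custom-class-1}",
        "$ADDRESSNUM",
    ],
    rules=[
        ("$root", "$test-phrase-set-1 $name lives in $ADDRESSNUM"),
        ("$name", "$title $test-custom-class-1"),
        ("$title", "Mr | Mrs | Miss | Dr | Prof"),
    ],
)
adaptation = {"abnf_grammar": {"abnf_strings": strings}}
print(json.dumps(adaptation, indent=2))
```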

What's next

Learn how to use model adaptation in a request to Speech-to-Text.
Review the list of supported class tokens.
