Cloud Storage ファイルの同期音声文字変換

Cloud Storage に保存されている音声ファイルの同期文字変換を行います。

このコードサンプルを含む詳細なドキュメントについては、以下をご覧ください。

短い音声ファイルの音声を文字に変換する

コードサンプル

Go

Cloud STT 用のクライアントライブラリをインストールして使用する方法については、Cloud STT クライアントライブラリをご覧ください。詳細については、Cloud STT Go API のリファレンスドキュメントをご覧ください。

Cloud STT に対する認証を行うには、アプリケーションのデフォルト認証情報を設定します。詳細については、ローカル開発環境の認証の設定をご覧ください。


func recognizeGCS(w io.Writer, gcsURI string) error {
	ctx := context.Background()

	client, err := speech.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	// Send the request with the URI (gs://...)
	// and sample rate information to be transcripted.
	resp, err := client.Recognize(ctx, &speechpb.RecognizeRequest{
		Config: &speechpb.RecognitionConfig{
			Encoding:        speechpb.RecognitionConfig_LINEAR16,
			SampleRateHertz: 16000,
			LanguageCode:    "en-US",
		},
		Audio: &speechpb.RecognitionAudio{
			AudioSource: &speechpb.RecognitionAudio_Uri{Uri: gcsURI},
		},
	})

	// Print the results.
	for _, result := range resp.Results {
		for _, alt := range result.Alternatives {
			fmt.Fprintf(w, "\"%v\" (confidence=%3f)\n", alt.Transcript, alt.Confidence)
		}
	}
	return nil
}

Java

Cloud STT 用のクライアントライブラリをインストールして使用する方法については、Cloud STT クライアントライブラリをご覧ください。詳細については、Cloud STT Java API のリファレンスドキュメントをご覧ください。

/**
 * Performs speech recognition on remote FLAC file and prints the transcription.
 *
 * @param gcsUri the path to the remote FLAC audio file to transcribe.
 */
public static void syncRecognizeGcs(String gcsUri) throws Exception {
  // Instantiates a client with GOOGLE_APPLICATION_CREDENTIALS
  try (SpeechClient speech = SpeechClient.create()) {
    // Builds the request for remote FLAC file
    RecognitionConfig config =
        RecognitionConfig.newBuilder()
            .setEncoding(AudioEncoding.FLAC)
            .setLanguageCode("en-US")
            .setSampleRateHertz(16000)
            .build();
    RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

    // Use blocking call for getting audio transcript
    RecognizeResponse response = speech.recognize(config, audio);
    List<SpeechRecognitionResult> results = response.getResultsList();

    for (SpeechRecognitionResult result : results) {
      // There can be several alternative transcripts for a given chunk of speech. Just use the
      // first (most likely) one here.
      SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
      System.out.printf("Transcription: %s%n", alternative.getTranscript());
    }
  }
}

Node.js

Cloud STT 用のクライアントライブラリをインストールして使用する方法については、Cloud STT クライアントライブラリをご覧ください。詳細については、Cloud STT Node.js API のリファレンスドキュメントをご覧ください。

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');

// Creates a client
const client = new speech.SpeechClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const gcsUri = 'gs://my-bucket/audio.raw';
// const encoding = 'Encoding of the audio file, e.g. LINEAR16';
// const sampleRateHertz = 16000;
// const languageCode = 'BCP-47 language code, e.g. en-US';

const config = {
  encoding: encoding,
  sampleRateHertz: sampleRateHertz,
  languageCode: languageCode,
};
const audio = {
  uri: gcsUri,
};

const request = {
  config: config,
  audio: audio,
};

// Detects speech in the audio file
const [response] = await client.recognize(request);
const transcription = response.results
  .map(result => result.alternatives[0].transcript)
  .join('\n');
console.log('Transcription: ', transcription);

PHP

Cloud STT 用のクライアントライブラリをインストールして使用する方法については、Cloud STT クライアントライブラリをご覧ください。

use Google\Cloud\Speech\V2\Client\SpeechClient;
use Google\Cloud\Speech\V2\RecognitionConfig;
use Google\Cloud\Speech\V2\ExplicitDecodingConfig;
use Google\Cloud\Speech\V2\ExplicitDecodingConfig\AudioEncoding;
use Google\Cloud\Speech\V2\RecognizeRequest;

/**
 * @param string $projectId The Google Cloud project ID.
 * @param string $location The location of the recognizer.
 * @param string $recognizerId The ID of the recognizer to use.
 * @param string $uri The Cloud Storage object to transcribe (gs://your-bucket-name/your-object-name)
 */
function transcribe_sync_gcs(string $projectId, string $location, string $recognizerId, string $uri)
{
    // create the speech client
    $apiEndpoint = $location === 'global' ? null : sprintf('%s-speech.googleapis.com', $location);
    $speech = new SpeechClient(['apiEndpoint' => $apiEndpoint]);

    $recognizerName = SpeechClient::recognizerName($projectId, $location, $recognizerId);

    $config = (new RecognitionConfig())
        // Can also use {@see Google\Cloud\Speech\V2\AutoDetectDecodingConfig}
        // ->setAutoDecodingConfig(new AutoDetectDecodingConfig());

        ->setExplicitDecodingConfig(new ExplicitDecodingConfig([
            // change these variables if necessary
            'encoding' => AudioEncoding::LINEAR16,
            'sample_rate_hertz' => 16000,
            'audio_channel_count' => 1,
        ]));

    $request = (new RecognizeRequest())
        ->setRecognizer($recognizerName)
        ->setConfig($config)
        ->setUri($uri);

    try {
        $response = $speech->recognize($request);
        foreach ($response->getResults() as $result) {
            $alternatives = $result->getAlternatives();
            $mostLikely = $alternatives[0];
            $transcript = $mostLikely->getTranscript();
            $confidence = $mostLikely->getConfidence();
            printf('Transcript: %s' . PHP_EOL, $transcript);
            printf('Confidence: %s' . PHP_EOL, $confidence);
        }
    } finally {
        $speech->close();
    }
}

Python

Cloud STT 用のクライアントライブラリをインストールして使用する方法については、Cloud STT クライアントライブラリをご覧ください。詳細については、Cloud STT Python API のリファレンスドキュメントをご覧ください。

def transcribe_gcs(audio_uri: str) -> speech.RecognizeResponse:
    """Transcribes the audio file specified by the gcs_uri.
    Args:
        audio_uri (str): The Google Cloud Storage URI of the input audio file.
            E.g., gs://cloud-samples-data/speech/audio.flac
    Returns:
        cloud_speech.RecognizeResponse: The response containing the transcription results
    """
    from google.cloud import speech

    client = speech.SpeechClient()

    audio = speech.RecognitionAudio(uri=audio_uri)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=16000,
        language_code="en-US",
    )

    response = client.recognize(config=config, audio=audio)

    # Each result is for a consecutive portion of the audio. Iterate through
    # them to get the transcripts for the entire audio file.
    for result in response.results:
        # The first alternative is the most likely one for this portion.
        print(f"Transcript: {result.alternatives[0].transcript}")

    return response

Ruby

Cloud STT 用のクライアントライブラリをインストールして使用する方法については、Cloud STT クライアントライブラリをご覧ください。

# storage_path = "Path to file in Cloud Storage, eg. gs://bucket/audio.raw"

require "google/cloud/speech"

speech = Google::Cloud::Speech.speech version: :v1

config = { encoding:          :LINEAR16,
           sample_rate_hertz: 16_000,
           language_code:     "en-US" }
audio  = { uri: storage_path }

response = speech.recognize config: config, audio: audio

results = response.results

alternatives = results.first.alternatives
alternatives.each do |alternative|
  puts "Transcription: #{alternative.transcript}"
end

次のステップ

他の Google Cloud プロダクトのコードサンプルを検索およびフィルタするには、Google Cloud サンプルブラウザをご覧ください。

Cloud Storage ファイルの同期音声文字変換 コレクションでコンテンツを整理 必要に応じて、コンテンツの保存と分類を行います。

もっと見る