Google uses AI technology to translate content into your preferred language. AI translations can contain errors.

Menggunakan context cache

Anda dapat menggunakan REST API atau Python SDK untuk mereferensikan konten yang disimpan dalam context cache di aplikasi AI generatif. Sebelum dapat digunakan, Anda harus membuat context cache terlebih dahulu .

Objek context cache yang Anda gunakan dalam kode Anda mencakup properti berikut:

name - Nama resource context cache. Formatnya adalah projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID. Saat Anda membuat context cache, nama resource-nya akan ditampilkan dalam respons. Nomor project adalah ID unik untuk project Anda. ID cache adalah ID untuk cache Anda. Saat menentukan context cache dalam kode, Anda harus menggunakan nama resource context cache lengkap. Berikut adalah contoh yang menunjukkan cara menentukan nama resource konten yang di-cache dalam isi permintaan:
```
"cached_content": "projects/123456789012/locations/us-central1/123456789012345678"
```
model - Nama resource untuk model yang digunakan untuk membuat cache. Formatnya adalah projects/PROJECT_NUMBER/locations/LOCATION/publishers/PUBLISHER_NAME/models/MODEL_ID.
createTime - Timestamp yang menentukan waktu pembuatan context cache.
updateTime - Timestamp yang menentukan waktu update terbaru a context cache. Setelah context cache dibuat, dan sebelum diupdate, createTime dan updateTime-nya akan sama.
expireTime - Timestamp yang menentukan kapan context cache akan berakhir masa berlakunya. expireTime default adalah 60 menit setelah createTime. Anda dapat memperbarui cache dengan waktu habis masa berlaku baru. Untuk mengetahui informasi selengkapnya, lihat Memperbarui context cache. Setelah masa berlaku cache berakhir, cache akan ditandai untuk dihapus dan Anda tidak boleh menganggap bahwa cache tersebut dapat digunakan atau diperbarui. Jika perlu menggunakan context cache yang masa berlakunya telah berakhir, Anda harus membuatnya ulang dengan waktu habis masa berlaku yang sesuai.

Batasan penggunaan context cache

Jika fitur berikut ditentukan saat Anda membuat context cache, Anda tidak boleh menentukannya lagi dalam permintaan:

Properti GenerativeModel.system_instructions. Properti ini digunakan untuk menentukan petunjuk ke model sebelum model menerima petunjuk dari pengguna. Untuk mengetahui informasi selengkapnya, lihat Petunjuk sistem.
Properti GenerativeModel.tool_config. Properti tool_config digunakan untuk menentukan alat yang digunakan oleh model Gemini, seperti alat yang digunakan oleh fitur panggilan fungsi.
Properti GenerativeModel.tools. Properti GenerativeModel.tools digunakan untuk menentukan fungsi guna membuat aplikasi panggilan fungsi. Untuk mengetahui informasi selengkapnya, lihat Panggilan fungsi.

Menggunakan contoh context cache

Berikut ini cara menggunakan context cache. Saat menggunakan context cache, Anda tidak dapat menentukan properti berikut:

GenerativeModel.system_instructions
GenerativeModel.tool_config
GenerativeModel.tools

Python

Instal

pip install --upgrade google-genai

Untuk mempelajari lebih lanjut, lihat dokumentasi referensi SDK.

Tetapkan variabel lingkungan untuk menggunakan Gen AI SDK dengan Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
# Use content cache to generate text response
# E.g cache_name = 'projects/.../locations/.../cachedContents/1111111111111111111'
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the pdfs",
    config=GenerateContentConfig(
        cached_content=cache_name,
    ),
)
print(response.text)
# Example response
#   The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
#   modalities, including image, audio, video, and text....

Go

Pelajari cara menginstal atau memperbarui Go.

Untuk mempelajari lebih lanjut, lihat dokumentasi referensi SDK.

Tetapkan variabel lingkungan untuk menggunakan Gen AI SDK dengan Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// useContentCacheWithTxt shows how to use content cache to generate text content.
func useContentCacheWithTxt(w io.Writer, cacheName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.5-flash",
		genai.Text("Summarize the pdfs"),
		&genai.GenerateContentConfig{
			CachedContent: cacheName,
		},
	)
	if err != nil {
		return fmt.Errorf("failed to use content cache to generate content: %w", err)
	}

	respText := resp.Text()

	fmt.Fprintln(w, respText)

	// Example response:
	// The provided research paper introduces Gemini 1.5 Pro, a multimodal model capable of recalling
	// and reasoning over information from very long contexts (up to 10 million tokens).  Key findings include:
	//
	// * **Long Context Performance:**
	// ...

	return nil
}

Java

Pelajari cara menginstal atau memperbarui Java.

Untuk mempelajari lebih lanjut, lihat dokumentasi referensi SDK.

Tetapkan variabel lingkungan untuk menggunakan Gen AI SDK dengan Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.HttpOptions;

public class ContentCacheUseWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    // E.g cacheName = "projects/111111111111/locations/global/cachedContents/1111111111111111111"
    String cacheName = "your-cache-name";
    contentCacheUseWithText(modelId, cacheName);
  }

  // Shows how to generate text using cached content
  public static String contentCacheUseWithText(String modelId, String cacheName) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Summarize the pdfs",
              GenerateContentConfig.builder().cachedContent(cacheName).build());

      System.out.println(response.text());
      // Example response
      // The Gemini family of multimodal models from Google DeepMind demonstrates remarkable
      // capabilities across various
      // modalities, including image, audio, video, and text....
      return response.text();
    }
  }
}

Node.js

Instal

npm install @google/genai

Untuk mempelajari lebih lanjut, lihat dokumentasi referensi SDK.

Tetapkan variabel lingkungan untuk menggunakan Gen AI SDK dengan Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function useContentCache(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION,
  cacheName = 'example-cache'
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
    httpOptions: {
      apiVersion: 'v1',
    },
  });

  const response = await client.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: 'Summarize the pdfs',
    config: {
      cachedContent: cacheName,
    },
  });

  console.log(response.text);

  return response.text;
}
// Example response
//    The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
//    modalities, including image, audio, video, and text....

REST

Anda dapat menggunakan REST untuk menggunakan context cache dengan perintah menggunakan the Agent Platform API untuk mengirim permintaan POST ke endpoint model penayang.

Sebelum menggunakan salah satu data permintaan, lakukan penggantian berikut:

PROJECT_ID: [Project ID](/resource-manager/docs/creating-managing-projects#identifiers) Anda. .
LOCATION: Region tempat permintaan untuk membuat context cache diproses.
PROMPT_TEXT: Perintah teks yang akan dikirimkan ke model.

Metode HTTP dan URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent

Meminta isi JSON:

{
  "cachedContent": "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
  "contents": [
      {"role":"user","parts":[{"text":"PROMPT_TEXT"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95,
  },
  "safetySettings": [
      {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_HARASSMENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
  ],
}

Untuk mengirim permintaan Anda, pilih salah satu opsi berikut:

curl

Catatan: Perintah berikut mengasumsikan bahwa Anda telah login ke gcloud CLI menggunakan akun pengguna Anda dengan menjalankan gcloud init atau gcloud auth login, atau dengan menggunakan Cloud Shell, yang secara otomatis membuat Anda login ke gcloud CLI. Anda dapat memeriksa akun yang saat ini aktif dengan menjalankan gcloud auth list.

Simpan isi permintaan dalam file bernama request.json, dan jalankan perintah berikut:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent"

PowerShell

Catatan: Perintah berikut mengasumsikan bahwa Anda telah login ke gcloud CLI menggunakan akun pengguna Anda dengan menjalankan gcloud init atau gcloud auth login. Anda dapat memeriksa akun yang saat ini aktif dengan menjalankan gcloud auth list.

Simpan isi permintaan dalam file bernama request.json, dan jalankan perintah berikut:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent" | Select-Object -Expand Content

Anda akan melihat respons JSON yang mirip seperti berikut:

Respons

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "MODEL_RESPONSE"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.21866937,
          "severity": "HARM_SEVERITY_NEGLIGIBLE",
          "severityScore": 0.19946389
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "MEDIUM",
          "probabilityScore": 0.6880493,
          "severity": "HARM_SEVERITY_MEDIUM",
          "severityScore": 0.43374163
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.4442634,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.37903354
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.10502681,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.28170192
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 55927,
    "candidatesTokenCount": 105,
    "totalTokenCount": 56032
  }
}

Contoh perintah curl

LOCATION="us-central1"
MODEL_ID="gemini-2.5-flash"
PROJECT_ID="test-project"

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent" -d \
'{
  "cachedContent": "projects/${PROJECT_NUMBER}/locations/${LOCATION}/cachedContents/${CACHE_ID}",
  "contents": [
      {"role":"user","parts":[{"text":"What are the benefits of exercise?"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95,
  },
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ],
}'

Langkah berikutnya

Pelajari cara memperbarui waktu habis masa berlaku context cache.
Pelajari cara membuat context cache baru.
Pelajari cara mendapatkan informasi tentang semua context cache yang terkait dengan Google Cloud project.
Pelajari cara menghapus context cache.

Menggunakan context cache Tetap teratur dengan koleksi Simpan dan kategorikan konten berdasarkan preferensi Anda.

Batasan penggunaan context cache

Menggunakan contoh context cache

Python

Instal

Go

Java

Node.js

Instal

REST

curl

PowerShell

Respons

Contoh perintah curl

Langkah berikutnya

Menggunakan context cache