Use a context cache

You can use REST APIs or the Python SDK to reference content stored in a context cache in a generative AI application. Before you can use a context cache, you must first create it.

The context cache object you use in your code includes the following properties:

name - The resource name of the context cache. Its format is projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID. When you create a context cache, its resource name appears in the response. The project number is a unique identifier for your project. The cache ID is the identifier of your cache. When you specify a context cache in your code, you must use its full resource name. The following example shows how to specify a cached content resource name in a request body:
```
"cached_content": "projects/123456789012/locations/us-central1/cachedContents/123456789012345678"
```
model - The resource name of the model used to create the cache. Its format is projects/PROJECT_NUMBER/locations/LOCATION/publishers/PUBLISHER_NAME/models/MODEL_ID.
createTime - A Timestamp that specifies when the context cache was created.
updateTime - A Timestamp that specifies when the context cache was last updated. After a context cache is created, and before it's updated, its createTime and updateTime are the same.
expireTime - A Timestamp that specifies when the context cache expires. The default expireTime is 60 minutes after the createTime. You can update the cache with a new expiration time. For more information, see Update the context cache. After a cache expires, it's marked for deletion, and you shouldn't assume that it can be used or updated. If you need to use a context cache that has expired, you must re-create it with an appropriate expiration time.
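As a rough illustration of the properties above, the following sketch splits a context cache resource name into its parts and checks whether an expireTime has passed. The helper names `parse_cache_name` and `is_expired` are hypothetical and aren't part of any SDK, and the timestamp is assumed to be an RFC 3339 string in UTC that ends in `Z`.

```python
import re
from datetime import datetime, timezone

# Pattern for the documented format:
# projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID
_CACHE_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/cachedContents/(?P<cache_id>[^/]+)$"
)


def parse_cache_name(name: str) -> dict:
    """Split a full context cache resource name into its components."""
    match = _CACHE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a full context cache resource name: {name!r}")
    return match.groupdict()


def is_expired(expire_time: str, now: datetime) -> bool:
    """Return True if the expireTime timestamp is at or before `now`."""
    # Assumption: the timestamp looks like 2024-06-01T12:00:00Z.
    expires = datetime.fromisoformat(expire_time.replace("Z", "+00:00"))
    return expires <= now


parts = parse_cache_name(
    "projects/123456789012/locations/us-central1/cachedContents/123456789012345678"
)
print(parts["cache_id"])

now = datetime(2024, 6, 1, 12, 30, tzinfo=timezone.utc)
print(is_expired("2024-06-01T13:00:00Z", now))
```

A name that omits the cachedContents segment raises a ValueError here, which is one way to catch a partial resource name before it reaches a request.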

Context cache use restrictions

You can specify the following features when you create a context cache. Don't specify them again in your request:

The GenerativeModel.system_instructions property. This property is used to give the model instructions before it receives instructions from a user. For more information, see System instructions.
The GenerativeModel.tool_config property. The tool_config property is used to specify tools used by the Gemini model, such as a tool used by the function calling feature.
The GenerativeModel.tools property. The GenerativeModel.tools property is used to specify functions for building a function calling application. For more information, see Function calling.
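One way to catch this mistake early is to inspect a request's settings before sending it. The sketch below works on a plain dictionary; the function name `check_cached_request` and the dictionary keys are illustrative assumptions that mirror the property names above, not an SDK API.

```python
# Settings that belong to the context cache and should not be repeated
# in a request that references it (key names are illustrative).
CACHE_ONLY_FIELDS = ("system_instructions", "tool_config", "tools")


def check_cached_request(config: dict) -> list:
    """Return the cache-only fields that a cached request sets again."""
    if not config.get("cached_content"):
        return []  # No cache is referenced, so nothing is duplicated.
    return [field for field in CACHE_ONLY_FIELDS if config.get(field)]


request_config = {
    "cached_content": "projects/123456789012/locations/us-central1/cachedContents/123456789012345678",
    "tools": [{"function_declarations": []}],
}
print(check_cached_request(request_config))
```

An empty list means the request doesn't repeat any of the restricted properties.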

Context cache use sample

The following sample shows how to use a context cache. When you use a context cache, you can't specify the following properties:

GenerativeModel.system_instructions
GenerativeModel.tool_config
GenerativeModel.tools

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))
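# Use content cache to generate text response
# TODO(developer): Replace with the full resource name of your context cache.
# E.g. cache_name = 'projects/.../locations/.../cachedContents/1111111111111111111'
cache_name = "your-cache-name"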
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the pdfs",
    config=GenerateContentConfig(
        cached_content=cache_name,
    ),
)
print(response.text)
# Example response
#   The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
#   modalities, including image, audio, video, and text....

Go

Learn how to install or update Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// useContentCacheWithTxt shows how to use content cache to generate text content.
func useContentCacheWithTxt(w io.Writer, cacheName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.5-flash",
		genai.Text("Summarize the pdfs"),
		&genai.GenerateContentConfig{
			CachedContent: cacheName,
		},
	)
	if err != nil {
		return fmt.Errorf("failed to use content cache to generate content: %w", err)
	}

	respText := resp.Text()

	fmt.Fprintln(w, respText)

	// Example response:
	// The provided research paper introduces Gemini 1.5 Pro, a multimodal model capable of recalling
	// and reasoning over information from very long contexts (up to 10 million tokens).  Key findings include:
	//
	// * **Long Context Performance:**
	// ...

	return nil
}

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.HttpOptions;

public class ContentCacheUseWithText {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    // E.g cacheName = "projects/111111111111/locations/global/cachedContents/1111111111111111111"
    String cacheName = "your-cache-name";
    contentCacheUseWithText(modelId, cacheName);
  }

  // Shows how to generate text using cached content
  public static String contentCacheUseWithText(String modelId, String cacheName) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Summarize the pdfs",
              GenerateContentConfig.builder().cachedContent(cacheName).build());

      System.out.println(response.text());
      // Example response
      // The Gemini family of multimodal models from Google DeepMind demonstrates remarkable
      // capabilities across various
      // modalities, including image, audio, video, and text....
      return response.text();
    }
  }
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function useContentCache(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION,
  cacheName = 'example-cache'
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
    httpOptions: {
      apiVersion: 'v1',
    },
  });

  const response = await client.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: 'Summarize the pdfs',
    config: {
      cachedContent: cacheName,
    },
  });

  console.log(response.text);

  return response.text;
}
// Example response
//    The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
//    modalities, including image, audio, video, and text....

REST

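You can use REST to use a context cache with a prompt: use the Vertex AI API to send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
PROJECT_NUMBER: Your project number, as it appears in the context cache resource name.
LOCATION: The region where the request to create the context cache was processed.
CACHE_ID: The ID of the context cache, as it appears in the context cache resource name.
PROMPT_TEXT: The text prompt to submit to the model.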

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent

Request JSON body:

{
  "cachedContent": "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
  "contents": [
      {"role":"user","parts":[{"text":"PROMPT_TEXT"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95
  },
  "safetySettings": [
      {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_HARASSMENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
  ]
}
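
The URL and body above can also be assembled programmatically. The following sketch uses only the Python standard library to build them; `build_request` is a hypothetical helper, the argument values are examples, and the code doesn't send the request or handle authentication.

```python
import json


def build_request(project_id, project_number, location, cache_id, prompt,
                  model_id="gemini-2.0-flash-001"):
    """Build the generateContent URL and JSON body for a cached request."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:generateContent"
    )
    body = {
        # The cache is referenced by its full resource name.
        "cachedContent": (
            f"projects/{project_number}/locations/{location}"
            f"/cachedContents/{cache_id}"
        ),
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"maxOutputTokens": 8192, "temperature": 1, "topP": 0.95},
    }
    return url, json.dumps(body, indent=2)


url, payload = build_request(
    "test-project", "123456789012", "us-central1", "123456789012345678",
    "What are the benefits of exercise?",
)
print(url)
print(payload)
```

Serializing with json.dumps avoids hand-written syntax slips such as trailing commas, which strict JSON parsers reject.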

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. To check the currently active account, run gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. To check the currently active account, run gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "text": "MODEL_RESPONSE"
          }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.21866937,
          "severity": "HARM_SEVERITY_NEGLIGIBLE",
          "severityScore": 0.19946389
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "MEDIUM",
          "probabilityScore": 0.6880493,
          "severity": "HARM_SEVERITY_MEDIUM",
          "severityScore": 0.43374163
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.4442634,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.37903354
        },
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE",
          "probabilityScore": 0.10502681,
          "severity": "HARM_SEVERITY_LOW",
          "severityScore": 0.28170192
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 55927,
    "candidatesTokenCount": 105,
    "totalTokenCount": 56032
  }
}
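
To pull the useful parts out of a response shaped like the one above, a small helper is enough. This sketch assumes the response has already been received as JSON text; `summarize_response` is a hypothetical name, and the sample input is a shortened copy of the response shown above.

```python
import json


def summarize_response(response: dict) -> dict:
    """Collect the generated text and token counts from a response."""
    candidate = response["candidates"][0]
    # A candidate can contain several parts; join their text fields.
    text = "".join(part.get("text", "") for part in candidate["content"]["parts"])
    usage = response.get("usageMetadata", {})
    return {
        "text": text,
        "finish_reason": candidate.get("finishReason"),
        "prompt_tokens": usage.get("promptTokenCount"),
        "total_tokens": usage.get("totalTokenCount"),
    }


raw = """
{
  "candidates": [
    {"content": {"role": "model", "parts": [{"text": "MODEL_RESPONSE"}]},
     "finishReason": "STOP"}
  ],
  "usageMetadata": {"promptTokenCount": 55927, "candidatesTokenCount": 105,
                    "totalTokenCount": 56032}
}
"""
print(summarize_response(json.loads(raw)))
```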

Example curl command

LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"
# Example values; replace with your own project number and cache ID.
PROJECT_NUMBER="123456789012"
CACHE_ID="123456789012345678"

# The heredoc is unquoted so that the shell expands the variables in the body.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent" -d @- <<EOF
{
  "cachedContent": "projects/${PROJECT_NUMBER}/locations/${LOCATION}/cachedContents/${CACHE_ID}",
  "contents": [
      {"role":"user","parts":[{"text":"What are the benefits of exercise?"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95
  },
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ]
}
EOF
