RAG quickstart

This page shows you how to use the Vertex AI SDK to run RAG Engine tasks on Gemini Enterprise Agent Platform.

You can also follow along using the notebook Intro to RAG Engine.

Required roles

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/aiplatform.user

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace the following:

  • PROJECT_ID: Your project ID.
  • USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com
  • ROLE: The IAM role that you grant to your user account.
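
If several roles need to be granted, the command above can be generated once per role. The following is a minimal sketch, assuming the gcloud CLI is installed. The helper name build_grant_commands and the sample project and email values are hypothetical and are not part of this page. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def build_grant_commands(project_id, user_identifier, roles):
    """Return one gcloud argument list per IAM role (hypothetical helper)."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=user:{user_identifier}",
            f"--role={role}",
        ]
        for role in roles
    ]


# Sample values are placeholders; replace them with your own.
commands = build_grant_commands(
    "my-project", "myemail@example.com", ["roles/aiplatform.user"]
)
for cmd in commands:
    # Print each command for review before running it.
    # To execute it, pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Keeping each command as an argument list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.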

Prepare your Google Cloud console

To use RAG Engine, do the following:

  1. Install the Agent Platform SDK for Python.

  2. Run this command in the Google Cloud console to set up your project.

    gcloud config set project PROJECT_ID

  3. Run this command to authorize your login.

    gcloud auth application-default login

Run RAG Engine

Copy and paste this sample code into the Google Cloud console to run RAG Engine.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Create a RAG Corpus, Import Files, and Generate a response

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# paths = ["https://drive.google.com/file/d/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-east4")

# Create RagCorpus
# Configure embedding model, for example "text-embedding-005".
embedding_model_config = rag.RagEmbeddingModelConfig(
    vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
        publisher_model="publishers/google/models/text-embedding-005"
    )
)

rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=embedding_model_config
    ),
)

# Import Files to the RagCorpus
rag.import_files(
    rag_corpus.name,
    paths,
    # Optional
    transformation_config=rag.TransformationConfig(
        chunking_config=rag.ChunkingConfig(
            chunk_size=512,
            chunk_overlap=100,
        ),
    ),
    max_embedding_requests_per_min=1000,  # Optional
)

# Direct context retrieval
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5),  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus.name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why it is helpful?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

# Enhance generation
# Create a RAG retrieval tool
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=rag_corpus.name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)

# Create a Gemini model instance
rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)

# Generate response
response = rag_model.generate_content("What is RAG and why it is helpful?")
print(response.text)
# Example response:
#   RAG stands for Retrieval-Augmented Generation.
#   It's a technique used in AI to enhance the quality of responses
# ...
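
The chunk_size=512 and chunk_overlap=100 settings in the import step describe overlapping windows over each file. The following is a minimal sketch of that idea, assuming chunks are measured in a flat list of tokens and that each window starts chunk_size - chunk_overlap tokens after the previous one. The function chunk_tokens is hypothetical and is not part of the Vertex AI SDK; how RAG Engine actually tokenizes and splits files may differ.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=100):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # distance between window starts
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# A synthetic 1000-token document, used only for illustration.
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print(len(chunks), [len(c) for c in chunks])  # 3 [512, 512, 176]
```

A larger overlap repeats more context between neighboring chunks, which can help retrieval when an answer spans a chunk boundary, at the cost of more chunks to embed.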

curl

  1. Create a RAG corpus.

      export LOCATION=LOCATION
      export PROJECT_ID=PROJECT_ID
      export CORPUS_DISPLAY_NAME=CORPUS_DISPLAY_NAME
    
      # CreateRagCorpus
      # Output: CreateRagCorpusOperationMetadata
      curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora" \
      -d '{
            "display_name" : "'"${CORPUS_DISPLAY_NAME}"'"
        }'
    

    For more information, see Create a RAG corpus example.

  2. Import RAG files.

      # ImportRagFiles
      # Import a single Cloud Storage file or all files in a Cloud Storage bucket.
      # Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
      export RAG_CORPUS_ID=RAG_CORPUS_ID
      export GCS_URIS=GCS_URIS
      export CHUNK_SIZE=CHUNK_SIZE
      export CHUNK_OVERLAP=CHUNK_OVERLAP
      export EMBEDDING_MODEL_QPM_RATE=EMBEDDING_MODEL_QPM_RATE
    
      # Output: ImportRagFilesOperationMetadataNumber
      # Use ListRagFiles, or import_result_sink to get the correct rag_file_id.
      curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import" \
      -d '{
        "import_rag_files_config": {
          "gcs_source": {
            "uris": ["'"${GCS_URIS}"'"]
          },
          "rag_file_chunking_config": {
            "chunk_size": '"${CHUNK_SIZE}"',
            "chunk_overlap": '"${CHUNK_OVERLAP}"'
          },
          "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
        }
      }'
    

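    For more information, see Import RAG files example.

  3. Run a RAG retrieval query. Save the JSON request body in a file named request.json, then send it with the curl command.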

      export RAG_CORPUS_RESOURCE=RAG_CORPUS_RESOURCE
      export VECTOR_DISTANCE_THRESHOLD=VECTOR_DISTANCE_THRESHOLD
      export SIMILARITY_TOP_K=SIMILARITY_TOP_K
    
      {
        "vertex_rag_store": {
          "rag_resources": {
            "rag_corpus": "RAG_CORPUS_RESOURCE"
          },
          "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
        },
        "query": {
          "text": "TEXT",
          "similarity_top_k": SIMILARITY_TOP_K
        }
      }
    
      curl -X POST \
          -H "Authorization: Bearer $(gcloud auth print-access-token)" \
          -H "Content-Type: application/json; charset=utf-8" \
          -d @request.json \
          "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"
    

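    For more information, see RAG Engine API.

  4. Generate content. Save the JSON request body in a file named request.json, then send it with the curl command.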

    {
      "contents": {
        "role": "USER",
        "parts": {
          "text": "INPUT_PROMPT"
        }
      },
      "tools": {
        "retrieval": {
          "disable_attribution": false,
          "vertex_rag_store": {
            "rag_resources": {
              "rag_corpus": "RAG_CORPUS_RESOURCE"
            },
            "similarity_top_k": SIMILARITY_TOP_K,
            "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
          }
        }
      }
    }
    
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json; charset=utf-8" \
        -d @request.json \
        "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"
    

    For more information, see RAG Engine API.
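
Hand-written request bodies are easy to get wrong (a missing comma or an unquoted string makes the JSON invalid). The following is a minimal sketch that builds the retrieval request body from step 3 with Python's json module, so that the output is always syntactically valid JSON. The helper name build_retrieve_contexts_request is hypothetical, the field names simply mirror the step 3 body on this page, and the corpus resource name is a placeholder to replace with your own.

```python
import json


def build_retrieve_contexts_request(rag_corpus, text, top_k, distance_threshold):
    """Mirror the step 3 request body as a Python dict (hypothetical helper)."""
    return {
        "vertex_rag_store": {
            "rag_resources": {"rag_corpus": rag_corpus},
            "vector_distance_threshold": distance_threshold,
        },
        "query": {"text": text, "similarity_top_k": top_k},
    }


body = build_retrieve_contexts_request(
    # Placeholder resource name; substitute your project, location, and corpus ID.
    rag_corpus="projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID",
    text="What is RAG and why it is helpful?",
    top_k=3,
    distance_threshold=0.5,
)

# Serialize the body; write this string to request.json to use it with curl.
payload = json.dumps(body, indent=2)
print(payload)
```

Passing top_k and distance_threshold as Python numbers keeps them as JSON numbers in the output rather than quoted strings.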

What's next