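Use a LangChain agent

Before you begin

This tutorial assumes that you have read and followed the instructions in:

Develop a LangChain agent: to develop agent as an instance of LangchainAgent.
User authentication: to authenticate as a user for querying the agent.
Import and initialize the SDK: to initialize the client for getting a deployed instance (if needed).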

Get an instance of an agent

To query a LangchainAgent, you first need to create a new instance or get an existing instance.

To get the LangchainAgent that corresponds to a specific resource ID:

Vertex AI SDK for Python

Run the following code:

import vertexai

client = vertexai.Client(  # For service interactions via client.agent_engines
    project="PROJECT_ID",
    location="LOCATION",
)

agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")

print(agent)

Where:

PROJECT_ID is the Google Cloud project ID under which you develop and deploy agents.
LOCATION is one of the supported regions.
RESOURCE_ID is the ID of the deployed agent, which is registered as a reasoningEngine resource.
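The three placeholders combine into the resource name that is passed as name in the code above. A minimal sketch of that assembly follows; the helper reasoning_engine_name is hypothetical (it is not part of the SDK) and only mirrors the name format shown in this section.

```python
def reasoning_engine_name(project_id, location, resource_id):
    """Build the full resource name from its three parts.

    Hypothetical helper for illustration; it follows the
    projects/.../locations/.../reasoningEngines/... format shown above.
    """
    parts = {
        "project_id": project_id,
        "location": location,
        "resource_id": resource_id,
    }
    for label, value in parts.items():
        # Reject empty parts and parts that would change the path structure.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return (
        f"projects/{project_id}/locations/{location}"
        f"/reasoningEngines/{resource_id}"
    )


# Placeholder values; replace them with your own.
name = reasoning_engine_name("PROJECT_ID", "LOCATION", "RESOURCE_ID")
print(name)
```

The resulting string can then be passed to client.agent_engines.get(name=...).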

Python requests library

Run the following code:

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
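The request above returns the agent's resource description, but the sample does not inspect it. One way to check the result is sketched below; the helper summarize_agent_resource is hypothetical, and the name and displayName fields are assumptions about the JSON body, so they are read with .get() and may come back as None.

```python
def summarize_agent_resource(response):
    """Return a small summary of a reasoningEngines GET response.

    Hypothetical helper: expects an object with raise_for_status() and
    json(), such as a requests.Response.
    """
    # Raise an exception for 4xx/5xx responses instead of parsing an error body.
    response.raise_for_status()
    resource = response.json()
    return {
        # Assumed field names; missing fields yield None.
        "name": resource.get("name"),
        "display_name": resource.get("displayName"),
    }
```

For example, print(summarize_agent_resource(response)) after the request completes.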

REST API

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID

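When you use the Vertex AI SDK for Python, the agent object corresponds to an AgentEngine class that contains the following:

an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports. See Supported operations for details.
an agent.api_client that allows for synchronous service interactions
an agent.async_api_client that allows for asynchronous service interactions

The rest of this section assumes that you have an AgentEngine instance named agent.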
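To see at a glance which operations an agent exposes, the list returned by agent.operation_schemas() can be grouped by mode. The sketch below assumes that each schema is a dict with a "name" key and an optional "api_mode" key; that shape, the helper name, and the sample data are assumptions for illustration, so verify them against the output of your own agent.

```python
from collections import defaultdict


def group_operations_by_mode(schemas):
    """Group operation names by API mode.

    Assumes each schema is a dict with a "name" key and an optional
    "api_mode" key; schemas without a name are skipped.
    """
    grouped = defaultdict(list)
    for schema in schemas:
        name = schema.get("name")
        if name is None:
            continue
        grouped[schema.get("api_mode", "")].append(name)
    return dict(grouped)


# Hypothetical sample, shaped like the assumed output of
# agent.operation_schemas().
sample_schemas = [
    {"name": "query", "api_mode": ""},
    {"name": "stream_query", "api_mode": "stream"},
]
operations = group_operations_by_mode(sample_schemas)
print(operations)
```

With a deployed agent, you would pass agent.operation_schemas() in place of sample_schemas.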

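Supported operations

The following operations are supported:

query: for getting a response to a query synchronously.
stream_query: for streaming a response to a query.

Both the query and stream_query methods support the same types of arguments:

input: the messages to be sent to the agent.
config: the configuration (if applicable) for the context of the query.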
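Because stream_query takes the same arguments as query, a streamed response can be consumed with an ordinary loop. The following is a minimal sketch under that assumption; collect_stream is a hypothetical helper, and the structure of each chunk depends on the agent, so the chunks are stored unmodified.

```python
def collect_stream(agent, message, config=None):
    """Consume agent.stream_query() and return the chunks as a list.

    Hypothetical helper: works with any object that exposes a
    stream_query(input=..., config=...) method returning an iterable.
    """
    kwargs = {"input": message}
    if config is not None:
        # Only forward config when the caller supplies one.
        kwargs["config"] = config
    chunks = []
    for chunk in agent.stream_query(**kwargs):
        chunks.append(chunk)
    return chunks
```

For example, collect_stream(agent, "What is the exchange rate from US dollars to SEK today?") gathers every chunk before returning; to display output as it arrives, process each chunk inside the loop instead.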

Query the agent

The command:

agent.query(input="What is the exchange rate from US dollars to SEK today?")

is equivalent to the following (in full form):

agent.query(input={
    "input": [ # The input is represented as a list of messages (each message as a dict)
        {
            # The role (e.g. "system", "user", "assistant", "tool")
            "role": "user",
            # The type (e.g. "text", "tool_use", "image_url", "media")
            "type": "text",
            # The rest of the message (this varies based on the type)
            "text": "What is the exchange rate from US dollars to SEK today?",
        },
    ]
})

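Roles help the model distinguish between different types of messages when it responds. When role is omitted from the input, it defaults to "user".

Role	Description
`system`	Used to tell the chat model how to behave and to provide additional context. Not supported by all chat model providers.
`user`	Represents input from a user interacting with the model, usually in the form of text or other interactive input.
`assistant`	Represents a response from the model, which can include text or a request to invoke tools.
`tool`	A message used to pass the results of a tool invocation back to the model after external data or processing has been retrieved.

The type of the message also determines how the rest of the message is interpreted (see Handle multi-modal content).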
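The relationship between the short form and the full form can be sketched as a small conversion function. This is only an illustration of the equivalence described above, not the SDK's internal conversion; the helper name to_full_form is hypothetical.

```python
def to_full_form(message, role="user"):
    """Expand a plain string into the full message-list form.

    Hypothetical helper: a string becomes a single text message with the
    given role (defaulting to "user"); any other input is returned as-is.
    """
    if isinstance(message, str):
        return {
            "input": [
                {"role": role, "type": "text", "text": message},
            ]
        }
    return message


full = to_full_form("What is the exchange rate from US dollars to SEK today?")
print(full)
```

Building the full form explicitly is mainly useful when a query mixes roles or message types, as in the multimodal examples.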

Query the agent with multi-modal content

The following agent, which forwards the input to the model and does not use any tools, illustrates how to pass multimodal inputs to an agent:

from vertexai import agent_engines

agent = agent_engines.LangchainAgent(
    model="gemini-2.0-flash",
    runnable_builder=lambda model, **kwargs: model,
)

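Multimodal messages are represented through content blocks that specify a type and the corresponding data. In general, for multimodal content, you set type to "media", file_uri to a Cloud Storage URI, and mime_type to the MIME type used to interpret the file.

Image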

agent.query(input={"input": [
    {"type": "text", "text": "Describe the attached media in 5 words!"},
    {"type": "media", "mime_type": "image/jpeg", "file_uri": "gs://cloud-samples-data/generative-ai/image/cricket.jpeg"},
]})

Video

agent.query(input={"input": [
    {"type": "text", "text": "Describe the attached media in 5 words!"},
    {"type": "media", "mime_type": "video/mp4", "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4"},
]})

Audio

agent.query(input={"input": [
    {"type": "text", "text": "Describe the attached media in 5 words!"},
    {"type": "media", "mime_type": "audio/mp3", "file_uri": "gs://cloud-samples-data/generative-ai/audio/pixel.mp3"},
]})
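The image, video, and audio queries differ only in mime_type and file_uri, so the input can be built by one function. The sketch below assumes the content-block layout shown in these examples; build_media_query is a hypothetical helper, and the gs:// check is a convenience added here, not a rule taken from the SDK.

```python
def build_media_query(prompt, file_uri, mime_type):
    """Build a query input with one text block and one media block.

    Hypothetical helper that mirrors the content-block layout used in
    the examples in this section.
    """
    # Convenience check (an assumption of this sketch): the examples here
    # all reference files in Cloud Storage.
    if not file_uri.startswith("gs://"):
        raise ValueError(f"expected a Cloud Storage URI, got {file_uri!r}")
    return {
        "input": [
            {"type": "text", "text": prompt},
            {"type": "media", "mime_type": mime_type, "file_uri": file_uri},
        ]
    }


image_query = build_media_query(
    "Describe the attached media in 5 words!",
    "gs://cloud-samples-data/generative-ai/image/cricket.jpeg",
    "image/jpeg",
)
print(image_query)
```

The result can be passed directly as agent.query(input=image_query).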

For the list of MIME types that Gemini supports, see the Gemini documentation.

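Query the agent with a runnable configuration

When querying the agent, you can also specify a config for the agent (which follows the schema of a RunnableConfig). Two common scenarios are:

Default configuration parameters:
- run_id / run_name: identifier for the run.
- tags / metadata: classifier for the run when tracing with OpenTelemetry.
Custom configuration parameters (via configurable):
- session_id: the session under which the run is happening (see Store chat history).
- thread_id: the thread under which the run is happening (see Store checkpoints).

For example: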

import uuid

run_id = uuid.uuid4()  # Generate an ID for tracking the run later.

response = agent.query(
    input="What is the exchange rate from US dollars to Swedish currency?",
    config={  # Specify the RunnableConfig here.
        "run_id": run_id,                              # Optional.
        "tags": ["config-tag"],                        # Optional.
        "metadata": {"config-key": "config-value"},    # Optional.
        "configurable": {"session_id": "SESSION_ID"}   # Optional.
    },
)

print(response)
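When several queries share the same session or thread, assembling the config in one place keeps the optional keys consistent. The following is a minimal sketch; build_config is a hypothetical helper, and it emits only the keys described in this section, omitting any that are not supplied.

```python
import uuid


def build_config(session_id=None, thread_id=None, tags=None, metadata=None):
    """Assemble a RunnableConfig-style dict for agent.query().

    Hypothetical helper: a fresh run_id is generated on every call so
    that each run can be tracked later.
    """
    config = {"run_id": uuid.uuid4()}
    if tags:
        config["tags"] = list(tags)
    if metadata:
        config["metadata"] = dict(metadata)
    # Custom parameters are nested under "configurable".
    configurable = {}
    if session_id is not None:
        configurable["session_id"] = session_id
    if thread_id is not None:
        configurable["thread_id"] = thread_id
    if configurable:
        config["configurable"] = configurable
    return config


config = build_config(session_id="SESSION_ID", tags=["config-tag"])
print(config)
```

The result can be passed as agent.query(input=..., config=config).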