
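Use a custom agent

Before you begin

This tutorial assumes that you have read and followed the instructions in:

Develop a custom agent: to develop a custom agent.
User authentication: to authenticate as a user for querying the agent.
Import and initialize the SDK: to initialize the client for getting a deployed instance (if needed).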

Get an instance of an agent

To query an agent, you first need an instance of the agent. You can either create a new instance or get an existing instance.

To get the agent that corresponds to a specific resource ID:

Vertex AI SDK for Python

Run the following code:

import vertexai

client = vertexai.Client(  # For service interactions via client.agent_engines
    project="PROJECT_ID",
    location="LOCATION",
)

agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")

print(agent)

where

PROJECT_ID is the Google Cloud project ID under which you develop and deploy agents.
LOCATION is one of the supported regions.
RESOURCE_ID is the ID of the deployed agent, which is registered as a reasoningEngine resource.
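The three placeholders combine into the resource name and the regional REST URL that appear throughout this page. The following is a minimal sketch of that composition. The helper names `agent_resource_name` and `agent_endpoint_url` and the sample values are hypothetical and are not part of the SDK.

```python
def agent_resource_name(project_id: str, location: str, resource_id: str) -> str:
    """Build the full reasoningEngines resource name from its three parts."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/reasoningEngines/{resource_id}"
    )


def agent_endpoint_url(
    project_id: str, location: str, resource_id: str, method: str = ""
) -> str:
    """Build the regional REST URL; `method` is an optional suffix such as "query"."""
    base = f"https://{location}-aiplatform.googleapis.com/v1/"
    url = base + agent_resource_name(project_id, location, resource_id)
    return f"{url}:{method}" if method else url


# Sample values are placeholders for illustration only.
name = agent_resource_name("my-project", "us-central1", "1234567890")
print(name)
print(agent_endpoint_url("my-project", "us-central1", "1234567890", "query"))
```

Building the name in one place keeps the SDK call and the REST calls pointed at the same resource.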

Python requests library

Run the following code:

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)

REST

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID

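When you use the Vertex AI SDK for Python, the agent object corresponds to an AgentEngine class that contains the following:

an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports. See Supported operations for details.
an agent.api_client that allows for synchronous service interactions
an agent.async_api_client that allows for asynchronous service interactions

The rest of this section assumes that you have an instance named agent.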

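List supported operations

When you develop the agent locally, you know which operations it supports and can call them directly. For a deployed agent, you can enumerate the operations that it supports:

Vertex AI SDK for Python

Run the following code: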

print(agent.operation_schemas())

Python requests library

Run the following code:

import json

json.loads(response.content).get("spec").get("classMethods")

REST

The operations are listed under spec.class_methods in the response to the curl request.
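The requests example reads the operations from the `classMethods` key, while the REST description refers to `spec.class_methods`. The following is a minimal sketch of a helper that extracts the list from a response body and tolerates either spelling. The function name `list_operations` and the sample body are hypothetical, and accepting both keys is a defensive assumption rather than documented behavior.

```python
import json


def list_operations(response_body):
    """Return the operation schemas found in a reasoningEngines GET response body."""
    resource = json.loads(response_body)
    spec = resource.get("spec") or {}
    # Assumption: accept both spellings that appear on this page.
    return spec.get("classMethods") or spec.get("class_methods") or []


# Hand-written sample body for illustration; a real response has more fields.
sample_body = json.dumps({
    "name": "projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    "spec": {
        "classMethods": [
            {"name": "query", "api_mode": ""},
            {"name": "stream_query", "api_mode": "stream"},
        ]
    },
}).encode("utf-8")

operations = list_operations(sample_body)
print([op["name"] for op in operations])
```

With a live response you would pass `response.content` instead of `sample_body`. Returning an empty list when the key is missing avoids the `AttributeError` that chained `.get()` calls raise when `spec` is absent.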

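The schema for each operation is a dictionary that documents one callable method of the agent. The set of supported operations depends on the framework you used to develop your agent.

As an example, the following is the schema for the query operation of a LangchainAgent: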

{'api_mode': '',
 'name': 'query',
 'description': """Queries the Agent with the given input and config.
    Args:
        input (Union[str, Mapping[str, Any]]):
            Required. The input to be passed to the Agent.
        config (langchain_core.runnables.RunnableConfig):
            Optional. The config (if any) to be used for invoking the Agent.
    Returns:
        The output of querying the Agent with the given input and config.
""",
 'parameters': {'$defs': {'RunnableConfig': {'description': 'Configuration for a Runnable.',
                                             'properties': {'configurable': {...},
                                                            'run_id': {...},
                                                            'run_name': {...},
                                                            ...},
                                             'type': 'object'}},
                'properties': {'config': {'nullable': True},
                               'input': {'anyOf': [{'type': 'string'}, {'type': 'object'}]}},
                'required': ['input'],
                'type': 'object'}}

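where

name is the name of the operation (for example, agent.query for an operation named query).
api_mode is the API mode of the operation ("" for synchronous, "stream" for streaming).
description is a description of the operation based on the method's docstring.
parameters is the schema of the input arguments in OpenAPI schema format.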
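Because parameters follows the OpenAPI schema format, you can use it to check the arguments of a call before sending a request. The following is a minimal sketch that uses an abridged copy of the query schema shown earlier. The helper names `missing_required` and `unknown_arguments` are hypothetical, and the check covers only argument names, not types.

```python
def missing_required(schema, kwargs):
    """Return required parameter names that are absent from kwargs."""
    required = schema.get("parameters", {}).get("required", [])
    return [name for name in required if name not in kwargs]


def unknown_arguments(schema, kwargs):
    """Return kwargs names that the schema does not declare as properties."""
    properties = schema.get("parameters", {}).get("properties", {})
    return [name for name in kwargs if name not in properties]


# Abridged copy of the LangchainAgent query schema shown above.
query_schema = {
    "api_mode": "",
    "name": "query",
    "parameters": {
        "properties": {
            "config": {"nullable": True},
            "input": {"anyOf": [{"type": "string"}, {"type": "object"}]},
        },
        "required": ["input"],
        "type": "object",
    },
}

print(missing_required(query_schema, {"config": {}}))
print(unknown_arguments(query_schema, {"input": "hi", "prompt": "x"}))
```

The first call reports that input is missing, and the second reports that prompt is not a declared parameter.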

Query the agent using supported operations

For custom agents, you can use any of the following query or streaming operations that you defined when developing your agent:

query
stream_query
async_query
async_stream_query

Note that certain frameworks support only specific query or streaming operations:

Framework	Supported query operations
Agent Development Kit	`async_stream_query`
LangChain	`query`, `stream_query`
LangGraph	`query`, `stream_query`
AG2	`query`
LlamaIndex	`query`

Query the agent

Query the agent using the query operation:

Vertex AI SDK for Python

agent.query(input="What is the exchange rate from US dollars to Swedish Krona today?")

Python requests library

import json

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

requests.post(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:query",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
    data=json.dumps({
        "class_method": "query",
        "input": {
            "input": "What is the exchange rate from US dollars to Swedish Krona today?"
        }
    })
)

REST

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:query -d '{
  "class_method": "query",
  "input": {
    "input": "What is the exchange rate from US dollars to Swedish Krona today?"
  }
}'
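In the request bodies above, class_method names the operation and the outer input object carries that operation's keyword arguments, which is why input appears twice. The following is a minimal sketch of a helper that builds such a body. The name `build_query_body` is hypothetical.

```python
import json


def build_query_body(class_method, **kwargs):
    """Serialize a request body: the method name plus its keyword arguments."""
    return json.dumps({"class_method": class_method, "input": kwargs})


body = build_query_body(
    "query",
    input="What is the exchange rate from US dollars to Swedish Krona today?",
)
print(body)
```

The resulting string can be passed as the `data` argument of `requests.post` or as the `-d` payload of curl.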

The query response is a string that is similar to the output of a local application test:

{"input": "What is the exchange rate from US dollars to Swedish Krona today?",
 # ...
 "output": "For 1 US dollar you will get 10.7345 Swedish Krona."}

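Stream responses from the agent

Stream a response from the agent using the stream_query operation:

Vertex AI SDK for Python

# `client` is the vertexai.Client created earlier on this page.
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")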

for response in agent.stream_query(
    input="What is the exchange rate from US dollars to Swedish Krona today?"
):
    print(response)

Python requests library

import json

from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

requests.post(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:streamQuery",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {get_identity_token()}",
    },
    data=json.dumps({
        "class_method": "stream_query",
        "input": {
            "input": "What is the exchange rate from US dollars to Swedish Krona today?"
        },
    }),
    stream=True,
)

REST

curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID:streamQuery?alt=sse" -d '{
  "class_method": "stream_query",
  "input": {
    "input": "What is the exchange rate from US dollars to Swedish Krona today?"
  }
}'

Vertex AI Agent Engine streams responses as a sequence of iteratively generated objects. For example, a set of three responses might look like the following:

{'actions': [{'tool': 'get_exchange_rate', ...}]}  # first response
{'steps': [{'action': {'tool': 'get_exchange_rate', ...}}]}  # second response
{'output': 'The exchange rate is 11.0117 SEK per USD as of 2024-12-03.'}  # final response
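When you stream over HTTP, the client has to turn the raw lines into objects like the ones above. The following is a minimal sketch of that step. It assumes that each non-empty line carries one JSON object, optionally prefixed with `data:` when server-sent events framing is requested with alt=sse. That framing is an assumption and not a documented contract, and the helper names `iter_stream_chunks` and `final_output` are hypothetical.

```python
import json


def iter_stream_chunks(lines):
    """Yield one parsed object per non-empty line of a streamed response."""
    for raw in lines:
        if not raw:
            continue  # skip blank separator or keep-alive lines
        text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if text.startswith("data:"):  # assumed SSE framing
            text = text[len("data:"):]
        text = text.strip()
        if text:
            yield json.loads(text)


def final_output(chunks):
    """Return the value of the last chunk that has an 'output' key, if any."""
    output = None
    for chunk in chunks:
        if "output" in chunk:
            output = chunk["output"]
    return output


# Hand-written sample lines that mix both framings for illustration.
sample_lines = [
    b'{"actions": [{"tool": "get_exchange_rate"}]}',
    b"",
    b'data: {"steps": [{"action": {"tool": "get_exchange_rate"}}]}',
    b'{"output": "The exchange rate is 11.0117 SEK per USD as of 2024-12-03."}',
]

chunks = list(iter_stream_chunks(sample_lines))
print(final_output(chunks))
```

With the requests example, you would pass `response.iter_lines()` in place of `sample_lines`.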

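Query the agent asynchronously

If you defined an async_query operation when developing the agent, the Vertex AI SDK for Python supports client-side asynchronous querying of the agent:

Vertex AI SDK for Python

# `client` is the vertexai.Client created earlier on this page.
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")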

response = await agent.async_query(
    input="What is the exchange rate from US dollars to Swedish Krona today?"
)
print(response)
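The await expression above runs as written only where top-level await is available, such as a notebook. In a regular script it has to sit inside a coroutine that is started with `asyncio.run`. The following is a minimal sketch of that pattern. `StubAgent` is a hypothetical stand-in that imitates only the async_query call signature so that the example runs without a deployed agent.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for a deployed agent; not part of the SDK."""

    async def async_query(self, *, input):
        await asyncio.sleep(0)  # yield control, as a network call would
        return {"input": input, "output": "stubbed answer"}


async def ask(agent, question):
    """Await a single asynchronous query."""
    return await agent.async_query(input=question)


async def ask_many(agent, questions):
    """Issue several queries concurrently and keep the results in order."""
    return await asyncio.gather(*(ask(agent, q) for q in questions))


results = asyncio.run(ask_many(StubAgent(), ["first question", "second question"]))
print([result["output"] for result in results])
```

Replacing `StubAgent()` with the deployed agent instance keeps the rest of the code unchanged, provided the agent defines async_query.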

The query response is a dictionary that is the same as the output of a local test:

{"input": "What is the exchange rate from US dollars to Swedish Krona today?",
 # ...
 "output": "For 1 US dollar you will get 10.7345 Swedish Krona."}

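Stream responses from the agent asynchronously

If you defined an async_stream_query operation when developing the agent, you can asynchronously stream a response from the agent using one of its operations (such as async_stream_query):

Vertex AI SDK for Python

# `client` is the vertexai.Client created earlier on this page.
agent = client.agent_engines.get(name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID")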

async for response in agent.async_stream_query(
    input="What is the exchange rate from US dollars to Swedish Krona today?"
):
    print(response)

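The async_stream_query operation calls the same streamQuery endpoint under the hood and asynchronously streams responses as a sequence of iteratively generated objects. For example, a set of three responses might look like the following: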

{'actions': [{'tool': 'get_exchange_rate', ...}]}  # first response
{'steps': [{'action': {'tool': 'get_exchange_rate', ...}}]}  # second response
{'output': 'The exchange rate is 11.0117 SEK per USD as of 2024-12-03.'}  # final response

The responses should be the same as those generated during local testing.
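In a script, the `async for` loop has to run inside a coroutine. The following is a minimal sketch that collects the streamed objects into a list. `StubStreamingAgent` is a hypothetical stand-in that yields abridged copies of the three sample responses so that the example runs without a deployed agent.

```python
import asyncio


class StubStreamingAgent:
    """Hypothetical stand-in that yields the three sample responses."""

    async def async_stream_query(self, *, input):
        sample = (
            {"actions": [{"tool": "get_exchange_rate"}]},
            {"steps": [{"action": {"tool": "get_exchange_rate"}}]},
            {"output": "The exchange rate is 11.0117 SEK per USD as of 2024-12-03."},
        )
        for chunk in sample:
            await asyncio.sleep(0)  # yield control between chunks
            yield chunk


async def collect(agent, question):
    """Consume the asynchronous stream and return every chunk in order."""
    chunks = []
    async for chunk in agent.async_stream_query(input=question):
        chunks.append(chunk)
    return chunks


chunks = asyncio.run(
    collect(
        StubStreamingAgent(),
        "What is the exchange rate from US dollars to Swedish Krona today?",
    )
)
print(chunks[-1]["output"])
```

Collecting into a list is convenient for tests. An interactive client would usually handle each chunk inside the loop as it arrives.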
