This tutorial demonstrates how you can use Memory Bank with ADK to manage long-term memories. After you configure your Agent Development Kit (ADK) agent to use Memory Bank, your agent orchestrates calls to Memory Bank to manage long-term memories for you.
Using Memory Bank with ADK involves the following steps:
Create your ADK agent and runner. ADK runners connect your agent to services that provide session and memory management.
Interact with your agent to dynamically generate long-term memories that are accessible across sessions.
To make calls directly to Memory Bank without ADK orchestration, see Quickstart with Agent Engine SDK. Using the Agent Engine SDK is helpful for understanding how Memory Bank generates memories or for inspecting the contents of Memory Bank.
Manage memories with ADK memory service and Memory Bank
VertexAiMemoryBankService is an ADK wrapper around Memory Bank that implements ADK's BaseMemoryService interface. You can define callbacks and tools that interact with the memory service to read and write memories.
The VertexAiMemoryBankService interface includes:
- memory_service.add_session_to_memory triggers a GenerateMemories request to Memory Bank using all of the events in the provided adk.Session as the source content. You can orchestrate calls to this method using callback_context.add_session_to_memory in your callbacks.

from google.adk.agents.callback_context import CallbackContext

async def add_session_to_memory_callback(callback_context: CallbackContext):
    await callback_context.add_session_to_memory()
    return None

- memory_service.add_events_to_memory triggers a GenerateMemories request to Memory Bank using a subset of events. You can orchestrate calls to this method using callback_context.add_events_to_memory in your callbacks.

from google.adk.agents.callback_context import CallbackContext

async def add_events_to_memory_callback(callback_context: CallbackContext):
    await callback_context.add_events_to_memory(
        events=callback_context.session.events[-5:-1])
    return None

- memory_service.search_memory triggers a RetrieveMemories request to Memory Bank to fetch relevant memories for the current user_id and app_name. You can orchestrate calls to this method using built-in memory tools (LoadMemoryTool or PreloadMemoryTool) or a custom tool that invokes tool_context.search_memory.
Before you begin
To complete the steps demonstrated in this tutorial, you must first follow the steps in the Getting started section of the Set up Memory Bank page.
Set environment variables
To use ADK, set your environment variables:
import os
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "TRUE"
os.environ["GOOGLE_CLOUD_PROJECT"] = "PROJECT_ID"
os.environ["GOOGLE_CLOUD_LOCATION"] = "LOCATION"
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
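A common slip at this step is leaving a placeholder such as PROJECT_ID in place. The following is a minimal sketch of a pre-flight check, not part of ADK; the helper name find_unset_variables and the PLACEHOLDERS set are hypothetical and only cover the three variables named above.

```python
import os

# Variables this tutorial sets before using ADK.
REQUIRED_VARS = (
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
)
# Placeholder strings from the tutorial that must be replaced with real values.
PLACEHOLDERS = {"PROJECT_ID", "LOCATION"}


def find_unset_variables(environ=None):
    """Return the required variables that are missing, empty, or placeholders."""
    environ = os.environ if environ is None else environ
    problems = []
    for name in REQUIRED_VARS:
        value = environ.get(name, "")
        if not value or value in PLACEHOLDERS:
            problems.append(name)
    return problems


if __name__ == "__main__":
    unset = find_unset_variables()
    if unset:
        print("Set these variables before running the agent:", ", ".join(unset))
```

Passing a plain dictionary instead of os.environ makes the check easy to try without touching your real environment.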
Create your ADK agent
To create a memory-enabled agent, set up tools and callbacks that orchestrate calls to your memory service.
Define a memory generation callback
To orchestrate calls for memory generation, create a callback function that triggers memory generation. You can either send a subset of events (with callback_context.add_events_to_memory) or all of the events in a session (with callback_context.add_session_to_memory) to be processed in the background:
from google.adk.agents.callback_context import CallbackContext
async def generate_memories_callback(callback_context: CallbackContext):
    # Use one of the two options below.
    # Option 1 (Recommended): Send events to Memory Bank for memory generation,
    # which is ideal for incremental processing of events.
    await callback_context.add_events_to_memory(
        events=callback_context.session.events[-5:-1])
    # Option 2: Send the full session to Memory Bank for memory generation.
    # It's recommended to only call this at the end of a session to minimize
    # how many times a single event is re-processed.
    # await callback_context.add_session_to_memory()
    return None
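The slice events[-5:-1] in the callback is plain Python list slicing: it selects up to four events and stops before the most recent one. The sketch below illustrates that behavior with strings standing in for ADK events; the helper recent_window and its include_latest flag are hypothetical, shown only in case you want the latest event included as well.

```python
def recent_window(events, size=5, include_latest=False):
    """Return the tail of `events`, optionally dropping the most recent item.

    With the defaults this matches `events[-5:-1]`.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    tail = events[-size:]
    return tail if include_latest else tail[:-1]


# Strings stand in for ADK events in this illustration.
events = [f"event_{i}" for i in range(1, 9)]  # event_1 .. event_8

print(events[-5:-1])                               # four events, event_8 excluded
print(recent_window(events))                       # same selection as above
print(recent_window(events, include_latest=True))  # five events, event_8 included
```

Whether the latest event should be part of the window depends on what your callback is meant to capture, so check the selection against your own event stream.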
Define a memory retrieval tool
When developing your ADK agent, include a memory tool that controls when the agent retrieves memories and how memories are included in the prompt.
If you use PreloadMemoryTool, your agent will retrieve memories at the start of each turn and include the retrieved memories in the system instruction, which is good for establishing baseline context about the user. If you use LoadMemoryTool, the model will call this tool when it decides that memories are necessary to answer the user query.
from google import adk
from google.adk.tools.load_memory_tool import LoadMemoryTool
from google.adk.tools.preload_memory_tool import PreloadMemoryTool
memory_retrieval_tools = [
    # Option 1: Retrieve memories at the start of every turn.
    PreloadMemoryTool(),
    # Option 2: Retrieve memories via tool calls. The model will only call this tool
    # when it decides that memories are necessary to respond to the user query.
    LoadMemoryTool()
]

agent = adk.Agent(
    model="gemini-2.5-flash",
    name='stateful_agent',
    instruction="""You are a Vehicle Voice Agent, designed to assist users with information and in-vehicle actions.
1. **Direct Action:** If a user requests a specific vehicle function (e.g., "turn on the AC"), execute it immediately using the corresponding tool. You don't have the outcome of the actual tool execution, so provide a hypothetical tool execution outcome.
2. **Information Retrieval:** Respond concisely to general information requests with your own knowledge (e.g., restaurant recommendation).
3. **Clarity:** When necessary, try to seek clarification to better understand the user's needs and preference before taking an action.
4. **Brevity:** Limit responses to under 30 words.
""",
    tools=memory_retrieval_tools,
    after_agent_callback=generate_memories_callback
)
Alternatively, you can create your own custom tool to retrieve memories, which is helpful when you want to give your agent instructions about when to retrieve memories:
from google import adk
from google.adk.tools import ToolContext, FunctionTool
async def search_memories(query: str, tool_context: ToolContext):
    """Query this tool when you need to fetch information about user preferences."""
    return await tool_context.search_memory(query)

agent = adk.Agent(
    model="gemini-2.5-flash",
    name='stateful_agent',
    instruction="""...""",
    tools=[FunctionTool(func=search_memories)],
    after_agent_callback=generate_memories_callback
)
Define an ADK Memory Bank memory service and runtime
After you've created your memory-enabled agent, you need to link it to a memory service. How you configure your ADK memory service depends on the runtime where your ADK agent runs. The runtime orchestrates the execution of your agents, tools, and callbacks.
Create an Agent Engine instance
You first need to create an Agent Engine instance to use for Memory Bank. This step is optional if you're using Agent Engine Runtime to deploy your agent. For more information on customizing your Memory Bank behavior, see the Configure your Agent Engine instance for Memory Bank section on the Set up Memory Bank page.
import vertexai
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION"
)
# If you don't have an Agent Engine instance already, create an Agent Engine
# Memory Bank instance using the default configuration.
agent_engine = client.agent_engines.create()
# Optionally, print out the Agent Engine resource name. You will need the
# resource name if you want to interact with your Agent Engine instance later on.
print(agent_engine.api_resource.name)
agent_engine_id = agent_engine.api_resource.name.split("/")[-1]
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
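The split("/")[-1] expression above takes the last path segment of the resource name. If you store resource names and want a clearer error when one is malformed, a stricter parser can help. This is a minimal sketch, assuming resource names always have the six-segment form shown on this page (projects/.../locations/.../reasoningEngines/...); the function name is hypothetical.

```python
def agent_engine_id_from_resource_name(resource_name: str) -> str:
    """Extract the Agent Engine ID from a full resource name.

    Assumes the form:
    projects/<project>/locations/<location>/reasoningEngines/<id>
    """
    parts = resource_name.strip("/").split("/")
    well_formed = (
        len(parts) == 6
        and parts[0] == "projects"
        and parts[2] == "locations"
        and parts[4] == "reasoningEngines"
        and all(parts)  # no empty segments
    )
    if not well_formed:
        raise ValueError(f"Unexpected Agent Engine resource name: {resource_name!r}")
    return parts[5]


print(agent_engine_id_from_resource_name(
    "projects/my-project/locations/us-central1/reasoningEngines/456"))  # 456
```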
Create an ADK runtime
Pass the Agent Engine ID to the runtime or deployment scripts so that your agent uses Memory Bank as the ADK memory service.
Local runner
adk.Runner is generally used in a local environment, like Colab. In this case, you need to directly create the memory service and runner.
import asyncio
from google.adk.memory import VertexAiMemoryBankService
from google.adk.sessions import VertexAiSessionService
from google.genai import types
memory_service = VertexAiMemoryBankService(
    project="PROJECT_ID",
    location="LOCATION",
    agent_engine_id="AGENT_ENGINE_ID",
)

# You can use any ADK session service. This example uses Agent Engine Sessions.
session_service = VertexAiSessionService(
    project="PROJECT_ID",
    location="LOCATION",
    agent_engine_id="AGENT_ENGINE_ID",
)

runner = adk.Runner(
    agent=agent,
    app_name="APP_NAME",
    session_service=session_service,
    memory_service=memory_service
)

async def call_agent(query, session_id, user_id):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    events = runner.run_async(
        user_id=user_id, session_id=session_id, new_message=content)
    # Print only the final response for the turn.
    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
- APP_NAME: ADK app name. The app name will be included in the generated memories' scope dictionary so that memories are isolated across both users and apps.
- AGENT_ENGINE_ID: The Agent Engine ID to use for Memory Bank and Sessions. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
Agent Engine
The Agent Engine ADK template (AdkApp) can be used both locally and to deploy an ADK agent to Agent Engine Runtime. When deployed on Agent Engine Runtime, the Agent Engine ADK template uses VertexAiMemoryBankService as the default memory service, using the same Agent Engine instance for Memory Bank as the Agent Engine Runtime. So, you can create your Memory Bank instance and deploy to a runtime in a single step.
See Configure Agent Engine for more details on setting up your Agent Engine Runtime, including how to customize the behavior of your Memory Bank.
Use the following code to deploy your memory-enabled ADK agent to Agent Engine Runtime:
import asyncio
import vertexai
from vertexai.agent_engines import AdkApp
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION"
)

adk_app = AdkApp(agent=agent)

# Create a new Agent Engine with your agent deployed to Agent Engine Runtime.
# The Agent Engine instance will also include an empty Memory Bank.
agent_engine = client.agent_engines.create(
    agent_engine=adk_app,
    config={
        "staging_bucket": "STAGING_BUCKET",
        "requirements": ["google-cloud-aiplatform[agent_engines,adk]"]
    }
)

# Alternatively, update an existing Agent Engine to deploy your agent to Agent Engine Runtime.
# Your agent will have access to the Agent Engine instance's existing memories.
agent_engine = client.agent_engines.update(
    name=agent_engine.api_resource.name,
    agent_engine=adk_app,
    config={
        "staging_bucket": "STAGING_BUCKET",
        "requirements": ["google-cloud-aiplatform[agent_engines,adk]"]
    }
)

async def call_agent(query, session_id, user_id):
    async for event in agent_engine.async_stream_query(
        user_id=user_id,
        session_id=session_id,
        message=query,
    ):
        print(event)
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
- STAGING_BUCKET: Your Cloud Storage bucket to use for staging your Agent Engine Runtime.
When run locally, the ADK template uses InMemoryMemoryService as the default memory service. However, you can override the default memory service to use VertexAiMemoryBankService:
from google.adk.memory import VertexAiMemoryBankService
from vertexai.agent_engines import AdkApp

def memory_bank_service_builder():
    return VertexAiMemoryBankService(
        project="PROJECT_ID",
        location="LOCATION",
        agent_engine_id="AGENT_ENGINE_ID"
    )

adk_app = AdkApp(
    agent=agent,
    # Override the default memory service.
    memory_service_builder=memory_bank_service_builder
)

async def call_agent(query, session_id, user_id):
    # adk_app is a local agent. If you want to deploy it to Agent Engine Runtime,
    # use `client.agent_engines.create(...)` or `client.agent_engines.update(...)`
    # and call the returned Agent Engine instance instead.
    async for event in adk_app.async_stream_query(
        user_id=user_id,
        session_id=session_id,
        message=query,
    ):
        print(event)
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
- AGENT_ENGINE_ID: The Agent Engine ID to use for Memory Bank. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
Cloud Run
To deploy your agent to Cloud Run, follow the Cloud Run deployment instructions in the ADK documentation and pass your Agent Engine ID in the --memory_service_uri flag:
adk deploy cloud_run \
...
--memory_service_uri=agentengine://AGENT_ENGINE_ID
GKE
To deploy your agent to Google Kubernetes Engine (GKE), follow the GKE deployment instructions in the ADK documentation and pass your Agent Engine ID in the --memory_service_uri flag:
adk deploy gke \
...
--memory_service_uri=agentengine://AGENT_ENGINE_ID
ADK Web
The ADK web interface lets you test your agents directly in the browser.
export GOOGLE_CLOUD_PROJECT="PROJECT_ID"
export GOOGLE_CLOUD_LOCATION="LOCATION"
adk web --memory_service_uri=agentengine://AGENT_ENGINE_ID
Replace the following:
- PROJECT_ID: Your project ID.
- LOCATION: Your region. See the supported regions for Memory Bank.
- AGENT_ENGINE_ID: The Agent Engine ID to use for Memory Bank. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
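The Cloud Run, GKE, and ADK Web commands above all take the same flag value, agentengine://AGENT_ENGINE_ID. If you launch ADK Web from a script, you could assemble the command as shown in this minimal sketch. The helper adk_web_command is hypothetical, and it assumes you pass the bare ID (for example 456) rather than the full resource name.

```python
import shlex
from typing import List


def adk_web_command(agent_engine_id: str) -> List[str]:
    """Build the argument list for `adk web` with Memory Bank as memory service."""
    if not agent_engine_id or "/" in agent_engine_id:
        # This sketch expects the bare ID, not projects/.../reasoningEngines/...
        raise ValueError("Pass the bare Agent Engine ID, for example '456'")
    return ["adk", "web", f"--memory_service_uri=agentengine://{agent_engine_id}"]


# Print a copy-pasteable shell command.
print(shlex.join(adk_web_command("456")))
# adk web --memory_service_uri=agentengine://456
```

An argument list like this can be handed to subprocess.run without going through a shell, which avoids quoting problems.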
Interact with your agent
After defining your agent and setting up Memory Bank, you can interact with your agent. If you provided a callback to trigger memory generation when initializing your agent, memory generation is triggered every time the agent is invoked.
Memories will be stored using the scope {"user_id": USER_ID, "app_name": APP_NAME} corresponding to the user ID and app name used to execute your agent.
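To see why the scope matters, the following toy model keys memories by the same two fields. It is only an illustration of scope-based isolation, not how Memory Bank is implemented; the class ToyMemoryStore and the helper scope_key are hypothetical.

```python
from collections import defaultdict


def scope_key(user_id, app_name):
    """Turn a scope dictionary into a hashable key."""
    scope = {"user_id": user_id, "app_name": app_name}
    return tuple(sorted(scope.items()))


class ToyMemoryStore:
    """In-memory stand-in that isolates memories by scope."""

    def __init__(self):
        self._memories = defaultdict(list)

    def add(self, user_id, app_name, fact):
        self._memories[scope_key(user_id, app_name)].append(fact)

    def get(self, user_id, app_name):
        # .get avoids creating empty entries for scopes that were never written.
        return list(self._memories.get(scope_key(user_id, app_name), []))


store = ToyMemoryStore()
store.add("alice", "car_app", "I like the temperature 71 degrees")

print(store.get("alice", "car_app"))   # the stored memory
print(store.get("alice", "home_app"))  # [] (same user, different app)
print(store.get("bob", "car_app"))     # [] (same app, different user)
```

In this model, a memory written for one user and app is invisible to any other combination, which mirrors the isolation described above.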
The method of interacting with your agent depends on its execution environment:
Local runner
# Use `asyncio.run(session_service.create_session(...))` if you're running this
# code as a standard Python script.
session = await session_service.create_session(
    app_name="APP_NAME",
    user_id="USER_ID"
)

# Use `asyncio.run(call_agent(...))` if you're running this code as a
# standard Python script.
await call_agent(
    "Can you fix the temperature?",
    session.id,
    "USER_ID"
)
Replace the following:
- APP_NAME: App name for your runner.
- USER_ID: An identifier for your user. Memories generated from this session are keyed by this opaque identifier. The generated memories' scope is stored as {"user_id": "USER_ID"}.
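Top-level await works in notebooks such as Colab, but a standard Python script needs an event loop. One way to arrange this is a single main coroutine passed to asyncio.run, so that session creation and the agent call share one loop. The sketch below shows the pattern with hypothetical stand-ins (create_session_stub and call_agent_stub) in place of the real session service and call_agent function, so it runs without Google Cloud credentials.

```python
import asyncio


async def create_session_stub(app_name, user_id):
    # Hypothetical stand-in for session_service.create_session(...).
    return {"id": "session-1", "app_name": app_name, "user_id": user_id}


async def call_agent_stub(query, session_id, user_id):
    # Hypothetical stand-in for call_agent(...); it echoes instead of calling a model.
    return f"[{user_id}/{session_id}] {query}"


async def main():
    # Both awaits run inside the same event loop.
    session = await create_session_stub("APP_NAME", "USER_ID")
    return await call_agent_stub(
        "Can you fix the temperature?", session["id"], "USER_ID")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a real script, the two stubs would be replaced by the session service call and the call_agent function defined earlier.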
Agent Engine
When using the Agent Engine ADK template, you can call your Agent Engine Runtime to interact with memory and sessions.
# Use `asyncio.run(agent_engine.async_create_session(...))` if you're
# running this code as a standard Python script.
session = await agent_engine.async_create_session(user_id="USER_ID")
# Use `asyncio.run(call_agent(...))` if you're running this code as a
# standard Python script.
await call_agent(
    "Can you fix the temperature?",
    session.get("id"),
    "USER_ID"
)
Replace the following:
- USER_ID: An identifier for your user. Memories generated from this session are keyed by this opaque identifier. The generated memories' scope is stored as {"user_id": "USER_ID"}.
Cloud Run
Refer to the Testing your agent section of the ADK Cloud Run deployment documentation.
GKE
Refer to the Testing your agent section of the ADK GKE deployment documentation.
ADK Web
To use ADK Web, navigate to the local server at http://localhost:8000.
By default, ADK Web will set the user ID to user. To override the default user ID, include userId in the query parameters, like http://localhost:8000?userId=YOUR_USER_ID.
For more information, refer to the ADK Web page in the ADK documentation.
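If your user IDs contain characters such as spaces or @, they need to be URL-encoded in the userId query parameter. This is a minimal sketch using the standard library; the helper name adk_web_url is hypothetical, and the base URL is the default local address mentioned above.

```python
from urllib.parse import urlencode


def adk_web_url(user_id, base="http://localhost:8000"):
    """Return the ADK Web URL with the userId query parameter set."""
    return f"{base}?{urlencode({'userId': user_id})}"


print(adk_web_url("alice"))            # http://localhost:8000?userId=alice
print(adk_web_url("a b@example.com"))  # http://localhost:8000?userId=a+b%40example.com
```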
Example interaction
First session
If you used the PreloadMemoryTool, the agent will try to retrieve memories at the beginning of each turn to access preferences the user previously communicated to the agent. During the agent's first interaction with the user, there are no available memories to be retrieved. So, the agent doesn't know any user preferences, such as their preferred temperature, as shown in the following example:
First turn:
User: Can you fix the temperature?
(Tool Call): ADK tries to fetch memories; no memories are available.
Model: What temperature do you prefer?
(Callback): ADK triggers memory generation. No memories are extracted.
Second turn:
User: I'm comfortable at 71 degrees.
(Tool Call): ADK tries to fetch memories; no memories are available.
Model: Ok, I've updated the temperature to 71 degrees.
(Callback): ADK triggers memory generation. Memory "I like the temperature 71 degrees" is created.
Second session
The extracted memory will be available for the next session for the same app name and user ID. If the user provides similar or contradictory information to existing memories, the new information will be consolidated with the existing memories.
First turn:
User: Fix the temperature. It's so uncomfortable!
(Tool Call): ADK tries to fetch memories. Memory "I like the temperature 71 degrees" is retrieved.
Model: Ok, I've updated the temperature to 71 degrees.
(Callback): ADK triggers memory generation. No memories are extracted, because the user didn't share anything meaningful to persist.
Second turn:
User: Actually, I prefer it to be warmer in the mornings.
(Tool Call): ADK tries to fetch memories. Memory "I like the temperature 71 degrees" is retrieved.
Model: Ok, I've made the temperature warmer.
(Callback): ADK triggers memory generation. The existing memory "I like the temperature 71 degrees" is updated to "I generally like the temperature to be 71 degrees, but I like it to be warmer in the mornings".
Clean up
To clean up all resources used in this tutorial, you can delete the Google Cloud project that you used for it.
Otherwise, you can delete the individual resources you created in this tutorial, as follows:
Use the following code sample to delete the Vertex AI Agent Engine instance, which also deletes any Sessions or Memories belonging to that Vertex AI Agent Engine.
agent_engine.delete(force=True)
Delete any locally created files.