Access and search data stored in Elasticsearch by creating a foreign data wrapper (FDW) and foreign table in AlloyDB for PostgreSQL.
Limitations
Before connecting AlloyDB to Elasticsearch, acknowledge the following limitations:
Elasticsearch integration is only available on PostgreSQL major version
17and up.AlloyDB reads, but doesn't write to, Elasticsearch data.
You are responsible for synchronizing data between AlloyDB and Elasticsearch.
Specialized Elasticsearch types, such as
geo_pointare not supported. For the full list of supported data types, see Supported data types.
Before you begin
Before you begin, ensure that you've completed the following:
Enable outbound connectivity on your primary AlloyDB instance.
Create a read-only Personal/User API key that AlloyDB can use to access your Elasticsearch cluster.
Store Elasticsearch API key in Secret Manager
AlloyDB stores and reads your Elasticsearch API key from Secret Manager. For more information on how to use Secret Manager, see Create and access a secret using Secret Manager.
Ensure that you give your AlloyDB service account permission to read the secret. For more information, see Create and access a secret using Secret Manager.
Enable and configure external_search_fdw extension
To start your integration with Elasticsearch, complete the following
instructions to enable and configure the
external_search_fdw
AlloyDB extension:
Enable the
external_search_fdwextension.CREATE EXTENSION external_search_fdw;Configure access to your Elasticsearch cluster through a foreign data server.
CREATE SERVER ELASTICSEARCH_SERVER_NAME FOREIGN DATA WRAPPER external_search_fdw OPTIONS (server 'ELASTICSEARCH_SERVER_HOST_PORT', search_provider 'elastic', auth_mode 'secret_manager', auth_method 'AUTH_METHOD', secret_path 'SECRET_PATH', max_deadline_ms 'MAX_DEADLINE', pagination_num_results 'PAGINATION_NUM_RESULTS', pagination_context_timeout_ms 'PAGINATION_CONTEXT_TIMEOUT');Replace the following variables:
ELASTICSEARCH_SERVER_NAME: name for your foreign data server. For example,my-elasticsearch-server.ELASTICSEARCH_SERVER_HOST_PORT: Public-facing URL for your Elasticsearch cluster. For example,https://node1.elastic.test.com:9200.AUTH_METHOD: type of authentication to use. You can choose between the following options:ApiKey: Elasticsearch Personal/User API key.Basic: Elasticsearch username and password.
SECRET_PATH: Secret Manager path to your Elasticsearch authentication credentials. For example,projects/123456789012/secrets/apikey/versions/1.123456789012represents your Google Cloud project ID.(Optional)
MAX_DEADLINE: maximum amount of time, in milliseconds, that AlloyDB waits for a response from Elasticsearch. You should set this value based on the locations of your AlloyDB and Elasticsearch instances. The default value is10000.(Optional)
PAGINATION_NUM_RESULTS: max number of results fetched per batch from Elasticsearch. If more results are requested, AlloyDB retrieves the results in multiple batches of this size. The default value is32.(Optional)
PAGINATION_CONTEXT_TIMEOUT: amount of time, in milliseconds, that Elasticsearch keeps the pagination request context active. The default value is30000.
Define the PostgreSQL user mapping for the Elasticsearch server. Note that PostgreSQL FDWs require this user mapping to function. AlloyDB authenticates using the REST authorization header.
CREATE USER MAPPING FOR CURRENT_USER SERVER ELASTICSEARCH_SERVER_NAME;Configure the schema for your Elasticsearch data through a foreign data table.
CREATE FOREIGN TABLE ELASTICSEARCH_FD_TABLE( metadata external_search_fdw_schema.OpaqueMetadata, ELASTICSEARCH_FIELDS) SERVER ELASTICSEARCH_SERVER_NAME OPTIONS(remote_table_name 'ELASTICSEARCH_INDEX_NAME');Replace the following new variables:
ELASTICSEARCH_FD_TABLE: name of the foreign data table that represents your Elasticsearch table. For example,my-fd-elasticsearch-table.ELASTICSEARCH_FIELDS: comma-separated list of Elasticsearch field schema definitions in the following format:elasticsearch_field_name PG_DATA_TYPE. For example,elasticsearch_boolean_field_name BOOLEAN, elasticsearch_double_field_name DOUBLE PRECISION. These fields must match the field names in Elasticsearch unless theremote_field_nameoption is appended. For example,elasticsearch_foo OPTIONS (remote_field_name 'elasticsearch_FOO').For the list of Elasticsearch data types that can be defined for AlloyDB, see Supported data types.
ELASTICSEARCH_INDEX_NAME: name of your Elasticsearch index. For example,my-elasticsearch-index.
Supported data types
AlloyDB supports the following Elasticsearch data types:
| Data type(s) | PostgreSQL type |
|---|---|
alias
|
PostgreSQL type for the field that alias is referencing
|
binary
|
bytea
|
boolean
|
BOOLEAN
|
|
|
SMALLINT
|
date
|
TIMESTAMPTZ
|
DOUBLE PRECISION
|
|
REAL
|
|
integer
|
INTEGER
|
long
|
BIGINT
|
jsonb
|
|
|
|
TEXT
|
unsigned_long
|
NUMERIC
|
Query your Elasticsearch data
AlloyDB takes SQL queries and converts them to
Elasticsearch REST API queries. During this conversion,
AlloyDB attempts to push down as much query logic as
possible without changing the query's identity, including the SQL query's
LIMIT. However, there are cases where you might specify not to push down
certain Elasticsearch fields or where query logic cannot be pushed down. For
example, LIKE and
other text-matching operators cannot be pushed down. For more examples of what
can and can't be pushed down, see Pushdown examples.
In scenarios where the LIMIT is set higher than pagination_num_results or
where LIMIT is not specified or cannot be pushed down,
AlloyDB uses the
Scroll API,
which can be resource-intensive.
Because the Scroll API can be resource-intensive, we recommend examining your
queries using EXPLAIN VERBOSE to see which APIs are used. Limiting the use of
the Scroll API and using LIMIT improves performance.
To query your Elasticsearch data, you have the following options:
- Standard SQL queries
- Query DSL
- Hybrid searches
Standard SQL queries
Standard SQL queries can be written using Elasticsearch's Lucene syntax.
To perform a standard SQL query, see the following example query:
SELECT id, body
FROM ELASTICSEARCH_FD_TABLE
WHERE FILTER
ORDER BY metadata <@> 'QUERY';
Replace the following variables:
ELASTICSEARCH_FD_TABLE: name of the foreign data table that represents your Elasticsearch table. For example,my-fd-elasticsearch-table.(Optional)
FILTER: filter to apply to your Elasticsearch query. For example,AND qubits < 105.QUERY: query to send to Elasticsearch. For a few example queries, see the following list:body:quantum body:computingbody:(quantum computing)body:(quantum AND computing)body:"quantum computing"body:"quantum computing" AND qubits:[* TO 105}
Query DSL
Query DSL is Elasticsearch's full-featured, JSON-style query language recommended for advanced use cases. Query DSL lets you perform complex searches, filtering, and aggregations that can't be expressed in SQL query syntax.
To perform queries using Query DSL, see the following example query:
SELECT id, body
FROM ELASTICSEARCH_FD_TABLE
ORDER BY
metadata <@> $${
"query": {
"bool": {
"must": [
{
"query_string": {
"query" : "QUERY"
}
}
],
"filter": [
{
"range": {
"id": {
"lt": "10"
}
}
}
]
}
},
"sort": [
{
"id": {
"order": "desc"
}
}
]
}$$
LIMIT 1;
Replace the following variables:
ELASTICSEARCH_FD_TABLE: name of the foreign data table that represents your Elasticsearch table. For example,my-fd-elasticsearch-table.QUERY: query to send to Elasticsearch. For example,"elasticsearch_field_name:\"quantum computing\" OR int_field:[* TO 3]".
Note that for Query DSL, you're only expected to propagate the query,
filter, and sort expressions.
Hybrid searches
To perform a hybrid search on your Elasticsearch data, see the following example search:
SELECT *
FROM
ai.hybrid_search(
ARRAY[
'{"limit": LIMIT,
"data_type": "external_search_fdw",
"weight": WEIGHT,
"table_name": "ELASTICSEARCH_FD_TABLE",
"key_column": "DOCUMENT_ID_COLUMN_NAME",
"query_text_input": QUERY}'::jsonb],
NULL::TEXT,
'RRF',
FALSE)
ORDER BY score DESC;
Replace the following variables:
LIMIT: number of results to return. For example,3.WEIGHT: contribution of this search entry to the overall Reciprocal Rank Fusion (RRF).ELASTICSEARCH_FD_TABLE: name of the foreign data table that represents your Elasticsearch table. For example,my-fd-elasticsearch-table.DOCUMENT_ID_COLUMN_NAME: name of the document ID column.QUERY: query to send to Elasticsearch. For example,"elasticsearch_field_name:\"quantum computing\""searches for the phrase "quantum computing" in theelasticsearch_field_namefield. All query types mentioned in Supported data types can be used in your query.
For more information on the parameters available for hybrid searches, see Hybrid search function parameters.
Pushdown examples
To make queries more efficient, AlloyDB attempts to push down the following aspects of the query directly into the API call made to Elasticsearch:
SELECTfieldsWHEREfiltersORDER BYsortsLIMIT
For example queries that illustrate which aspects AlloyDB is and isn't able to push down, see the following table.
| Query type | Query example | Query elements pushed down |
|---|---|---|
| Unfiltered queries |
SELECT id, body FROM elasticsearch_table ORDER BY metadata <@> 'body:foo' DESC LIMIT 10; |
|
| Exact text match |
SELECT id, body FROM elasticsearch_table WHERE body = 'foo' LIMIT 10; |
|
| Single-field expressions |
SELECT id, body FROM elasticsearch_table WHERE id > 10 ORDER BY metadata <@> 'body:foo' LIMIT 10; |
|
| Constant expressions |
SELECT id, body FROM elasticsearch_table WHERE id > (1+1) LIMIT 10; |
|
| Expressions with functions |
SELECT id, body FROM elasticsearch_table WHERE id > CEIL(3.14) LIMIT 10; |
|
| Multi-field expressions |
SELECT id, body FROM elasticsearch_table WHERE dbl_field < flt_field LIMIT 10; |
|
| Score filtering |
SELECT id, body, (metadata <@> 'body:bar') AS score FROM elasticsearch_table WHERE score > 0.5 ORDER by score desc LIMIT 10; |
|
LIKE and similar operators |
SELECT id, body FROM elasticsearch_table WHERE id > 10 AND body LIKE '%foo%' LIMIT 10; |
|
| Raw queries |
SELECT id, body FROM elasticsearch_table WHERE id < 10 ORDER BY metadata <@> $${"query": { "match_all": {}}}$$ DESC LIMIT 10; |
|