Add filters to a Microsoft OneDrive data store using the API

This page explains how to add filters to your Microsoft OneDrive data stores in Gemini Enterprise using the API. Filters determine which information is retrieved from the Microsoft OneDrive data source when users chat with the Gemini Enterprise assistant using the Gemini Enterprise app.

Create a data store with filters

When creating a Microsoft OneDrive data store, you can add filters to retrieve only specific data from the Microsoft OneDrive data source. You specify these filters within the params object using the structured_search_filter field. This field accepts an object containing key-value pairs. The keys represent the Microsoft OneDrive fields you want to filter on, and the values are arrays of strings containing the criteria to match.
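As a rough illustration of that shape, the following Python sketch builds a params fragment in which every key maps to an array of strings. Python is not used elsewhere on this page, the helper name build_structured_search_filter is hypothetical, and FILTER_KEY and the FILTER_VALUE entries are placeholders rather than real Microsoft OneDrive field names.

```python
import json


def build_structured_search_filter(criteria):
    """Return a dict shaped like structured_search_filter: {key: [strings]}.

    Hypothetical helper for illustration; it is not part of any Google API.
    """
    result = {}
    for key, values in criteria.items():
        if not isinstance(key, str) or not key:
            raise ValueError("filter keys must be non-empty strings")
        if isinstance(values, str):
            # Wrap a lone string so the value is always an array.
            values = [values]
        if not isinstance(values, (list, tuple)) or not all(
            isinstance(v, str) for v in values
        ):
            raise ValueError(f"values for {key!r} must be a list of strings")
        result[key] = list(values)
    return result


# Placeholder key and values, matching the request template on this page.
params = {
    "structured_search_filter": build_structured_search_filter(
        {"FILTER_KEY": ["FILTER_VALUE1", "FILTER_VALUE2"]}
    )
}
print(json.dumps(params, indent=2))
```

The validation only checks the types described above; it does not know which filter keys the connector accepts.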

Before you begin

Before you set up your Microsoft OneDrive federated connection, complete the following steps:

  1. Grant the Discovery Engine Editor role (roles/discoveryengine.editor) to the user who creates the data store. To grant this role, do the following:

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM

    2. Locate the user account and click the Edit icon.
    3. Grant the Discovery Engine Editor role to the user. For more information, see IAM roles and permissions.

  2. Register Gemini Enterprise as an OAuth 2.0 application in Microsoft Entra ID and obtain the following credentials:

    • Client ID

    • Client secret

    • Tenant ID

  3. Configure the Microsoft API permissions with the consent of a Microsoft OneDrive admin.

Create a data store with filters using the API

To add filters when creating a data store, call the setUpDataConnector method with the structured_search_filter parameter:

REST

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://ENDPOINT_LOCATION-discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION:setUpDataConnector" \
-d '{
  "collectionId": "COLLECTION_ID",
  "collectionDisplayName": "COLLECTION_DISPLAY_NAME",
  "dataConnector": {
    "dataSource": "onedrive_federated_search",
    "params": {
      "client_id": "CLIENT_ID",
      "client_secret": "CLIENT_SECRET",
      "tenant_id": "TENANT_ID",
      "structured_search_filter": {
        "FILTER_KEY": [
          "FILTER_VALUE1",
          "FILTER_VALUE2"
        ]
      }
    },
    "entities": [
      {
        "entityName": "file"
      }
    ],
    "refreshInterval": "7200s",
    "connectorType": "THIRD_PARTY_FEDERATED",
    "connectorModes": [
      "FEDERATED"
    ]
  }
}'

Replace the following:

  • PROJECT_ID: the ID of your project.
  • ENDPOINT_LOCATION: the multi-region for your API request. Specify one of the following values:
    • us for the US multi-region
    • eu for the EU multi-region
    • global for the Global location
    For more information, see Specify a multi-region for your data store.
  • LOCATION: the multi-region of your data store: global, us, or eu.
  • COLLECTION_ID: the unique ID of the data store.
  • COLLECTION_DISPLAY_NAME: the display name of the data store.
  • CLIENT_ID: the client ID for Microsoft OneDrive authentication.
  • CLIENT_SECRET: the client secret for Microsoft OneDrive authentication.
  • TENANT_ID: the tenant ID for Microsoft OneDrive.
  • FILTER_KEY: the key for the filter, corresponding to a field in your data.
  • FILTER_VALUE1, FILTER_VALUE2: the value or values to filter on for the specified FILTER_KEY.
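For readers who prefer to script the call, the following is a minimal Python sketch that assembles the same setUpDataConnector request with the standard library. The function name build_setup_request and its parameters are hypothetical; the URL and request body mirror the curl command above, and nothing is sent until the returned object is passed to urllib.request.urlopen.

```python
import json
import urllib.request


def build_setup_request(*, endpoint_location, project_id, location, access_token,
                        collection_id, display_name, client_id, client_secret,
                        tenant_id, filters):
    """Build (but do not send) the setUpDataConnector POST request.

    Hypothetical helper; the URL and body follow the curl sample on this page.
    """
    url = (
        f"https://{endpoint_location}-discoveryengine.googleapis.com/v1alpha/"
        f"projects/{project_id}/locations/{location}:setUpDataConnector"
    )
    body = {
        "collectionId": collection_id,
        "collectionDisplayName": display_name,
        "dataConnector": {
            "dataSource": "onedrive_federated_search",
            "params": {
                "client_id": client_id,
                "client_secret": client_secret,
                "tenant_id": tenant_id,
                # Each key maps to an array of string values.
                "structured_search_filter": filters,
            },
            "entities": [{"entityName": "file"}],
            "refreshInterval": "7200s",
            "connectorType": "THIRD_PARTY_FEDERATED",
            "connectorModes": ["FEDERATED"],
        },
    }
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
        "X-Goog-User-Project": project_id,
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
```

The access token can be the output of gcloud auth print-access-token, as in the curl command. Reading the client secret from an environment variable or a secret manager, rather than hard-coding it, keeps it out of scripts and shell history.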

Add or update filters in an existing data store

To add filters to an existing Microsoft OneDrive data store, or to update the filters it already has, call the updateDataConnector method.

In the request body, the structured_search_filter field must contain every filter that you want the data store to have, including existing filters that you are not changing.

REST

To add or update filters in an existing Microsoft OneDrive data store, run the following command:

curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://ENDPOINT_LOCATION-discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/COLLECTION_ID/dataConnector?updateMask=params" \
-d '{
    "params": {
      "structured_search_filter": {
        "UPDATED_FILTER_KEY": [
          "UPDATED_FILTER_VALUE1",
          "UPDATED_FILTER_VALUE2"
        ]
      }
    }
}'

Replace the following:

  • PROJECT_ID: the ID of your project.
  • ENDPOINT_LOCATION: the multi-region for your API request. Specify one of the following values:
    • us for the US multi-region
    • eu for the EU multi-region
    • global for the Global location
    For more information, see Specify a multi-region for your data store.
  • LOCATION: the multi-region of your data store: global, us, or eu.
  • COLLECTION_ID: the ID of the collection containing the data store.
  • UPDATED_FILTER_KEY: the key for the filter, corresponding to a field in your data.
  • UPDATED_FILTER_VALUE1, UPDATED_FILTER_VALUE2: the value or values to filter on for the specified UPDATED_FILTER_KEY.
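Because the request must carry the complete set of filters, one way to avoid dropping an existing filter by accident is to merge your changes into the current filters before sending the update. The following Python sketch shows that idea under stated assumptions: merge_filters and build_update_request are hypothetical helper names, and using None to remove a key is a convention of this sketch, not of the API.

```python
import json
import urllib.request


def merge_filters(existing, changes):
    """Return the complete filter set: existing filters plus the changes.

    A value of None in `changes` drops that key (sketch convention only).
    """
    merged = {key: list(values) for key, values in existing.items()}
    for key, values in changes.items():
        if values is None:
            merged.pop(key, None)
        else:
            merged[key] = list(values)
    return merged


def build_update_request(*, endpoint_location, project_id, location,
                         collection_id, filters, access_token):
    """Build (but do not send) the PATCH request shown in the curl sample."""
    url = (
        f"https://{endpoint_location}-discoveryengine.googleapis.com/v1alpha/"
        f"projects/{project_id}/locations/{location}/collections/{collection_id}"
        "/dataConnector?updateMask=params"
    )
    body = {"params": {"structured_search_filter": filters}}
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
        "X-Goog-User-Project": project_id,
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="PATCH"
    )
```

A typical flow would read the current filters from the data connector, pass them to merge_filters together with the changes, and send the merged result with build_update_request and urllib.request.urlopen.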

Verify that the data store includes the filters

To confirm that the filters you added or updated are applied, get the details of the data store:

REST

To verify data store settings, call the dataConnector.get method:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "X-Goog-User-Project: PROJECT_ID" \
"https://ENDPOINT_LOCATION-discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/COLLECTION_ID/dataConnector"

Replace the following:

  • PROJECT_ID: the ID of your project.
  • ENDPOINT_LOCATION: the multi-region for your API request. Specify one of the following values:
    • us for the US multi-region
    • eu for the EU multi-region
    • global for the Global location
    For more information, see Specify a multi-region for your data store.
  • LOCATION: the multi-region of your data store: global, us, or eu.
  • COLLECTION_ID: the ID of the collection containing the data store.

In the response, view the structured_search_filter field in params to verify the filters.
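If you script this check, a small Python sketch such as the following can pull the filters out of the response body and compare them with what you expect. The helper names are hypothetical, the sample response is trimmed to the one field discussed here and is not real output, and ignoring the order of values in the comparison is an assumption of this sketch.

```python
import json


def extract_filters(response_text):
    """Return params.structured_search_filter from a dataConnector response body."""
    connector = json.loads(response_text)
    params = connector.get("params") or {}
    return params.get("structured_search_filter") or {}


def filters_match(actual, expected):
    """Compare two filter dicts, ignoring the order of values (sketch assumption)."""
    def normalize(filters):
        return {key: sorted(values) for key, values in filters.items()}

    return normalize(actual) == normalize(expected)


# Hypothetical, trimmed response body using the placeholders from this page.
sample_response = """
{
  "params": {
    "structured_search_filter": {
      "FILTER_KEY": ["FILTER_VALUE2", "FILTER_VALUE1"]
    }
  }
}
"""

expected = {"FILTER_KEY": ["FILTER_VALUE1", "FILTER_VALUE2"]}
print(filters_match(extract_filters(sample_response), expected))
```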

What's next