Set up a Box data store

This page describes how to create a data store and connect Box to Gemini Enterprise.

Before you begin

Before you set up your Box connection, complete the following steps:

  1. Grant the Discovery Engine Editor role (roles/discoveryengine.editor) to the user who will create the data store. To grant this role, do the following:
    1. In the Google Cloud console, go to the IAM page.

      Go to IAM

    2. Locate the user account and click the Edit icon.
    3. Grant the Discovery Engine Editor role to the user. For more information, see IAM roles and permissions.
  2. Create and authorize the Box app account.
  3. Configure Box and set the necessary permissions. For the list of scopes required for search and data ingestion, see Required permissions.
  4. Obtain the authentication information to use during data store creation.
  5. Set up a Google Cloud project with an administrator account that can manage organization-level configurations.
  6. Make sure your organization is set up to manage a workforce pool.
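
As an alternative to the console steps in item 1, the role can also be granted from the command line. The following is a minimal sketch that only assembles a `gcloud projects add-iam-policy-binding` command; the helper name `build_grant_command`, the project ID, and the email address are illustrative placeholders, and the sketch assumes the role is granted at the project level to a user principal.

```python
import shlex
from typing import List

# Role named in the prerequisites above.
DISCOVERY_ENGINE_EDITOR = "roles/discoveryengine.editor"


def build_grant_command(project_id: str, user_email: str) -> List[str]:
    """Build (but do not run) a gcloud command that grants the role.

    Hypothetical helper for illustration; it assumes a project-level
    binding for a single user account.
    """
    if not project_id or not user_email:
        raise ValueError("project_id and user_email are required")
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=user:{user_email}",
        f"--role={DISCOVERY_ENGINE_EDITOR}",
    ]


if __name__ == "__main__":
    # Placeholder values; replace with your own project and user.
    print(shlex.join(build_grant_command("my-project", "alex@example.com")))
```

Review the printed command before running it in a shell where the gcloud CLI is installed and authenticated.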

Create the Box data store

To create the Box data store, perform the following steps:

  1. In the Google Cloud console, go to the Gemini Enterprise page.

    Gemini Enterprise

  2. Select or create a Google Cloud project.

  3. In the navigation menu, click Data stores.

  4. Click Create data store.

  5. In the Source section, search for Box, and click Select.

  6. In the Data section:

    1. In the Connector mode section, select Federated search or Data ingestion as the connection mode.
    2. Click Continue.

    3. In the Authentication settings section, configure authentication based on your chosen connection mode.

      • If you selected Federated search, enter the following details:

        • Client ID: The unique identifier for your Box application.
        • Client secret: The secret key associated with your Box application.
      • If you selected Data ingestion, enter the following details:

        • Enterprise ID: The unique identifier for your Box enterprise.
        • Client ID: The public identifier for your Box application.
        • Client secret: The secret key associated with your Box application.
        • Instance URI: The base URL for your Box instance API.
        • Private key: The private key used for authenticating your Box application.
        • Key ID: The identifier for the private key.
        • Pass phrase: The passphrase used to decrypt the private key.

        For the list of scopes required for search and data ingestion, see Required scopes. For information on how to obtain authentication information, see Obtain Box authentication information.

      • If you selected Federated search, click Login and complete the sign-in.

    4. Click Continue.

    5. In the Advanced options section:

      • If you selected Federated search, select the Impersonate user mode as Admin or User, and click Continue.
      • Optional: If you selected Data ingestion, select the Enable static IP addresses checkbox to allowlist a set of static IP addresses in your system.
    6. Click Continue.

    7. In the Entities to search (if you selected Federated search) or Entities to sync (if you selected Data ingestion) section:

      1. Select all the required entities.
      2. If you selected Data ingestion, continue with the following steps:
        1. Optional: To sync only specific entities, do the following:
          1. Click Filter.
          2. To filter entities out of the index, select the Exclude from the index checkbox. To ensure that they are included in the index, select the Include to the index checkbox.
          3. Enter the keys. Press Enter after each key.
          4. Click Save.
        2. To configure the sync schedule, do the following:
          1. In the Sync frequency list, select the sync frequency.
            • To schedule separate full syncs of entity and identity data, expand the menu in the Full sync section and then select Custom options.
          2. In the Incremental sync frequency list, select the incremental sync frequency. For more information, see Sync schedules.
  7. Click Continue.

  8. In the Configuration section:

    1. From the Multi-region list, select the location for your data connector.
    2. In the Data connector name field, enter a name for your connector.
    3. If you selected US or EU as the location, configure the Encryption settings:
      • Optional: If you haven't configured single-region keys, click Go to settings page to do so. For more information, see Register a single-region key for third-party connectors.
      • Select Google-managed encryption key or Cloud KMS key.
      • If you selected Cloud KMS key:
        • In the Key management type list, select the appropriate type.
        • In the Cloud KMS key list, select the key.
      For more information, see Customer-managed encryption keys.

  9. Click Continue.

  10. In the Billing section, select General pricing or Configurable pricing. For more information, see Verify the billing status of your projects and Licenses.

  11. Click Create. Gemini Enterprise creates your data store and lists it on the Data Stores page.
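
Most of the Data ingestion authentication fields in step 6 can be gathered before you open the console. The following sketch assumes you have the JSON settings file that the Box Developer Console generates for a JWT app (with `boxAppSettings`, `appAuth`, and `enterpriseID` keys), and that its values correspond to the form fields as shown. That correspondence is an assumption, the function name is hypothetical, and the Instance URI is not part of that file, so it must be obtained separately.

```python
import json
from typing import Dict


def ingestion_fields_from_box_config(config_json: str) -> Dict[str, str]:
    """Map a Box JWT app settings file to the console form fields.

    Assumption: the input follows the Box Developer Console JSON layout.
    The Instance URI field is not in that file and is not returned here.
    """
    config = json.loads(config_json)
    app = config["boxAppSettings"]
    auth = app["appAuth"]
    fields = {
        "Enterprise ID": config["enterpriseID"],
        "Client ID": app["clientID"],
        "Client secret": app["clientSecret"],
        "Private key": auth["privateKey"],
        "Key ID": auth["publicKeyID"],
        "Pass phrase": auth["passphrase"],
    }
    # Fail early if any value is empty, before filling in the form.
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("Missing values for: " + ", ".join(missing))
    return fields
```

Treat the returned values as secrets: avoid printing or logging the client secret, private key, and passphrase.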

To verify the state of the data store, do the following:

  1. Find the data store in the data store list and monitor its state.
  2. When the state changes from Creating to Active, the Box connector is ready to use.

For a Box data ingestion data store, the state changes from Creating to Running when synchronization starts, and then to Active when ingestion is complete and the data store is fully configured. Depending on the volume of data, ingestion can take several hours.
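
If you monitor the state from a script rather than the console, the wait can be expressed as a simple polling loop. This is a sketch only: `get_state` is a hypothetical callable that you would supply (this page does not describe an API for reading the state), and the state names are taken from the description above.

```python
import time
from typing import Callable, List


def wait_until_active(
    get_state: Callable[[], str],
    poll_seconds: float = 60.0,
    max_polls: int = 500,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the data store reports Active.

    Returns the distinct states observed, in order, for example
    Creating, Running, Active for a data ingestion data store.
    """
    seen: List[str] = []
    for _ in range(max_polls):
        state = get_state()
        # Record a state only when it differs from the previous one.
        if not seen or seen[-1] != state:
            seen.append(state)
        if state == "Active":
            return seen
        sleep(poll_seconds)
    raise TimeoutError(f"Data store not Active after {max_polls} polls")
```

Because ingestion can take hours, a generous polling interval is usually more appropriate than a tight loop.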

If you created a Box federated data store, you must authorize Gemini Enterprise to access Box before you run queries.

Data handling and query execution

This section describes how Gemini Enterprise manages your query and the privacy implications of using the federated data store.

Query execution

After you authorize Box and send a search query to Gemini Enterprise:

  • Gemini Enterprise sends your search query directly to the Box API.
  • Gemini Enterprise blends the results with those from other connected data sources and displays a comprehensive search result.
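
The fan-out and blend pattern described above can be illustrated with a small sketch. It is not Gemini Enterprise's actual ranking or blending logic, which this page does not specify: the round-robin interleave, the function name, and the source callables are all assumptions chosen to keep the example short.

```python
from itertools import zip_longest
from typing import Callable, Dict, List, Tuple

# A source takes the query string and returns an ordered list of hits.
SearchFn = Callable[[str], List[str]]


def federated_search(
    query: str, sources: Dict[str, SearchFn]
) -> List[Tuple[str, str]]:
    """Send the same query to every source and interleave the results."""
    # Fan out: each enabled source receives the full query string.
    per_source = [
        [(name, hit) for hit in search(query)]
        for name, search in sources.items()
    ]
    # Blend: take one result from each source in turn (illustrative only).
    blended: List[Tuple[str, str]] = []
    for row in zip_longest(*per_source):
        blended.extend(item for item in row if item is not None)
    return blended
```

The sketch makes one property visible: every enabled source receives the unmodified query string.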

Data handling

When using third-party federated search, the following data handling rules apply:

  • Your query string is sent to the third-party search backend (Box API).
  • These third parties may associate queries with your identity.
  • If multiple federated search data sources are enabled, the query might be sent to all of them.
  • Once the data reaches the third-party system, it is governed by that system's terms of service and privacy policies.

What's next