This page shows you how to configure and manage external data sources for use with PolyBase and your Cloud SQL for SQL Server instances.
Manage access to external objects
Cloud SQL supports operations to enable and disable PolyBase only. Users need to manage SQL Server entities manually, using T-SQL statements.
Before managing your SQL server entities manually, review the following Microsoft resources:
- Create a MASTER KEY using T-SQL
- Configure PolyBase to access external data in Oracle
- Configure PolyBase to access external data in S3-compatible object storage
Create an external data source
The following steps show you how to create an external data source to your Cloud SQL for SQL Server instance.
SQL Server defines an external data source as a resource with connection information such as the server name, database name, and connection string.
SQL Server defines external tables as metadata objects within SQL Server that point to the data in the external data sources. They define the schema of the external data as it will be presented within SQL Server.
This procedure requires creating a database-scoped credential (DSC) for your external data source.
DSCs store the authentication information needed to connect to the external data sources. These credentials are encrypted by the database master key (DMK).
The DMK serves as the root of the encryption hierarchy within a database, protecting secrets in database-scoped credentials. This symmetric key is encrypted by the service master key (SMK) and a user-provided password. The DMK isn't automatically created and requires explicit management by the user using T-SQL.
The SMK serves as the root of the encryption hierarchy for a specific instance of SQL Server. This symmetric key is automatically generated by the SQL Server itself, the first time it starts, and is used to protect other security objects like DMKs and login passwords.
Oracle
Complete the following steps in SQL Server.
Create a database-scoped credential (DSC) for Oracle.
If one doesn't already exist, then switch to a user database to create a master key:
USE USER_DB; CREATE MASTER KEY ENCRYPTION BY PASSWORD='MK_PASSWORD';Replace the following:
- USER_DB: the user database you want to use to create the master key.
For example,
polybasedb. - MK_PASSWORD: the password for the master key you want to use. Make sure to save the password for the master key for later use.
- USER_DB: the user database you want to use to create the master key.
For example,
Create a database-scoped credential for the Oracle data source.
CREATE DATABASE SCOPED CREDENTIAL DB_CREDENTIAL_NAME WITH IDENTITY = 'EXTERNAL_DATABASE_USERNAME', SECRET = 'EXTERNAL_DATABASE_PASSWORD';Replace the following:
- DB_CREDENTIAL_NAME: the credential name you want to use for the external database.
- EXTERNAL_DATABASE_USERNAME: the username you want to use to access the external database.
- EXTERNAL_DATABASE_PASSWORD: the password for the external database. External data source credentials need to be updated when rotated and must be managed manually by you.
Create an external data source for Oracle.
CREATE EXTERNAL DATA SOURCE DATA_SOURCE_NAME WITH ( LOCATION = 'oracle://SERVER_IP:PORT', PUSHDOWN = PUSHDOWN, CREDENTIAL = DB_CREDENTIAL_NAME );Replace the following:
- DATA_SOURCE_NAME: the name you want to use for the external database.
- SERVER_IP: the IP of the server or the hostname.
- PORT: the port you want to use.
- PUSHDOWN: Required. Pushdown
is a capability offered in SQL Server and must be configured. Set the value
to
ONorOFF, depending on your use case. - DB_CREDENTIAL_NAME: the credential you just created.
Create an external table.
Define the schema and structure of the external data:
- Create the external table definition. The schema must match the remote table.
- Define columns with appropriate SQL Server data types and collations.
- The location is the remote table path.
CREATE EXTERNAL TABLE EXTERNAL_TABLE_NAME( COLUMN_NAME_1 DATA_TYPE_1 COLLATE COLLATION_1, COLUMN_NAME_2 DATA_TYPE_2 COLLATE COLLATION_2, ... ) WITH( LOCATION = 'REMOTE_DATABASE.REMOTE_USERNAME.REMOTE_TABLE_NAME', DATA_SOURCE = DB_DATA_SOURCE_NAME );Replace the following:
- EXTERNAL_TABLE_NAME: the name of the external table you want to create.
- COLUMN_NAME_1: the name of the first column of the table.
- DATA_TYPE_1: the data type of the first column.
- COLLATION_1: the collation you want to use for the first column.
- COLUMN_NAME_2: the name of the second column of the table.
- DATA_TYPE_2: the data type of the second column.
- COLLATION_2: the collation you want to use for the second column.
- REMOTE_DATABASE: the name of the remote database.
- REMOTE_USERNAME: the username of the remote user.
- REMOTE_TABLE_NAME: the name of the remote table.
- DB_DATA_SOURCE_NAME: the name of the external data source.
Query the external Oracle table you created.
SELECT TOP 10 * FROM EXTERNAL_TABLE_NAME;Replace the following:
- EXTERNAL_TABLE_NAME: the name of the external table you want to query.
Cloud Storage
Complete the following steps in SQL Server.
Create a database-scoped credential for Cloud Storage.
If one doesn't already exist, then switch to a user database to create a master key:
USE USER_DB; CREATE MASTER KEY ENCRYPTION BY PASSWORD='MK_PASSWORD';Replace the following:
- USER_DB: the user database you want to use to create the master key.
For example,
polybasedb. - MK_PASSWORD: the password you want to use for the master key. Make sure to save the password for the master key for later use.
- USER_DB: the user database you want to use to create the master key.
For example,
Cloud Storage requires an access key ID and a secret key ID:
CREATE DATABASE SCOPED CREDENTIAL STORAGE_CREDENTIAL_NAME WITH IDENTITY = 'S3 Access Key', SECRET = 'ACCESS_KEY_ID:SECRET_KEY_ID';Replace the following:
- STORAGE_CREDENTIAL_NAME: the name of the Cloud Storage credential you want to use.
- ACCESS_KEY_ID: the access key ID.
- SECRET_KEY_ID: the secret key ID.
Only basic authentication is supported.
Create an external data source.
CREATE EXTERNAL DATA SOURCE STORAGE_DATA_SOURCE_NAME WITH ( LOCATION = 's3://storage.googleapis.com/', CREDENTIAL = STORAGE_CREDENTIAL_NAME );Replace the following:
- STORAGE_DATA_SOURCE_NAME: the name of the external data source in Cloud Storage.
- STORAGE_CREDENTIAL_NAME: the name of the credential for the external storage resource.
Create an external table for Cloud Storage.
Define the format and structure of the external data:
- Create the external file. The schema must match the file structure.
- Define columns with appropriate SQL Server data types and collations.
- The location is the path within the data source.
CREATE EXTERNAL FILE FORMAT FILE_FORMAT_NAME WITH ( FORMAT_TYPE = FORMAT_TYPE, FORMAT_OPTIONS ( FIELD_TERMINATOR = 'FIELD_TERMINATOR', STRING_DELIMITER = 'DELIMITER', FIRST_ROW = FIRST_ROW ) ); CREATE EXTERNAL TABLE FILE_EXTERNAL_TABLE_NAME ( COLUMN_NAME_1 DATA_TYPE_1, COLUMN_NAME_2 DATA_TYPE_2, ... ) WITH ( LOCATION = 'PATH_TO_BUCKET/FILENAME', DATA_SOURCE = STORAGE_DATA_SOURCE_NAME, FILE_FORMAT = FILE_FORMAT_NAME );Replace the following:
- FILE_FORMAT_NAME: the name of the external file format you want to create.
- FORMAT_TYPE: the format type you want to use, such as
DELIMITEDTEXT. For a list of supported values, see Supported file formats. - FIELD_TERMINATOR: the field delimiter you want to use.
- DELIMITER: the string delimiter you want to use.
- FIRST_ROW: the row from which you want to start reading.
- FILE_EXTERNAL_TABLE_NAME: the name of the external table you want to create.
- COLUMN_NAME_1: the name of the first column of the table.
- DATA_TYPE_1: the data type of the first column.
- COLUMN_NAME_2: the name of the second column of the table.
- DATA_TYPE_2: the data type of the second column.
- PATH_TO_BUCKET: the full path to the Cloud Storage bucket.
- FILENAME: the filename located within the Cloud Storage bucket.
- STORAGE_DATA_SOURCE_NAME: the name of the external data source in Cloud Storage.
- FILE_FORMAT_NAME: the name of the external file format.
Query the external Cloud Storage table you created.
SELECT * FROM FILE_EXTERNAL_TABLE_NAME;Replace the following:
- FILE_EXTERNAL_TABLE_NAME: the name of the external table you want to query.
Manage the database master key after a database restore operation
The database-scoped credentials used to access external data sources are encrypted using a database master key (DMK).
The DMK is encrypted with the service master key (SMK) of the source instance and user-provided DMK password. On the source instance, SQL Server silently opens the DMK using an SMK.
If you restore a Cloud SQL for SQL Server instance, the DMK won't open automatically on the new instance due to changes in the SMK.
As a result, you need to first decrypt the DMK using the user password established during its creation and re-encrypt it with the target instance SMK.
For more information, see the following Microsoft resources:
Logs
The following PolyBase logs are available in Logs Explorer:
Polybase_Dms_errors.logPolybase_Dms_movement.logPolybase_DWEngine_errors.logPolybase_DWEngine_movement.logPolybase_DWEngine_server.log
For more information, see Microsoft documentation on PolyBase troubleshooting.