Use the JDBC driver for BigQuery
The Java Database Connectivity (JDBC) driver for BigQuery connects your Java applications to BigQuery, letting you use BigQuery features with your preferred tooling and infrastructure. To connect non-Java applications to BigQuery, use the Simba Open Database Connectivity (ODBC) driver for BigQuery.
Limitations
The JDBC driver for BigQuery is subject to the following limitations:
- The driver is specific to BigQuery and can't be used with other products or services.
- The
INTERVALdata type isn't supported with the BigQuery Storage Read API. - All data manipulation language (DML) limitations apply.
Before you begin
- Make sure that you're familiar with JDBC drivers, Apache Maven, and
the
java.sqlpackage. - Verify that your system is configured with Java Runtime Environment (JRE) 8.0 or later. For information on checking your JRE version, see Verifying the JRE Environment.
Authenticate to BigQuery, and take note of the following information, which is used later when you establish a connection with the JDBC driver for BigQuery. You only need to note the information that corresponds to the authentication method that you use.
Authentication method Authentication information Example Connection property (to set later) Standard service account Service account email bq-jdbc-sa@mytestproject.iam.gserviceaccount.comOAuthServiceAcctEmailService account key (JSON object) my-sa-keyOAuthPvtKeyService account key file Service account key file (full path) path/to/file/secret.jsonOAuthPvtKeyPathGoogle user account Client ID 123-abc.apps.googleusercontent.comOAuthClientIdClient secret _aB-C1D_E2fGh3Ij4kL5m6No7p8QR9sT0uVOAuthClientSecretPre-generated access token Access token ya29.a0AfH6SMCiH1L-x_yZOAuthAccessTokenPre-generated refresh token Refresh token 1/fFAGRNJru1FTz70BzhT3ZgOAuthRefreshTokenClient ID 123-abc.apps.googleusercontent.comOAuthClientIdClient secret _aB-C1D_E2fGh3Ij4kL5m6No7p8QR9sT0uVOAuthClientSecretApplication Default Credentials None N/A N/A Configuration file Configuration file (JSON object or full path) path/to/file/secret.jsonOAuthPvtKeyExternal account configuration object Account configuration object external_account_configuration_objectOAuthPvtKeyOther Audience property of the external account configuration file //iam.googleapis.com/projects/my-project/locations/US-EAST1/workloadIdentityPools/my-pool-/providers/my-providerBYOID_AudienceUriToken retrieval and environmental information file {\"file\":\"/path/to/file\"}BYOID_CredentialSourceUser project (only if using a workforce pool) my_projectBYOID_PoolUserProjectURI for service account impersonation (only if using a workforce pool) my-saBYOID_SA_Impersonation_UriSecurity Token Service token based on the token exchange specification urn:ietf:params:oauth:tokentype:id_tokenBYOID_SubjectTokenTypeSecurity Token Service token exchange endpoint https://sts.googleapis.com/v1/tokenBYOID_TokenUri
Configure your development environment
To configure your development environment with the JDBC driver for BigQuery, do the following:
Download one of the following JDBC packages:
- Uber JAR. A JAR file with all dependencies included.
- Shaded Uber JAR. A shaded JAR file with all dependencies included.
- Thin JAR with dependencies. A zip file that includes a thin JAR file and all dependencies.
Add the downloaded JAR file to your classpath so that the Java compiler and runtime can locate the necessary JDBC classes. For information on adding a file to your classpath, see Setting the Classpath.
Add the following dependency to your build file:
<dependency> <groupId>com.google.cloud</groupId> <artifactId>google-cloud-bigquery-jdbc</artifactId> <version>0.0.1</version> <scope>system</scope> <systemPath>path/to/file/google-jdbc-jar-with-dependencies.jar</systemPath> </dependency>
If you're using a Gradle project, add the following to your build file:
dependencies { // ... other dependencies implementation files('path/to/file/google-jdbc-jar-with-dependencies.jar') }
Establish a connection
To establish a connection between your Java application and BigQuery with the JDBC driver for BigQuery, do the following:
Identify your connection string for the JDBC driver for BigQuery. This string captures all the required information to establish a connection between your Java application and BigQuery. The connection string has the following format:
jdbc:bigquery://HOST:PORT;ProjectId=PROJECT_ID;OAuthType=AUTH_TYPE;AUTH_PROPS;OTHER_PROPS
Replace the following:
HOST: the DNS or IP address of the server.PORT: the TCP port number.PROJECT_ID: the ID of your BigQuery project.AUTH_TYPE: a number specifying the type of authentication that you used. One of the following:0: for service account authentication (standard and key file)1: for Google user account authentication2: for pre-generated refresh or access token authentication3: for Application Default Credential authentication4: for other authentication methods
AUTH_PROPS: the authentication information that you noted when you authenticated to BigQuery, listed in theproperty_1=value_1; property_2=value_2;...format—for example,OAuthPvtKeyPath=path/to/file/secret.json, if you authenticated with a service account key file.OTHER_PROPS(optional): additional connection properties for the JDBC driver, listed in theproperty_1=value_1; property_2=value_2;...format. For a full list of connection properties, see Connection properties.
Connect your Java application to the JDBC driver for BigQuery with either the
DriverManagerorDataSourceclass.Connect with the
DriverManagerclass:import java.sql.Connection; import java.sql.DriverManager; private static Connection getJdbcConnectionDM(){ Connection connection = DriverManager.getConnection(CONNECTION_STRING); return connection; }
Replace
CONNECTION_STRINGwith the connection string from the previous step.Connect with the
DataSourceclass:import com.google.cloud.bigquery.jdbc.DataSource; import java.sql.Connection; import java.sql.SQLException; private static public Connection getJdbcConnectionDS() throws SQLException { Connection connection = null; DataSource dataSource = new com.google.cloud.bigquery.jdbc.DataSource(); dataSource.setURL(CONNECTION_STRING); connection = dataSource.getConnection(); return connection; }
Replace
CONNECTION_STRINGwith the connection string from the previous step.The
DataSourceclass also has setter methods that you can use to set connection properties, rather than including them in the connection string. The following is an example:private static Connection getConnection() throws SQLException { DataSource ds = new DataSource(); ds.setURL(jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;); ds.setAuthType(3); // Application Default Credentials ds.setProjectId("MyTestProject"); ds.setEnableHighThroughputAPI(true); ds.setLogLevel("6"); ds.setUseQueryCache(false); return ds.getConnection(); }
Connection properties
JDBC driver connection properties are configuration parameters that you include in the connection string or pass through setter methods when you establish a connection to a database. The following connection properties are supported by the JDBC driver for BigQuery.
| Connection property | Description | Default value | Data type | Required |
|---|---|---|---|---|
AdditionalProjects |
Projects that the driver can access for queries and metadata operations,
in addition to the primary project set by the ProjectId
property.
|
N/A | Comma-separated string | No |
AllowLargeResults |
Determines if the driver processes query results that are larger than
128 MB when the QueryDialect property is set to
BIG_QUERY. If the QueryDialect property is set
to SQL, the driver always processes large query results.
|
TRUE |
Boolean | No |
BYOID_AudienceUri |
The audience property in an external account configuration file. The audience property can contain the resource name for the workload identity pool or workforce pool, as well as the provider identifier in that pool. | N/A | String | Only when OAuthType=4 |
BYOID_CredentialSource |
The token retrieval and environmental information. | N/A | String | Only when OAuthType=4 |
BYOID_PoolUserProject |
The user project when a workforce pool is being used for authentication. | N/A | String | Only when OAuthType=4 and using the workforce pool |
BYOID_SA_Impersonation_Uri |
The URI for the service account impersonation when a workforce pool is being used for authentication. | N/A | String | Only when OAuthType=4 and using the workforce pool |
BYOID_SubjectTokenType |
The Security Token Service token based on the token exchange specification.
One of the following:
|
urn:ietf:params:oauth:tokentype:id_token |
String | Only when OAuthType=4 |
BYOID_TokenUri |
The Security Token Service token exchange endpoint. | https://sts.googleapis.com/v1/token |
String | No |
ConnectionPoolSize |
The connection pool size, if connection pooling is enabled. | 10 |
Long | No |
DefaultDataset |
The dataset that's used when one isn't specified in a query. | N/A | String | No |
EnableHighThroughputAPI |
Determines if the Storage Read API can be used. The
HighThroughputActivationRatio and
HighThroughputMinTableSize properties must also be set to
TRUE to use the Storage Read API.
|
FALSE |
Boolean | No |
EnableSession |
Determines if the connection starts a session. If set to
TRUE, the session ID is passed to all subsequent queries.
|
FALSE |
Boolean | No |
EnableWriteAPI |
Determines if the Storage Write API can be used. It must be
set to TRUE to enable bulk inserts.
|
FALSE |
Boolean | No |
EndpointOverrides |
Custom endpoints to overwrite the following:
|
N/A | Comma-separated string | No |
FilterTablesOnDefaultDataset |
Determines the scope of metadata returned by the
DatabaseMetaData.getTables() and
DatabaseMetaData.getColumns() methods. When set to
FALSE, no filtering occurs. The DefaultDataset
property must also be set to enable filtering.
|
FALSE |
Boolean | No |
HighThroughputActivationRatio |
The threshold for the number of pages in a query response. When this
number is exceeded, and the EnableHighThroughputAPI and
HighThroughputMinTableSize conditions are met, the driver
starts using the Storage Read API.
|
2 |
Integer | No |
HighThroughputMinTableSize |
The threshold for the number of rows in a query response. When this number
is exceeded, and the EnableHighThroughputAPI and
HighThroughputActivationRatio conditions are met, the driver
starts using the Storage Read API.
|
100 |
Integer | No |
JobCreationMode |
Determines if queries are run with or without jobs. A 1
value means that jobs are created for every query, and a 2
value means that queries can be executed without jobs.
|
2 |
Integer | No |
JobTimeout |
The job timeout (in seconds) after which the job is cancelled on the server. | 0 |
Long | No |
KMSKeyName |
The KMS key name for encrypting data. | N/A | String | No |
Labels |
Labels that are associated with the query to organize and group query jobs. | N/A | Map<String, String> | No |
LargeResultDataset |
The destination dataset for large query results, only when the
LargeResultTable property is set. When you set this property,
data writes bypass the result cache and trigger billing for each query,
even if the results are small.
|
_google_jdbc |
String | No |
LargeResultsDatasetExpirationTime |
The lifetime of all tables in a large result dataset, in milliseconds. This property is ignored if the dataset already has a default expiration time set. | 3600000 |
Long | No |
LargeResultTable |
The destination table for large query results, only when the
LargeResultDataset property is set. When you set this
property, data writes bypass the result cache and trigger billing for each
query, even if the results are small.
|
temp_table... |
String | No |
ListenerPoolSize |
The listener pool size, if connection pooling is enabled. | 10 |
Long | No |
Location |
The location where datasets are created or queried. BigQuery automatically determines the location if this property isn't set. | N/A | String | No |
LogLevel |
The level of detail logged by the
java.util.logging package
during database interactions. Logging can affect performance, so only
enable it temporarily to capture an issue. One of the following:
|
0 |
Integer | No |
LogPath |
The directory where log files are written. | N/A | String | No |
MaximumBytesBilled |
The limit of bytes billed. Queries with bytes billed greater than this number fail without incurring a charge. | 0 |
Long | No |
MaxResults |
The maximum number of results per page. | 10000 |
Long | No |
MetaDataFetchThreadCount |
The number of threads used for database metadata methods. | 32 |
Integer | No |
OAuthAccessToken |
The access token that's used for pre-generated access token authentication. | N/A | String | Only when OAUTH_TYPE=2 |
OAuthClientId |
The client ID for pre-generated refresh token authentication and user account authentication. | N/A | String | Only when OAUTH_TYPE=1 or OAUTH_TYPE=2 |
OAuthClientSecret |
The client secret for pre-generated refresh token authentication and user account authentication. | N/A | String | Only when OAUTH_TYPE=1 or OAUTH_TYPE=2 |
OAuthP12Password |
The password for the PKCS12 key file. | notasecret |
String | No |
OAuthPvtKey |
The service account key when using service account authentication. This value can be a raw JSON keyfile object or a path to the JSON keyfile. | N/A | String |
Only when OAUTH_TYPE=0 and the OAuthPvtKeyPath
value isn't set
|
OAuthPvtKeyPath |
The path to the service account key when using service account authentication. | N/A | String |
Only when OAUTH_TYPE=0 and the OAuthPvtKey and
OAuthServiceAcctEmail values aren't set
|
OAuthRefreshToken |
The refresh token for pre-generated refresh token authentication. | N/A | String |
Only when OAUTH_TYPE=2
|
OAuthServiceAcctEmail |
The service account email when using service account authentication. | N/A | String |
Only when OAUTH_TYPE=0 and the OAuthPvtKeyPath
value isn't set
|
OAuthType |
The authentication type. One of the following:
|
-1 |
Integer | Yes |
PartnerToken |
A token that's used by Google Cloud partners to track usage of the driver. | N/A | String | No |
ProjectId |
The default project ID for the driver. This project is used to execute queries and is billed for resource usage. If not set, the driver infers a project ID. | N/A | String | No, but highly recommended |
ProxyHost |
The hostname or IP address of a proxy server through which the JDBC connection is routed. | N/A | String | No |
ProxyPort |
The port number on which the proxy server is listening for connections. | N/A | String | No |
ProxyPwd |
The password for authentication when connecting through a proxy server that requires it. | N/A | String | No |
ProxyUid |
The username for authentication when connecting through a proxy server that requires it. | N/A | String | No |
QueryDialect |
The SQL dialect for query execution. Use SQL for
GoogleSQL (highly recommended) and BIG_QUERY for
legacy SQL.
|
SQL |
String | No |
QueryProperties |
REST connection properties that customize query behavior. | N/A | Map<String, String> | No |
RequestGoogleDriveScope |
Adds read-only Drive scope to the connection when set to
1.
|
0 |
Integer | No |
RetryInitialDelay |
Sets the delay (in seconds) before the first retry. | 0 |
Long | No |
RetryMaxDelay |
Sets the maximum limit (in seconds) for the retry delay. | 0 |
Long | No |
ServiceAccountImpersonationChain |
A comma-separated list of service account emails in the impersonation chain. | N/A | String | No |
ServiceAccountImpersonationEmail |
The service account email to be impersonated. | N/A | String | No |
ServiceAccountImpersonationScopes |
A comma-separated list of OAuth2 scopes to use with the impersonated account. | https://www.googleapis.com/auth/bigquery |
String | No |
ServiceAccountImpersonationTokenLifetime |
The impersonated account token lifetime (in seconds). | 3600 |
Integer | No |
SSLTrustStore |
The full path to the Java TrustStore that contains trusted Certificate Authority (CA) certificates. The driver utilizes this truststore to validate the identity of the server during the SSL/TLS handshake. | N/A | String | No |
SSLTrustStorePwd |
The password to the Java TrustStore specified in the
SSLTrustStore property.
|
N/A | String | Only if the Java TrustStore is password-protected |
SWA_ActivationRowCount |
The threshold of executeBatch insert rows which, when
exceeded, causes the connector to switch to the
Storage Write API.
|
3 |
Integer | No |
SWA_AppendRowCount |
The size of the write stream. | 1000 |
Integer | No |
Timeout |
The length of time, in seconds, that the connector retries a failed API call before timing out. | 0 |
Long | No |
UniverseDomain |
The top-level domain that's associated with your organization's Google Cloud resources. | googleapis.com |
String | No |
UnsupportedHTAPIFallback |
Determines if the connector falls back to the REST API (when set to
TRUE) or returns an error (when set to FALSE).
|
TRUE |
Boolean | No |
UseQueryCache |
Enables query caching. | TRUE |
Boolean | No |
Run queries with the driver
With your Java application connected to BigQuery through the JDBC driver, you can now run queries in your development environment through the standard JDBC process. All BigQuery quotas and limits apply.
Data type mapping
When you run queries through the JDBC driver for BigQuery, the following data type mapping occurs:
| GoogleSQL type | Java type |
|---|---|
ARRAY |
Array |
BIGNUMERIC |
BigDecimal |
BOOL |
Boolean |
BYTES |
byte[] |
DATE |
Date |
DATETIME |
String |
FLOAT64 |
Double |
GEOGRAPHY |
String |
INT64 |
Long |
INTERVAL |
String |
JSON |
String |
NUMERIC |
BigDecimal |
STRING |
String |
STRUCT |
Struct |
TIME |
Time |
TIMESTAMP |
Timestamp |
Examples
The following sections provide examples that use BigQuery features through the JDBC driver for BigQuery.
Positional parameters
The following example runs a query with a positional parameter:
PreparedStatement preparedStatement = connection.prepareStatement( "SELECT * FROM MyTestTable where testColumn = ?"); preparedStatement.setString(1, "string2"); ResultSet resultSet = statement.executeQuery(selectQuery);
Nested and repeated records
The following example queries the base record of Struct data:
ResultSet resultSet = statement.executeQuery("SELECT STRUCT(\"Adam\" as name, 5 as age)"); resultSet.next(); Struct obj = (Struct) resultSet.getObject(1); System.out.println(obj.toString());
The driver returns the base record as a struct object or a string representation of a JSON object. The result is similar to the following:
{
"v": {
"f": [
{
"v": "Adam"
},
{
"v": "5"
}
]
}
}The following example queries the subcomponents of a Struct object:
ResultSet resultSet = statement.executeQuery("SELECT STRUCT(\"Adam\" as name, 5 as age)"); resultSet.next(); Struct structObject = (Struct) resultSet.getObject(1); Object[] structComponents = structObject.getAttributes(); for (Object component : structComponents){ System.out.println(component.toString()); }
The following example queries a standard array of repeated data, then verifies the result:
// Execute Query ResultSet resultSet = statement.executeQuery("SELECT [1,2,3]"); resultSet.next(); Object[] arrayObject = (Object[]) resultSet.getArray(1).getArray(); // Verify Result int count =0; for (; count < arrayObject.length; count++) { System.out.println(arrayObject[count]); }
The following example queries a Struct array of repeated data, then verifies
the result:
// Execute Query ResultSet resultSet = statement.executeQuery("SELECT " + "[STRUCT(\"Adam\" as name, 12 as age), " + "STRUCT(\"Lily\" as name, 17 as age)]"); Struct[] arrayObject = (Struct[]) resultSet.getArray(1).getArray(); // Verify Result for (int count =0; count < arrayObject.length; count++) { System.out.println(arrayObject[count]); }
Bulk-insert
The following example performs a bulk-insert operation with the
executeBatch method.
Connection conn = DriverManager.getConnection(connectionUrl); PreparedStatement statement = null; Statement st = conn.createStatement(); final String insertQuery = String.format( "INSERT INTO `%s.%s.%s` " + " (StringField, IntegerField, BooleanField) VALUES(?, ?, ?);", DEFAULT_CATALOG, DATASET, TABLE_NAME); statement = conn.prepareStatement(insertQuery1); for (int i=0; i<2000; ++i) { statement.setString(1, i+"StringField"); statement.setInt(2, i); statement.setBoolean(3, true); statement.addBatch(); } statement.executeBatch();
Pricing
You can download the JDBC driver for BigQuery at no cost, and you don't need any additional licenses to use the drivers. However, when you use the driver, standard BigQuery pricing applies.
What's next
- Learn more about the Simba ODBC driver for BigQuery.
- Explore other BigQuery developer tools.