本页介绍了从各种源数据库到相应 BigQuery 数据类型的数据类型映射。了解将数据迁移到 BigQuery 时,不同数据类型的转换方式;BigQuery 如何表示 MongoDB 二进制 JSON 文档;以及如何以 BigQuery ARRAY 数据类型查询 PostgreSQL 数组数据。
地图数据类型
下表列出了从支持的源数据库到 BigQuery 目标的数据类型转换。
| 源数据库 | 源数据类型 | BigQuery 数据类型 |
|---|---|---|
| MySQL | BIGINT(size) |
INT64 |
| MySQL | BIGINT (unsigned) |
DECIMAL |
| MySQL | BINARY(size) |
STRING (hex encoded) |
| MySQL | BIT(size) |
INT64 |
| MySQL | BLOB(size) |
STRING (hex encoded) |
| MySQL | BOOL |
INT64 |
| MySQL | CHAR(size) |
STRING |
| MySQL | DATE |
DATE |
| MySQL | DATETIME(fsp) |
DATETIME |
| MySQL | DECIMAL(precision, scale) |
如果精度值 <=38,且小数位数值 <=9,则为 NUMERIC。否则为 BIGNUMERIC |
| MySQL | DOUBLE(size, d) |
FLOAT64 |
| MySQL | ENUM(val1, val2, val3, ...) |
STRING |
| MySQL | FLOAT(precision) |
FLOAT64 |
| MySQL | FLOAT(size, d) |
FLOAT64 |
| MySQL | INTEGER(size) |
INT64 |
| MySQL | INTEGER (unsigned) |
INT64 |
| MySQL |
|
JSON
支持 |
| MySQL | LONGBLOB |
STRING (hex encoded) |
| MySQL | LONGTEXT |
STRING |
| MySQL | MEDIUMBLOB |
STRING (hex encoded) |
| MySQL | MEDIUMINT(size) |
INT64 |
| MySQL | MEDIUMTEXT |
STRING |
| MySQL | SET(val1, val2, val3, ...) |
STRING |
| MySQL | SMALLINT(size) |
INT64 |
| MySQL | TEXT(size) |
STRING |
| MySQL | TIME(fsp) |
INTERVAL |
| MySQL | TIMESTAMP(fsp) |
TIMESTAMP |
| MySQL | TINYBLOB |
STRING (hex encoded) |
| MySQL | TINYINT(size) |
INT64 |
| MySQL | TINYTEXT |
STRING |
| MySQL | VARBINARY(size) |
STRING (hex encoded) |
| MySQL | VARCHAR |
STRING |
| MySQL | YEAR |
INT64 |
| Oracle | ANYDATA |
UNSUPPORTED |
| Oracle | BFILE |
STRING |
| Oracle | BINARY DOUBLE |
FLOAT64 |
| Oracle | BINARY FLOAT |
FLOAT64 |
| Oracle | BLOB |
BYTES |
| Oracle | CHAR |
STRING |
| Oracle | CLOB |
STRING |
| Oracle | DATE |
DATETIME
|
| Oracle | DOUBLE PRECISION |
FLOAT64 |
| Oracle | FLOAT(p) |
FLOAT64 |
| Oracle | INTERVAL DAY TO SECOND |
UNSUPPORTED |
| Oracle | INTERVAL YEAR TO MONTH |
UNSUPPORTED |
| Oracle | LONG/LONG RAW |
STRING |
| Oracle | NCHAR |
STRING |
| Oracle | NCLOB |
STRING |
| Oracle | NUMBER(precision, scale>0) |
如果 0<p=<78,则映射到参数化的小数类型。如果 p>=79,则映射到 STRING |
| Oracle | NVARCHAR2 |
STRING |
| Oracle | RAW |
STRING |
| Oracle | ROWID |
STRING |
| Oracle | SDO_GEOMETRY |
UNSUPPORTED |
| Oracle | SMALLINT |
INT64 |
| Oracle | TIMESTAMP |
TIMESTAMP
|
| Oracle | TIMESTAMP WITH TIME ZONE |
TIMESTAMP
|
| Oracle | UDT (user-defined type) |
UNSUPPORTED |
| Oracle | UROWID |
STRING |
| Oracle | VARCHAR |
STRING |
| Oracle | VARCHAR2 |
STRING |
| Oracle | XMLTYPE |
UNSUPPORTED |
| PostgreSQL | ARRAY |
JSON
|
| PostgreSQL | BIGINT |
INT64 |
| PostgreSQL | BIT |
BYTES |
| PostgreSQL | BIT_VARYING |
BYTES |
| PostgreSQL | BOOLEAN |
BOOLEAN |
| PostgreSQL | BOX |
UNSUPPORTED |
| PostgreSQL | BYTEA |
BYTES |
| PostgreSQL | CHARACTER |
STRING |
| PostgreSQL | CHARACTER_VARYING |
STRING |
| PostgreSQL | CIDR |
STRING |
| PostgreSQL | CIRCLE |
UNSUPPORTED |
| PostgreSQL | DATE |
DATE |
| PostgreSQL | DOUBLE_PRECISION |
FLOAT64 |
| PostgreSQL | ENUM |
STRING |
| PostgreSQL | INET |
STRING |
| PostgreSQL | INTEGER |
INT64 |
| PostgreSQL | INTERVAL |
INTERVAL |
| PostgreSQL | JSON |
JSON |
| PostgreSQL | JSONB |
JSON |
| PostgreSQL | LINE |
UNSUPPORTED |
| PostgreSQL | LSEG |
UNSUPPORTED |
| PostgreSQL | MACADDR |
STRING |
| PostgreSQL | MONEY |
FLOAT64 |
| PostgreSQL | NUMERIC |
如果精度 = -1,则为 STRING(BigQuery NUMERIC 类型需要固定精度)。否则为 BIGNUMERIC/NUMERIC。如需了解详情,请参阅 PostgreSQL 文档中的任意精度数字部分。 |
| PostgreSQL | OID |
INT64 |
| PostgreSQL | PATH |
UNSUPPORTED |
| PostgreSQL | POINT |
UNSUPPORTED |
| PostgreSQL | POLYGON |
UNSUPPORTED |
| PostgreSQL | REAL |
FLOAT64 |
| PostgreSQL | SMALLINT |
INT64 |
| PostgreSQL | SMALLSERIAL |
INT64 |
| PostgreSQL | SERIAL |
INT64 |
| PostgreSQL | TEXT |
STRING |
| PostgreSQL | TIME |
TIME |
| PostgreSQL | TIMESTAMP |
TIMESTAMP |
| PostgreSQL | TIMESTAMP_WITH_TIMEZONE |
TIMESTAMP |
| PostgreSQL | TIME_WITH_TIMEZONE |
TIME |
| PostgreSQL | TSQUERY |
STRING |
| PostgreSQL | TSVECTOR |
STRING |
| PostgreSQL | TXID_SNAPSHOT |
STRING |
| PostgreSQL | UUID |
STRING |
| PostgreSQL | XML |
STRING |
| SQL Server | BIGINT |
INT64 |
| SQL Server | BINARY |
BYTES |
| SQL Server | BIT |
BOOL |
| SQL Server | CHAR |
STRING |
| SQL Server | DATE |
DATE |
| SQL Server | DATETIME2 |
DATETIME |
| SQL Server | DATETIME |
DATETIME |
| SQL Server | DATETIMEOFFSET |
TIMESTAMP |
| SQL Server | DECIMAL |
BIGNUMERIC |
| SQL Server | FLOAT |
FLOAT64 |
| SQL Server | IMAGE |
BYTES |
| SQL Server | INT |
INT64 |
| SQL Server | MONEY |
BIGNUMERIC |
| SQL Server | NCHAR |
STRING |
| SQL Server | NTEXT |
STRING |
| SQL Server | NUMERIC |
BIGNUMERIC |
| SQL Server | NVARCHAR |
STRING |
| SQL Server | NVARCHAR(MAX) |
STRING |
| SQL Server | REAL |
FLOAT64 |
| SQL Server | SMALLDATETIME |
DATETIME |
| SQL Server | SMALLINT |
INT64 |
| SQL Server | SMALLMONEY |
NUMERIC |
| SQL Server | TEXT |
STRING |
| SQL Server | TIME |
TIME |
| SQL Server | TIMESTAMP/ROWVERSION |
BYTES |
| SQL Server | TINYINT |
INT64 |
| SQL Server | UNIQUEIDENTIFIER |
STRING |
| SQL Server | VARBINARY |
BYTES |
| SQL Server | VARBINARY(MAX) |
BYTES |
| SQL Server | VARCHAR |
STRING |
| SQL Server | VARCHAR(MAX) |
STRING |
| SQL Server | XML |
STRING |
| Salesforce | BOOLEAN |
BOOLEAN |
| Salesforce | BYTE |
BYTES |
| Salesforce | DATE |
DATE |
| Salesforce | DATETIME |
DATETIME |
| Salesforce | DOUBLE |
BIGNUMERIC |
| Salesforce | INT |
INT64 |
| Salesforce | STRING |
STRING |
| Salesforce | TIME |
TIME |
| Salesforce | ANYTYPE(可以是 STRING、DATE、NUMBER 或 BOOLEAN) |
STRING |
| Salesforce | COMBOBOX |
STRING |
| Salesforce | CURRENCY |
FLOAT64
允许的最大长度为 18 位数。 |
| Salesforce | DATACATEGORYGROUPREFERENCE |
STRING |
| Salesforce | EMAIL |
STRING |
| Salesforce | ENCRYPTEDSTRING |
STRING |
| Salesforce | ID |
STRING |
| Salesforce | JUNCTIONIDLIST |
STRING |
| Salesforce | MASTERRECORD |
STRING |
| Salesforce | MULTIPICKLIST |
STRING |
| Salesforce | PERCENT |
FLOAT64
允许的最大长度为 18 位数。 |
| Salesforce | PHONE |
STRING |
| Salesforce | PICKLIST |
STRING |
| Salesforce | REFERENCE |
STRING |
| Salesforce | TEXTAREA |
STRING
允许的长度上限为 255 个字符。 |
| Salesforce | URL |
STRING |
MongoDB 数据类型
MongoDB 二进制 JSON (BSON) 文档以 MongoDB Extended JSON (v1) 严格模式格式写入 BigQuery。下表显示了数据类型在 BigQuery 中的表示方式以及示例值。
| 源数据类型 | 示例值 | BigQuery JSON 类型值 |
|---|---|---|
DOUBLE |
3.1415926535
|
3.1415926535 |
STRING | "Hello, MongoDB!" | "Hello, MongoDB!" |
ARRAY |
| ["item1",123,true,{"subItem":"object in array"}] |
BINARY DATA |
new BinData(0, "SGVsbG8gQmluYXJ5IERhdGE=") |
{"$binary":"SGVsbG8gQmluYXJ5IERhdGE=","$type":"00"} |
BOOLEAN | true | true |
DATE |
2024-12-25T10:30:00.000+00:00
|
{"$date": 1735122600000}
|
NULL | null | null |
REGEX | /^mongo(db)?$/i | {"$options":"i","$regex":"^mongo(db)?$"} |
JAVASCRIPT | function() {return this.stringField.length;} | {"$code":"function() {\n return this.stringField.length;\n }"} |
DECIMAL128 | NumberDecimal("1234567890.1234567890") | {"$numberDecimal":"1234567890.1234567890"} |
OBJECTID | ObjectId('673c5d8dbfe2e51808cc2c3d') | {"$oid": "673c5d8dbfe2e51808cc2c3d"} |
LONG | 3567587327 | {"$numberLong": "3567587327"} |
INT32 | 42 | 42 |
INT64 | 1864712049423024127 | {"$numberLong": "1864712049423024127"} |
TIMESTAMP | new Timestamp(1747888877, 1) | {"$timestamp":{"i":1,"t":1747888877}} |
将 PostgreSQL 数组作为 BigQuery 数组数据类型进行查询
如果您希望将 PostgreSQL 数组作为 BigQuery ARRAY 数据类型进行查询,可以使用 BigQuery JSON_VALUE_ARRAY 函数将 JSON 值转换为 BigQuery 数组:
SELECT ARRAY(SELECT CAST(element AS TYPE) FROM UNNEST(JSON_VALUE_ARRAY(BQ_COLUMN_NAME,'$')) AS element)AS array_col
替换以下内容:
TYPE:与 PostgreSQL 源数组中的元素类型匹配的 BigQuery 类型。例如,如果源类型是
BIGINT值数组,则将 TYPE 替换为INT64。如需详细了解如何映射数据类型,请参阅映射数据类型。
BQ_COLUMN_NAME:BigQuery 表中相关列的名称。
以下两种情况例外:
对于源列中
BIT、BIT_VARYING或BYTEA值的数组,请运行以下查询:SELECT ARRAY(SELECT FROM_BASE64(element) FROM UNNEST(JSON_VALUE_ARRAY(BQ_COLUMN_NAME,'$')) AS element)
AS array_of_bytes 对于源列中
JSON或JSONB值的数组,请使用JSON_QUERY_ARRAY函数:SELECT ARRAY(SELECT element FROM UNNEST(JSON_QUERY_ARRAY(BQ_COLUMN_NAME,'$')) AS element)
AS array_of_jsons