Spanner as a source

This page contains information about:

How Datastream handles data that's being pulled from a source Spanner database.
The editions of Spanner databases that Datastream supports.
Known limitations for using a Spanner database as a source.

Behavior

Spanner is a fully managed, scalable, and highly available database service that you can use as a source with Datastream. Datastream uses a Spanner change stream to track changes made in Spanner databases. The changes included in the change stream are then replicated to the destination to reproduce the source events.

Datastream doesn't create or modify change streams, so the database objects that aren't tracked by the provided Spanner change stream can't be included in your Datastream stream.
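Because the change stream has to exist before Datastream can read from it, you create it yourself with Spanner DDL. The following is a minimal sketch, not an official procedure: the helper `build_change_stream_ddl`, the stream name, and the table names are hypothetical, and the identifier check is a simplification rather than Spanner's full naming rules. It builds a `CREATE CHANGE STREAM` statement that uses the `NEW_ROW` value capture type listed under Known limitations.

```python
import re

# Simplified identifier check for this sketch only; not Spanner's full naming rules.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_change_stream_ddl(stream_name, tables=None):
    """Return a CREATE CHANGE STREAM statement that uses the NEW_ROW capture type.

    tables=None watches the whole database (FOR ALL); otherwise only the
    listed tables are watched.
    """
    for name in [stream_name] + list(tables or []):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    watched = "ALL" if not tables else ", ".join(tables)
    return (
        f"CREATE CHANGE STREAM {stream_name} FOR {watched} "
        "OPTIONS (value_capture_type = 'NEW_ROW')"
    )


# Hypothetical stream and table names, for illustration only.
print(build_change_stream_ddl("orders_stream", ["Orders", "Customers"]))
```

Tables that the statement leaves out aren't tracked by the change stream, so they can't be included in the Datastream stream.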

For more information about Spanner, see the Spanner documentation.

Versions

Datastream supports all available Spanner editions:

Standard edition
Enterprise edition
Enterprise Plus edition

For an overview of each edition, see Spanner editions overview.

Known limitations

Known limitations for using a Spanner database as a source include:

Only change streams using the NEW_ROW value capture type are supported.
Datastream doesn't support the PROTO or ENUM data type columns.
Datastream doesn't support arrays of DATE or TIMESTAMP data types.
Backfills for databases larger than 3 tebibytes (TiB) can take over 24 hours to complete.
Backfills create snapshot epochs, a type of backup created for a specific timestamp that retains the data versions for that timestamp. Snapshot epochs delay major compactions until the backfill is complete. To learn more about compactions, see Spanner columnar engine overview.
Datastream might have issues keeping up with Spanner change streams that have more than 10,000 partitions. Change events might arrive late, or the stream might eventually fail.
Datastream might have issues keeping up with Spanner change streams that have more than 60,000 updates per second. Change events might arrive late, or the stream might eventually fail.
Datastream might have issues keeping up with Spanner change streams that have more than 60 mebibytes (MiB) per second of throughput. Change events might arrive late, or the stream might eventually fail.
Replicating geo-partitioned data isn't supported because Spanner change streams don't support partitioned data.
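
The numeric limits above can be checked before you create a stream. The following is a rough sketch, assuming you have already measured your database size and change stream load by some other means. The `SourceProfile` class and `limitation_warnings` function are hypothetical names and aren't part of Datastream or Spanner. The thresholds are the ones listed above.

```python
from dataclasses import dataclass

MIB = 1024 ** 2
TIB = 1024 ** 4


@dataclass
class SourceProfile:
    """Hypothetical summary of a Spanner source, filled in from your own measurements."""
    database_bytes: int
    partitions: int
    updates_per_second: float
    bytes_per_second: float
    value_capture_type: str


def limitation_warnings(profile):
    """Return one warning for each documented limitation that the profile exceeds."""
    warnings = []
    if profile.value_capture_type != "NEW_ROW":
        warnings.append("value capture type must be NEW_ROW")
    if profile.database_bytes > 3 * TIB:
        warnings.append("backfill can take over 24 hours")
    if profile.partitions > 10_000:
        warnings.append("more than 10,000 change stream partitions")
    if profile.updates_per_second > 60_000:
        warnings.append("more than 60,000 updates per second")
    if profile.bytes_per_second > 60 * MIB:
        warnings.append("more than 60 MiB per second of throughput")
    return warnings


# Illustrative numbers only, not measurements from a real database.
profile = SourceProfile(
    database_bytes=4 * TIB,
    partitions=12_000,
    updates_per_second=1_000,
    bytes_per_second=10 * MIB,
    value_capture_type="NEW_ROW",
)
for warning in limitation_warnings(profile):
    print(warning)
```

An empty list means only that none of these specific limits is exceeded. It doesn't cover the unsupported data types or geo-partitioned data.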

What's next

Learn how to configure a Spanner source for use with Datastream.
