This page contains information about:
- How Datastream handles data that's being pulled from a source Spanner database.
- The editions of Spanner databases that Datastream supports.
- Known limitations for using a Spanner database as a source.
Behavior
Spanner is a fully managed, scalable, and highly available database service that you can use as a source with Datastream. Datastream uses a Spanner change stream to track changes made in Spanner databases. The changes included in the change stream are then replicated to the destination to reproduce the source events.
Datastream doesn't create or modify change streams, so the database objects that aren't tracked by the provided Spanner change stream can't be included in your Datastream stream.
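Because Datastream doesn't create the change stream for you, you need to create it in Spanner yourself before you configure the stream. The following is a minimal sketch that only assembles the DDL text for such a change stream. It assumes the NEW_ROW value capture type that Datastream requires. The function name, the stream name datastream_changes, and the table names are hypothetical, and you still need to run the resulting statement against your database with your usual Spanner tooling.

```python
from typing import List, Optional


def build_change_stream_ddl(stream_name: str,
                            tables: Optional[List[str]] = None) -> str:
    """Return a CREATE CHANGE STREAM statement that uses NEW_ROW capture.

    Hypothetical helper: it only builds the DDL string and doesn't
    connect to Spanner or validate the names it's given.
    """
    # Watch the listed tables, or the whole database when none are given.
    target = ", ".join(tables) if tables else "ALL"
    return (
        f"CREATE CHANGE STREAM {stream_name} FOR {target} "
        "OPTIONS (value_capture_type = 'NEW_ROW')"
    )


# Example with placeholder names.
print(build_change_stream_ddl("datastream_changes", ["Singers", "Albums"]))
```

Only the objects named in the FOR clause are tracked, so any table left out of it can't be included in your Datastream stream.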
For more information about Spanner, see the Spanner documentation.
Versions
Datastream supports all available Spanner editions:
- Standard edition
- Enterprise edition
- Enterprise Plus edition
For an overview of each edition, see Spanner editions overview.
Known limitations
Known limitations for using a Spanner database as a source include:
- Only change streams using the NEW_ROW value capture type are supported.
- Datastream doesn't support the PROTO or ENUM data type columns.
- Datastream doesn't support arrays of DATE or TIMESTAMP data types.
- Backfills for databases of over 3 tebibytes (TiB) in size can take over 24 hours to complete.
- Backfills create snapshot epochs, a type of backup created for a specific timestamp that retains the data versions for that timestamp. Snapshot epochs delay major compactions until the backfill is complete. To learn more about compactions, see Spanner columnar engine overview.
- Datastream might have issues keeping up with Spanner change streams that have more than 10,000 partitions. This means that the change events might arrive delayed, or the stream might eventually fail.
- Datastream might have issues keeping up with Spanner change streams with over 60,000 updates per second. This means that the change events might arrive delayed, or the stream might eventually fail.
- Datastream might have issues keeping up with Spanner change streams with over 60 mebibytes (MiB) per second throughput. This means that the change events might arrive delayed, or the stream might eventually fail.
- Replicating geo-partitioned data isn't supported because Spanner change streams don't support partitioned data.
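The numeric and type limitations listed above can be collected into a simple pre-flight check. The following is a rough sketch, not a Datastream or Spanner API: the SourceProfile fields and the check_source function are hypothetical, the simplified type labels (such as ARRAY<DATE>) are an assumption about how you describe your columns, and you would need to gather the metrics yourself, for example from your monitoring system.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the limitations listed on this page.
MAX_PARTITIONS = 10_000
MAX_UPDATES_PER_SEC = 60_000
MAX_THROUGHPUT_MIB_PER_SEC = 60
LARGE_BACKFILL_TIB = 3

# Assumption: columns are described with these simplified type labels.
UNSUPPORTED_TYPES = {"PROTO", "ENUM", "ARRAY<DATE>", "ARRAY<TIMESTAMP>"}


@dataclass
class SourceProfile:
    """Hypothetical description of a Spanner source, filled in by you."""
    value_capture_type: str
    column_types: List[str]
    partitions: int
    updates_per_sec: float
    throughput_mib_per_sec: float
    database_size_tib: float


def check_source(profile: SourceProfile) -> List[str]:
    """Return one warning per documented limitation the profile runs into."""
    warnings = []
    if profile.value_capture_type != "NEW_ROW":
        warnings.append("Change stream must use the NEW_ROW value capture type.")
    for column_type in profile.column_types:
        if column_type.upper() in UNSUPPORTED_TYPES:
            warnings.append(f"Unsupported column type: {column_type}")
    if profile.partitions > MAX_PARTITIONS:
        warnings.append("More than 10,000 partitions: events might be delayed.")
    if profile.updates_per_sec > MAX_UPDATES_PER_SEC:
        warnings.append("Over 60,000 updates per second: events might be delayed.")
    if profile.throughput_mib_per_sec > MAX_THROUGHPUT_MIB_PER_SEC:
        warnings.append("Over 60 MiB per second: events might be delayed.")
    if profile.database_size_tib > LARGE_BACKFILL_TIB:
        warnings.append("Over 3 TiB: backfill can take over 24 hours.")
    return warnings


# Example with made-up numbers.
profile = SourceProfile("NEW_ROW", ["STRING", "ARRAY<DATE>"], 12_000, 5_000, 10, 1)
for warning in check_source(profile):
    print(warning)
```

An empty result only means that none of the limits encoded here were exceeded. It doesn't cover geo-partitioned data or any limitation that isn't listed on this page.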
What's next
- Learn how to configure a Spanner source for use with Datastream.