Manage advanced migration

Advanced migration is a solution for migrating data for large-sized databases with less downtime. This feature is only available for PostgreSQL.

Migration source requirements

Configuring your migration source requires configuring both the source instance and underlying source databases.

Supported versions

The supported self-managed PostgreSQL versions are:

  • PostgreSQL: 9.4, 9.5, 9.6, 10, 11, 12, 13, 14, and 15
  • AlloyDB Omni: 15

Source instance configuration

Your source instance must include the default postgres database. If it doesn't exist, you need to create it. The pglogical library must be installed on the source and included in the shared_preload_libraries configuration variable.

Source database configuration

The migration migrates all databases under the source instance except template databases (template0 and template1 for on-premise sources). For each database that will migrate, you must fulfill these steps:

  • Install extension: You must run CREATE EXTENSION IF NOT EXISTS pglogical to install the extension into the database.
  • Primary key constraint: Data is only migrated for tables with primary keys. For tables without primary keys, the migration only copies the table schema, not the data. Additionally, for tables without primary keys, this migration only supports the initial snapshot and INSERT statements during the change data capture (CDC) phase. You must handle UPDATE and DELETE statements manually.
  • User privileges: The database user that you use to connect to the source instance must have sufficient privileges on the postgres default database, the pglogical schema, and all migration schemas (except for information schemas and schemas starting with pg_). Specifically, for PostgreSQL, you must grant the following privileges:

    GRANT USAGE on SCHEMA <SCHEMA> to <USER>;
    GRANT SELECT on ALL TABLES in SCHEMA pglogical to <USER>;
    GRANT SELECT on ALL TABLES in SCHEMA <SCHEMA> to <USER>;
    GRANT SELECT on ALL SEQUENCES in SCHEMA <SCHEMA> to <USER>;
    ALTER USER <USER> with REPLICATION role;
    

DDL support

Only Data Manipulation Language (DML) changes are automatically synced during the migration. Data Definition Language (DDL) must be managed manually to ensure source and destination databases remain compatible over the migration. We recommend using pglogical.replicate_ddl_command to run DDL on the source and destination at a consistent point.

Connectivity and authorization

You must open your network firewall and configure your pg_hba.conf file to accept incoming connections from the Destination outgoing IP address (the migration endpoint), which is provided on GDC console.

Manage migration

A user with the Project DB Admin role must perform the following steps. Use either the GDC console or the Distributed Cloud CLI to manage migrations:

Console

  1. From the main menu, choose Database Service.
  2. Click Create Migration.
  3. In the Get started dialog, review requirements for the source and connectivity.
  4. In the Specify your source database dialog, specify the source database hostname or IP address, username, password, encryption type, and certificate.
  5. In the Configure your cluster dialog, specify the Cluster ID, password, database version, CPU, memory, and storage capacity of the target database cluster. Ensure you choose enough memory to hold your largest table.
  6. Click Create. Creating the migration and target database cluster can take a few minutes. The status changes from Reconciling to Ready when the cluster is ready. The migration status changes to Unsynced when migration is set up successfully. Use the following options to manage your migration:
    1. Start: This starts the migration and changes the migration status to Running.
    2. Stop: This stops the migration and changes the migration status to Stopped.
    3. Promote: This promotes the target database cluster to a stand alone database.
    4. Delete: This deletes the migration and target database cluster created for this migration.

Periodically rotate the source database replication user password with the following steps:

  1. Go to Source database and click edit Edit.
  2. Make the changes to rotate the replication user password.
  3. Click Save to apply your changes.

After the change is applied, the migration backend uses the new password.

gdcloud

  1. Before using Distributed Cloud CLI, install and initialize it. Then, authenticate with your organization.

  2. Create a migration:

    gdcloud database connection-profiles create DB_ENGINE_TYPE SOURCE_CONNECTION_PROFILE \
        --username REPLICATION_USERNAME \
        --password REPLICATION_PASSWORD \
        --ca-certificate CA_CERT_FILE_PATH
    
    gdcloud database migrations create MIGRATION_NAME \
        --source SOURCE_CONNECTION_PROFILE \
        --destination DESTINATION_DBCLUSTER
    
    gdcloud database clusters create DESTINATION_DBCLUSTER \
        --database-version DB_VERSION \
        --admin-password ADMIN_PASSWORD
    

    Replace the following variables:

    • DB_ENGINE_TYPE, the db engine type for migration. Supported values are: postgresql.
    • SOURCE_CONNECTION_PROFILE, the name for the new connection profile.
    • REPLICATION_USERNAME, the name for the replication user of the source database.
    • REPLICATION_PASSWORD, the password for the replication user of the source database.
    • CA_CERT_FILE_PATH, the path for the source database CA certificate.
    • MIGRATION_NAME, the name for the new migration.
    • DESTINATION_DBCLUSTER, the name for the target database cluster.
    • DB_VERSION, the version string for the new cluster. For example, POSTGRESQL_13.
    • ADMIN_PASSWORD, the administrator user password for the new cluster.
  3. Start a migration:

    gdcloud database migrations start MIGRATION_NAME
    
  4. Stop a migration:

    gdcloud database migrations stop MIGRATION_NAME
    
  5. Promote a migration:

    gdcloud database migrations promote MIGRATION_NAME
    
  6. List existing connection-profiles:

    gdcloud database connection-profiles list DB_ENGINE_TYPE
    
  7. List the existing migration:

    gdcloud database migrations list --destination DESTINATION_DBCLUSTER
    

API

For PostgreSQL source databases:

Create a secret to store source database CA certificate:

apiVersion: v1
data:
  ca.crt:  SOURCE_DB_CA_CERT
kind: Secret
metadata:
  annotations:
    propagation.gdch.gke.io/target-namespace:  USER_PROJECT
  name: es-crt-EXTERNAL_SERVER_NAME
  namespace: USER_PROJECT
type: Opaque

Create a secret to store source database migration user password:

apiVersion: v1
data:
  password: SOURCE_DB_USER_PASSWORD
kind: Secret
metadata:
  annotations:
    propagation.gdch.gke.io/target-namespace: USER_PROJECT
  name: es-pw-EXTERNAL_SERVER_NAME
  namespace: USER_PROJECT
type: Opaque

Create an externalserver:

apiVersion: DBENGINE_NAME.dbadmin.gdc.goog/v1
kind: ExternalServer
metadata:
  name: EXTERNAL_SERVER_NAME
  namespace: USER_PROJECT
spec:
  host: SOURCE_DB_HOST
  port: 5432
  username: SOURCE_DB_USERNAME
  password:
    name: es-pw-EXTERNAL_SERVER_NAME
    namespace: USER_PROJECT
  certRef:
    name: es-crt-EXTERNAL_SERVER_NAME
    namespace: USER_PROJECT

Create an migration:

apiVersion: DBENGINE_NAME.dbadmin.gdc.goog/v1
kind: Migration
metadata:
  name: MIGRATION_NAME
  namespace: USER_PROJECT
spec:
  source:
    reference:
      apiVersion: DBENGINE_NAME.dbadmin.gdc.goog/v1
      kind: ExternalServer
      name: EXTERNAL_SERVER_NAME
  target:
    reference:
      apiVersion: DBENGINE_NAME.dbadmin.gdc.goog/v1
      kind: DBCluster
      name: DBCLUSTER_NAME
  control: MIGRATION_CONTROL

Replace the following variables:

  • EXTERNAL_SERVER_NAME: the name of the externalserver representing the source database.
  • USER_PROJECT: the name of the user project where the externalserver is created.
  • DBENGINE_NAME: the name of the database engine. Supported values are: postgresql.
  • SOURCE_DB_CA_CERT: the ca certificate of the source database.
  • SOURCE_DB_USER_PASSWORD: the migration user password of the source database.
  • SOURCE_DB_USERNAME: the migration username of the source database.
  • SOURCE_DB_HOST: the migration host address of the source database.
  • MIGRATION_NAME: the name of the migration operation.
  • DBCLUSTER_NAME: the name of the migration target database cluster.
  • MIGRATION_CONTROL: the controls of migration operation. It should be one of start or stop when migration is created. It should be promote to promote migration target database cluster.