Write data to the Firestore database

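This page describes the second stage of the migration process. In this stage, you set up a Dataflow pipeline and start moving data from the Cloud Storage bucket into the destination Firestore with MongoDB compatibility database. The pipeline runs concurrently with the Datastream stream.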

Start the Dataflow pipeline

The command in this section starts a new Dataflow pipeline with a unique name, which is derived from the current timestamp.

The inputFilePattern parameter accepts the following formats:

inputFilePattern=gs://bucket-name/migration/attempt_1/
inputFilePattern=gs://bucket-name/migration/
inputFilePattern=gs://bucket-name/

The following format is not supported:

inputFilePattern=gs://bucket-name/**/*
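
The format rules above can be checked before the pipeline is launched. The following is a minimal sketch, not part of the template: is_supported_input_pattern is a hypothetical helper, and it assumes, based only on the examples above, that a supported value is a gs:// directory prefix that ends in a slash and contains no wildcard. The check is deliberately conservative and may reject values that the template would accept.

```shell
# Hypothetical helper (not part of gcloud or the Dataflow template).
# Returns 0 if the value looks like one of the accepted forms shown above,
# and 1 otherwise.
is_supported_input_pattern() {
  case "$1" in
    *'*'*) return 1 ;;     # wildcard patterns such as gs://bucket-name/**/*
    gs://?*/) return 0 ;;  # gs://bucket-name/ or gs://bucket-name/some/path/
    *) return 1 ;;         # anything else, including a missing trailing slash
  esac
}
```

For example, running is_supported_input_pattern "$INPUT_FILE_LOCATION" || echo "Unexpected inputFilePattern format" >&2 prints a warning before the job is submitted.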

DATAFLOW_START_TIME="$(date +'%Y%m%d%H%M%S')"

gcloud dataflow flex-template run "dataflow-mongodb-to-firestore-$DATAFLOW_START_TIME" \
--template-file-gcs-location gs://dataflow-templates-us-central1/latest/flex/Cloud_Datastream_MongoDB_to_Firestore \
--region $LOCATION \
--num-workers $NUM_WORKERS \
--temp-location $TEMP_OUTPUT_LOCATION \
--additional-user-labels "" \
--parameters inputFilePattern=$INPUT_FILE_LOCATION,\
inputFileFormat=avro,\
fileReadConcurrency=10,\
connectionUri=$FIRESTORE_CONNECTION_URI,\
databaseName=$FIRESTORE_DATABASE_NAME,\
shadowCollectionPrefix=shadow_,\
batchSize=500,\
deadLetterQueueDirectory=$DLQ_LOCATION,\
dlqRetryMinutes=10,\
dlqMaxRetryCount=500,\
processBackfillFirst=false,\
useShadowTablesForBackfill=true,\
runMode=regular,\
directoryWatchDurationInMinutes=20,\
streamName=$DATASTREAM_NAME,\
stagingLocation=$STAGING_LOCATION,\
autoscalingAlgorithm=THROUGHPUT_BASED,\
maxNumWorkers=$MAX_WORKERS,\
workerMachineType=$WORKER_TYPE
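
The command above reads several shell variables that this page does not define, so it presumably relies on values set in an earlier stage of the migration. An unset variable expands to an empty string and can produce a confusing launch error. The following is a minimal pre-flight sketch under that assumption: require_vars is a hypothetical helper, not part of gcloud, and it checks only that each variable is non-empty, not that its value is valid.

```shell
# Hypothetical pre-flight helper (not part of gcloud).
# Prints the name of every listed variable that is unset or empty and
# returns a non-zero status if at least one is missing.
# Note: plain POSIX sh has no local variables, so name, value, and missing
# are set in the calling shell.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: copy the value of the variable named by $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call, listing the variables used by the command above:
# require_vars LOCATION NUM_WORKERS TEMP_OUTPUT_LOCATION INPUT_FILE_LOCATION \
#   FIRESTORE_CONNECTION_URI FIRESTORE_DATABASE_NAME DLQ_LOCATION \
#   DATASTREAM_NAME STAGING_LOCATION MAX_WORKERS WORKER_TYPE || exit 1
```

Running the check before the gcloud command lets the script stop early instead of submitting a job with empty parameters.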

For details on how to monitor the Dataflow pipeline, see Troubleshooting.

What's next

Continue to Migrate traffic to Firestore.