访问历史数据

借助 BigQuery,您可以查询并恢复在时间旅行窗口内更改或删除的 BigQuery 中存储的数据。

查询某个时间点的数据

您可以使用 FOR SYSTEM_TIME AS OF 子句来查询表内任何时间点的历史数据。此子句接受常量时间戳表达式,并引用该时间戳上的最新表版本。该表必须存储在 BigQuery 中;它不能是外部表。 使用 SYSTEM_TIME AS OF 时,表没有大小限制。

例如,以下查询返回表在过去一个小时内的历史版本:

SELECT *
FROM `mydataset.mytable`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);

如果时间戳指定的时间早于时间旅行窗口或早于创建表的时间,则查询会失败并返回如下错误:

Invalid snapshot time 1601168925462 for table
myproject:mydataset.table1@1601168925462. Cannot read before 1601573410026.

使用 CREATE OR REPLACE TABLE 语句替换现有表后,您可以使用 FOR SYSTEM_TIME AS OF 查询该表的先前版本。

如果表已删除,则查询会失败并返回如下错误:

Not found: Table myproject:mydataset.table was not found in location LOCATION

从某个时间点恢复表

您可以将历史数据复制到新表中,以从历史数据中恢复表。即使表已删除或过期,只要您在时间旅行时间段内恢复表,复制历史数据也仍然有效。

当您从历史数据恢复表时,源表中的标记不会复制到目标表中。 表分区信息也不会复制到目标表。如需重新创建原始表的分区方案,您可以在 Cloud Logging 中查看初始表创建请求,并使用该信息对恢复的表进行分区。

要恢复已删除但仍在时间旅行窗口内的表,您可以使用 @<time> 时间修饰符将该表复制到一个新表。即使使用时间修饰器,您也无法查询已删除的表。您必须先恢复该表。

将以下语法与 @<time> 时间修饰器搭配使用:

  • tableid@TIME,其中 TIME 是从 Unix 纪元开始计算的毫秒数。
  • tableid@-TIME_OFFSET,其中 TIME_OFFSET 是相对于当前时间的偏移量(以毫秒为单位)。
  • tableid@0:指定可用的最旧历史数据。

如需恢复表,请选择以下选项之一:

控制台

您无法使用 Google Cloud 控制台恢复删除表。

bq

  1. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. 如需恢复表,请先确定表存在时长的 UNIX 时间戳(以毫秒为单位)。您可以使用 Linux date 命令通过常规时间戳值生成 Unix 时间戳:

    date -d '2023-08-04 16:00:34.456789Z' +%s000
    
  3. 然后,将 bq copy 命令与 @<time> 时间旅行修饰器结合使用来执行表复制操作。

    例如,输入以下命令可将时间为 1418864998000mydataset.mytable 表复制到新表 mydataset.newtable

    bq cp mydataset.mytable@1418864998000 mydataset.newtable
    

    (可选)提供 --location 标志并将其值设置为您的位置

    您还可以指定相对偏移量。以下示例复制一小时前表的版本:

    bq cp mydataset.mytable@-3600000 mydataset.newtable
    
  4. Go

    试用此示例之前,请按照 BigQuery 快速入门:使用客户端库中的 Go 设置说明进行操作。 如需了解详情,请参阅 BigQuery Go API 参考文档

    如需向 BigQuery 进行身份验证,请设置应用默认凭证。如需了解详情,请参阅为客户端库设置身份验证

    import (
    	"context"
    	"fmt"
    	"time"
    
    	"cloud.google.com/go/bigquery"
    )
    
    // deleteAndUndeleteTable demonstrates how to recover a deleted table by copying it from a point in time
    // that predates the deletion event.
    func deleteAndUndeleteTable(projectID, datasetID, tableID string) error {
    	// projectID := "my-project-id"
    	// datasetID := "mydataset"
    	// tableID := "mytable"
    	ctx := context.Background()
    	client, err := bigquery.NewClient(ctx, projectID)
    	if err != nil {
    		return fmt.Errorf("bigquery.NewClient: %v", err)
    	}
    	defer client.Close()
    
    	ds := client.Dataset(datasetID)
    	if _, err := ds.Table(tableID).Metadata(ctx); err != nil {
    		return err
    	}
    	// Record the current time.  We'll use this as the snapshot time
    	// for recovering the table.
    	snapTime := time.Now()
    
    	// "Accidentally" delete the table.
    	if err := client.Dataset(datasetID).Table(tableID).Delete(ctx); err != nil {
    		return err
    	}
    
    	// Construct the restore-from tableID using a snapshot decorator.
    	snapshotTableID := fmt.Sprintf("%s@%d", tableID, snapTime.UnixNano()/1e6)
    	// Choose a new table ID for the recovered table data.
    	recoverTableID := fmt.Sprintf(&quot;%s_recovered", tableID)
    
    	// Construct and run a copy job.
    	copier := ds.Table(recoverTableID).CopierFrom(ds.Table(snapshotTableID))
    	copier.WriteDisposition = bigquery.WriteTruncate
    	job, err := copier.Run(ctx)
    	if err != nil {
    		return err
    	}
    	status, err := job.Wait(ctx)
    	if err != nil {
    		return err
    	}
    	if err := status.Err(); err != nil {
    		return err
    	}
    
    	ds.Table(recoverTableID).Delete(ctx)
    	return nil
    }
    

    Java

    试用此示例之前,请按照 BigQuery 快速入门:使用客户端库中的 Java 设置说明进行操作。 如需了解详情,请参阅 BigQuery Java API 参考文档

    如需向 BigQuery 进行身份验证,请设置应用默认凭证。如需了解详情,请参阅为客户端库设置身份验证

    import com.google.cloud.bigquery.BigQuery;
    import com.google.cloud.bigquery.BigQueryException;
    import com.google.cloud.bigquery.BigQueryOptions;
    import com.google.cloud.bigquery.CopyJobConfiguration;
    import com.google.cloud.bigquery.Job;
    import com.google.cloud.bigquery.JobInfo;
    import com.google.cloud.bigquery.TableId;
    
    // Sample to undeleting a table
    public class UndeleteTable {
    
      public static void runUndeleteTable() {
        // TODO(developer): Replace these variables before running the sample.
        String datasetName = "MY_DATASET_NAME";
        String tableName = "MY_TABLE_TABLE";
        String recoverTableName = "MY_RECOVER_TABLE_TABLE";
        undeleteTable(datasetName, tableName, recoverTableName);
      }
    
      public static void undeleteTable(String datasetName, String tableName, String recoverTableName) {
        try {
          // Initialize client that will be used to send requests. This client only needs to be created
          // once, and can be reused for multiple requests.
          BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    
          // "Accidentally" delete the table.
          bigquery.delete(TableId.of(datasetName, tableName));
    
          // Record the current time.  We'll use this as the snapshot time
          // for recovering the table.
          long snapTime = System.currentTimeMillis();
    
          // Construct the restore-from tableID using a snapshot decorator.
          String snapshotTableId = String.format("%s@%d", tableName, snapTime);
    
          // Construct and run a copy job.
          CopyJobConfiguration configuration =
              CopyJobConfiguration.newBuilder(
                      // Choose a new table ID for the recovered table data.
                      TableId.of(datasetName, recoverTableName),
                      TableId.of(datasetName, snapshotTableId))
                  .build();
    
          Job job = bigquery.create(JobInfo.of(configuration&&));
          job = job.waitFor();
          if (job.isDone()  job.getStatus().getError() == null) {
            System.out.println("Undelete table recovered successfully.");
          } else {
            System.out.println(
                "BigQuery was unable to copy the table due to an error: \n"
                    + job.getStatus().getError());
            return;
          }
        } catch (BigQueryException | InterruptedException e) {
          System.out.println("Table not found. \n" + e.toString());
        }
      }
    }

    Node.js

    试用此示例之前,请按照 BigQuery 快速入门:使用客户端库中的 Node.js 设置说明进行操作。 如需了解详情,请参阅 BigQuery Node.js API 参考文档

    如需向 BigQuery 进行身份验证,请设置应用默认凭证。如需了解详情,请参阅为客户端库设置身份验证

    // Import the Google Cloud client library
    const {BigQuery} = require(&#39;@google-cloud/bigquery');
    const bigquery = new BigQuery();
    
    async function undeleteTable() {
      // Undeletes "my_table_to_undelete" from "my_dataset".
    
      /**
       * TODO(developer): Uncomment the following lines before running the sample.
       */
      // const datasetId = "my_dataset";
      // const tableId = "my_table_to_undelete";
      // const recoveredTableId = "my_recovered_table";
    
      /**
       * TODO(developer): Choose an appropriate snapshot point as epoch milliseconds.
       * For this example, we choose the current time as we're about to delete the
       * table immediately afterwards.
       */
      const snapshotEpoch = Date.now();
    
      // Delete the table
      await bigquery
        .dataset(datasetId)
        .table(tableId)
        .delete();
    
      console.log(`Table ${tableId} deleted.`);
    
      // Construct the restore-from table ID using a snapshot decorator.
      const snapshotTableId = `${tableId}@${snapshotEpoch}`;
    
      // Construct and run a copy job.
      await bigquery
        .dataset(datasetId)
        .table(snapshotTableId)
        .copy(bigquery.dataset(datasetId).table(recoveredTableId));
    
      console.log(
        `Copied data from deleted table ${tableId} to ${recoveredTableId}`
      );
    }

    Python

    试用此示例之前,请按照 BigQuery 快速入门:使用客户端库中的 Python 设置说明进行操作。 如需了解详情,请参阅 BigQuery Python API 参考文档

    如需向 BigQuery 进行身份验证,请设置应用默认凭证。如需了解详情,请参阅为客户端库设置身份验证

    import time
    
    from google.cloud import bigquery
    
    # Construct a BigQuery client object.
    client = bigquery.Client()
    
    # TODO(developer): Choose a table to recover.
    # table_id = "your-project.your_dataset.your_table"
    
    # TODO(developer): Choose a new table ID for the recovered table data.
    # recovered_table_id = "your-project.your_dataset.your_table_recovered"
    
    # TODO(developer): Choose an appropriate snapshot point as epoch
    # milliseconds. For this example, we choose the current time as we're about
    # to delete the table immediately afterwards.
    snapshot_epoch = int(time.time() * 1000)
    
    # ...
    
    # "Accidentally" delete the table.
    client.delete_table(table_id)  # Make an API request.
    
    # Construct the restore-from table ID using a snapshot decorator.
    snapshot_table_id = "{}@{}".format(table_id, snapshot_epoch)
    
    # Construct and run a copy job.
    job = client.copy_table(
        snapshot_table_id,
        recovered_table_id,
        # Must match the source and destination tables location.
        location="US",
    )  # Make an API request.
    
    job.result()  # Wait for the job to complete.
    
    print(
        "Copied data from deleted table {} to {}".format(table_id, recovered_table_id)
    )

如果您预计可能需要在时间旅行窗口允许的时间后恢复表,请创建该表的表快照。 如需了解详情,请参阅表快照简介

您无法直接恢复逻辑视图。如需了解详情,请参阅恢复视图

后续步骤