Mengimpor glosarium bisnis dan link entri menggunakan file JSON

Halaman ini menjelaskan cara mengimpor glosarium, kategori, istilah, dan link entri secara massal ke Knowledge Catalog (sebelumnya Dataplex Universal Catalog) menggunakan Dataplex API. Anda dapat menggunakan proses ini untuk memigrasikan metadata dari alat katalog lainnya atau untuk melakukan update massal pada glosarium yang ada dengan mengupload file JSON ke Cloud Storage.

Sebelum memulai

Sebelum memulai proses impor, selesaikan prasyarat berikut:

Membuat bucket Cloud Storage

Buat bucket Cloud Storage untuk berfungsi sebagai area staging bagi file impor.

Peran yang diperlukan

Untuk mendapatkan izin yang Anda perlukan untuk mengimpor glosarium dan link entri menggunakan file JSON, minta administrator untuk memberi Anda peran IAM berikut:

Untuk mengetahui informasi selengkapnya tentang pemberian peran, lihat Mengelola akses ke project, folder, dan organisasi.

Anda mungkin juga bisa mendapatkan izin yang diperlukan melalui peran khusus atau peran bawaan lainnya.

Mengaktifkan API

Untuk mengimpor glosarium dan link entri, aktifkan Dataplex API di project Anda.

Peran yang diperlukan untuk mengaktifkan API

Untuk mengaktifkan API, Anda memerlukan peran IAM Service Usage Admin (roles/serviceusage.serviceUsageAdmin), yang berisi izin serviceusage.services.enable. Pelajari cara memberikan peran.

Aktifkan API

Mengimpor glosarium menggunakan file JSON

Untuk mengimpor glosarium, kategori, dan istilah, selesaikan tugas berikut.

Membuat glosarium target

Buat glosarium target untuk mengimpor metadata.

alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'

gcurl -X POST https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/glossaries?glossary_id=GLOSSARY_ID -d "$(cat<<EOF

{
   "displayName": "DISPLAY_NAME",
   "description": "DESCRIPTION"
}
EOF
)"

Ganti kode berikut:

  • PROJECT_ID: project ID tempat Anda membuat glosarium
  • LOCATION_ID: lokasi tempat Anda ingin membuat glosarium
  • GLOSSARY_ID: ID glosarium
  • DISPLAY_NAME: nama tampilan glosarium
  • DESCRIPTION: deskripsi glosarium

Menyiapkan file JSON

Buat file JSON yang dibatasi baris baru dengan glosarium, kategori, dan istilah yang ingin Anda upload ke bucket Cloud Storage.

Gunakan skema JSON berikut untuk menyusun file impor Anda:

   {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-category","aspects":{"dataplex-types.global.glossary-category-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","displayName":"CATEGORY_NAME","description":"CATEGORY_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"}]}}}
   {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM1_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-term","aspects":{"dataplex-types.global.glossary-term-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"TERM1_CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM1_ID","displayName":"TERM1_DISPLAY_NAME","description":"TERM1_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"},{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary-category"}]}}}
   {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM2_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-term","aspects":{"dataplex-types.global.glossary-term-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"TERM1_CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM2_ID","displayName":"TERM2_DISPLAY_NAME","description":"TERM2_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"},{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary-category"}]}}}

Mengimpor glosarium, kategori, dan istilah

Impor glosarium, kategori, dan istilah ke dalam bucket Cloud Storage:

# Set GCURL alias
alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'

gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF
{
   "type":"IMPORT",
   "import_spec":{
      "log_level":"DEBUG",
      "source_storage_uri":"gs://STORAGE_BUCKET/",
      "entry_sync_mode":"FULL",
      "aspect_sync_mode":"INCREMENTAL",
      "scope":{
         "glossaries": "GLOSSARY_NAME"
      }
   }
}
EOF
)"

Ganti DATAPLEX_API dengan endpoint Dataplex API dalam format dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID.

Untuk mengimpor link entri secara massal, selesaikan tugas berikut.

Buat file JSON yang dipisahkan newline dengan link entri yang ingin Anda upload ke bucket Cloud Storage.

Gunakan skema JSON berikut untuk menyusun file impor Anda:

Format contoh untuk link antar-istilah

   {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-0606e3f2-8206-4f3a-aba9-32c6196f6048","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/synonym","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-2"}]}}
   {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-2f7408e3-af3d-405d-81bb-861cf9ec5146","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/related","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-2"}]}}
   

Format contoh untuk link antara istilah dan aset data

   {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-0606e3f2-8206-4f3a-aba9-32c6196f6048","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/definition","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/us-central1/entryGroups/entry-group-1/entries/entry-1"}]}}
   

Untuk memulai tugas impor metadata yang membuat koneksi synonym atau related antar-istilah katalog Anda, gunakan perintah berikut:

gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF
   {
      "type":"IMPORT",
      "import_spec":{
         "log_level":"DEBUG",
         "source_storage_uri":"gs://STORAGE_BUCKET/",
         "entry_sync_mode":"FULL",
         "aspect_sync_mode":"INCREMENTAL",
         "scope":{
            "entry_groups":[  "projects/PROJECT_ID/locations/LOCATION_ID/entryGroups/@dataplex"
            ],
            "entry_link_types":[
               "projects/dataplex-types/locations/global/entryLinkTypes/synonym",
               "projects/dataplex-types/locations/global/entryLinkTypes/related"
            ],
            "referenced_entry_scopes":[
               "PROJECT_IDS"
            ]
         }
      }
   }
EOF
)"

Ganti DATAPLEX_API dengan dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID.

Untuk membuat link antara istilah glosarium dan aset data, jalankan tugas impor untuk setiap grup entri yang berisi entri untuk aset data. Semua link entri definisi dibuat dalam grup entri ini.

   gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF
   {
   "type":"IMPORT",
   "import_spec":{
      "log_level":"DEBUG",
      "source_storage_uri":"gs://STORAGE_BUCKET/",
      "entry_sync_mode":"FULL",
      "aspect_sync_mode":"INCREMENTAL",
      "scope":{
         "entry_groups":[  "projects/PROJECT_ID/locations/ENTRY_GROUP_LOCATION_ID/entryGroups/@dataplex"
         ],
         "entry_link_types":[
            "projects/dataplex-types/locations/global/entryLinkTypes/definition"
         ],
         "referenced_entry_scopes":[
            "PROJECT_IDS"
         ]
      }
   }
   }
EOF
)"

Ganti DATAPLEX_API dengan dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID.

Memantau status tugas impor

  1. Untuk mendapatkan status operasi import, jalankan perintah berikut:

    gcurl https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/operations/operation-OPERATION_ID

    Ganti OPERATION_ID dengan ID operasi.

  2. Untuk mendapatkan status tugas metadata, jalankan perintah berikut:

    gcurl -X GET https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/metadataJobs/JOB_ID

Langkah berikutnya