JSON 파일을 사용하여 비즈니스 용어집과 항목 링크 가져오기

이 페이지에서는 Dataplex API를 사용하여 용어집, 카테고리, 용어, 항목 링크를 Knowledge Catalog (이전의 Dataplex Universal Catalog)로 일괄 가져오는 방법을 설명합니다. 이 프로세스를 사용하여 다른 카탈로그 도구에서 메타데이터를 마이그레이션하거나 Cloud Storage에 JSON 파일을 업로드하여 기존 용어집을 일괄 업데이트할 수 있습니다.

시작하기 전에

가져오기 프로세스를 시작하기 전에 다음의 사전 요구사항을 완료하세요.

Cloud Storage 버킷 만들기

가져오기 파일의 스테이징 영역 역할을 할 Cloud Storage 버킷을 만듭니다.

필요한 역할

JSON 파일을 사용하여 용어집과 항목 링크를 가져오는 데 필요한 권한을 얻으려면 관리자에게 다음의 IAM 역할을 부여해 달라고 요청하세요.

역할 부여에 대한 자세한 내용은 프로젝트, 폴더, 조직에 대한 액세스 관리를 참조하세요.

커스텀 역할이나 다른 사전 정의된 역할을 통해 필요한 권한을 얻을 수도 있습니다.

API 사용 설정

용어집과 항목 링크를 가져오려면 프로젝트에서 Dataplex API를 사용 설정하세요.

API 사용 설정에 필요한 역할

API를 사용 설정하려면 serviceusage.services.enable 권한이 포함된 서비스 사용량 관리자 IAM 역할(roles/serviceusage.serviceUsageAdmin)이 필요합니다. 역할 부여 방법 알아보기.

API 사용 설정

JSON 파일을 사용하여 용어집 가져오기

용어집, 카테고리, 용어를 가져오려면 다음 작업을 완료하세요.

대상 용어집 만들기

메타데이터를 가져올 대상 용어집을 만듭니다.

alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'

gcurl -X POST https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/glossaries?glossary_id=GLOSSARY_ID -d "$(cat<<EOF

{
   "displayName": "DISPLAY_NAME",
   "description": "DESCRIPTION"
}
EOF
)"

다음을 바꿉니다.

  • PROJECT_ID: 용어집을 만들 프로젝트의 ID
  • LOCATION_ID: 용어집을 만들 위치
  • GLOSSARY_ID: 용어집 ID
  • DISPLAY_NAME: 용어집 표시 이름
  • DESCRIPTION: 용어집 설명

JSON 파일 준비

Cloud Storage 버킷에 업로드할 용어집, 카테고리, 용어가 포함된 줄바꿈으로 구분된 JSON 파일을 만듭니다.

다음 JSON 스키마를 사용하여 가져오기 파일을 구성합니다.

   {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-category","aspects":{"dataplex-types.global.glossary-category-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","displayName":"CATEGORY_NAME","description":"CATEGORY_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"}]}}}
   {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM1_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-term","aspects":{"dataplex-types.global.glossary-term-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"TERM1_CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM1_ID","displayName":"TERM1_DISPLAY_NAME","description":"TERM1_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"},{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary-category"}]}}}
   {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM2_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-term","aspects":{"dataplex-types.global.glossary-term-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"TERM1_CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM2_ID","displayName":"TERM2_DISPLAY_NAME","description":"TERM2_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"},{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary-category"}]}}}

용어집, 카테고리, 용어 가져오기

용어집, 카테고리, 용어를 Cloud Storage 버킷으로 가져옵니다.

# Set GCURL alias
alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'

gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF
{
   "type":"IMPORT",
   "import_spec":{
      "log_level":"DEBUG",
      "source_storage_uri":"gs://STORAGE_BUCKET/",
      "entry_sync_mode":"FULL",
      "aspect_sync_mode":"INCREMENTAL",
      "scope":{
         "glossaries": "GLOSSARY_NAME"
      }
   }
}
EOF
)"

DATAPLEX_API를 Dataplex API 엔드포인트 형식 dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID으로 바꿉니다.

항목 링크를 일괄 가져오려면 다음 작업을 완료하세요.

Cloud Storage 버킷에 업로드할 항목 링크가 포함된 줄바꿈으로 구분된 JSON 파일을 만듭니다.

다음 JSON 스키마를 사용하여 가져오기 파일을 구성합니다.

용어 간 링크 샘플 형식

   {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-0606e3f2-8206-4f3a-aba9-32c6196f6048","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/synonym","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-2"}]}}
   {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-2f7408e3-af3d-405d-81bb-861cf9ec5146","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/related","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-2"}]}}
   

용어와 데이터 애셋 간의 링크 샘플 형식

   {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-0606e3f2-8206-4f3a-aba9-32c6196f6048","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/definition","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/us-central1/entryGroups/entry-group-1/entries/entry-1"}]}}
   

카탈로그 용어 간에 synonym 또는 related 연결을 설정하는 메타데이터 가져오기 작업을 시작하려면 다음 명령어를 사용하세요.

gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF
   {
      "type":"IMPORT",
      "import_spec":{
         "log_level":"DEBUG",
         "source_storage_uri":"gs://STORAGE_BUCKET/",
         "entry_sync_mode":"FULL",
         "aspect_sync_mode":"INCREMENTAL",
         "scope":{
            "entry_groups":[  "projects/PROJECT_ID/locations/LOCATION_ID/entryGroups/@dataplex"
            ],
            "entry_link_types":[
               "projects/dataplex-types/locations/global/entryLinkTypes/synonym",
               "projects/dataplex-types/locations/global/entryLinkTypes/related"
            ],
            "referenced_entry_scopes":[
               "PROJECT_IDS"
            ]
         }
      }
   }
EOF
)"

DATAPLEX_APIdataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID로 바꿉니다.

용어집 용어와 데이터 애셋 간의 링크를 만들려면 데이터 애셋 항목이 속한 항목 그룹마다 가져오기 작업을 실행합니다. 모든 정의 항목 링크는 이 항목 그룹에서 생성됩니다.

   gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF
   {
   "type":"IMPORT",
   "import_spec":{
      "log_level":"DEBUG",
      "source_storage_uri":"gs://STORAGE_BUCKET/",
      "entry_sync_mode":"FULL",
      "aspect_sync_mode":"INCREMENTAL",
      "scope":{
         "entry_groups":[  "projects/PROJECT_ID/locations/ENTRY_GROUP_LOCATION_ID/entryGroups/@dataplex"
         ],
         "entry_link_types":[
            "projects/dataplex-types/locations/global/entryLinkTypes/definition"
         ],
         "referenced_entry_scopes":[
            "PROJECT_IDS"
         ]
      }
   }
   }
EOF
)"

DATAPLEX_APIdataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID로 바꿉니다.

가져오기 작업 상태 모니터링

  1. import 작업의 상태를 가져오려면 다음 명령어를 실행하세요.

    gcurl https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/operations/operation-OPERATION_ID

    OPERATION_ID를 작업의 ID로 바꿉니다.

  2. 메타데이터 작업의 상태를 가져오려면 다음 명령어를 실행하세요.

    gcurl -X GET https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/metadataJobs/JOB_ID

다음 단계