This page describes how to bulk import glossaries, categories, terms, and entry links into Knowledge Catalog (formerly Dataplex Universal Catalog) using the Dataplex API. You can use this process to migrate metadata from other cataloging tools or to perform bulk updates to your existing glossaries by uploading JSON files to Cloud Storage.
Before you begin
Before you start the import process, complete the following prerequisites:
Create a Cloud Storage bucket
Create a Cloud Storage bucket to serve as a staging area for import files.
Required roles
To get the permissions that you need to import glossaries and entry links using JSON files, ask your administrator to grant you the following IAM roles:
- Dataplex Metadata Editor (
roles/dataplex.metadataEditor) on the project - Storage Object Viewer (
roles/storage.objectViewer) on the bucket containing your import files
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Enable APIs
To import glossaries and entry links, enable the Dataplex API in your project.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM
role (roles/serviceusage.serviceUsageAdmin), which
contains the serviceusage.services.enable permission. Learn how to grant
roles.
Import glossaries using JSON files
To import glossaries, categories, and terms, complete the following tasks.
Create a target glossary
Create a target glossary to import metadata.
alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"' gcurl -X POST https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/glossaries?glossary_id=GLOSSARY_ID -d "$(cat<<EOF { "displayName": "DISPLAY_NAME", "description": "DESCRIPTION" } EOF )"
Replace the following:
PROJECT_ID: the project ID in which you're creating the glossaryLOCATION_ID: the location in which you want to create the glossaryGLOSSARY_ID: the glossary IDDISPLAY_NAME: the display name of the glossaryDESCRIPTION: the description of the glossary
Prepare JSON files
Create a newline-delimited JSON file with glossary, categories, and terms that you want to upload to the Cloud Storage bucket.
Use the following JSON schema to structure your import files:
{"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-category","aspects":{"dataplex-types.global.glossary-category-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","displayName":"CATEGORY_NAME","description":"CATEGORY_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"}]}}} {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM1_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-term","aspects":{"dataplex-types.global.glossary-term-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"TERM1_CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM1_ID","displayName":"TERM1_DISPLAY_NAME","description":"TERM1_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"},{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary-category"}]}}} {"entry":{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM2_ID","entryType":"projects/dataplex-types/locations/global/entryTypes/glossary-term","aspects":{"dataplex-types.global.glossary-term-aspect":{"data":{}},"dataplex-types.global.overview":{"data":{"content":"TERM1_CONTENT"}},"dataplex-types.global.contacts":{"data":{"identities":[{role: "steward", name: "CONTACT_DISPLAY_NAME", id: "CONTACT_EMAIL"}]}}},"parentEntry":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","entrySource":{"resource":"projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/terms/TERM2_ID","displayName":"TERM2_DISPLAY_NAME","description":"TERM2_DESCRIPTION","ancestors":[{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary"},{"name":"projects/PROJECT_NUMBER/locations/LOCATION_ID/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/LOCATION_ID/glossaries/GLOSSARY_ID/categories/CATEGORY_ID","type":"projects/dataplex-types/locations/global/entryTypes/glossary-category"}]}}}
Import glossary, categories, and terms
Import the glossary, categories, and terms into the Cloud Storage bucket:
# Set GCURL alias alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"' gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF { "type":"IMPORT", "import_spec":{ "log_level":"DEBUG", "source_storage_uri":"gs://STORAGE_BUCKET/", "entry_sync_mode":"FULL", "aspect_sync_mode":"INCREMENTAL", "scope":{ "glossaries": "GLOSSARY_NAME" } } } EOF )"
Replace DATAPLEX_API with the Dataplex API
endpoint of the format
dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID.
Import entry links using JSON files
To bulk import entry links, complete the following tasks.
Prepare JSON files
Create a newline-delimited JSON file with the entry links that you want to upload to the Cloud Storage bucket.
Use the following JSON schema to structure your import files:
Sample format for links between terms
{"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-0606e3f2-8206-4f3a-aba9-32c6196f6048","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/synonym","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-2"}]}} {"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-2f7408e3-af3d-405d-81bb-861cf9ec5146","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/related","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-2"}]}}
Sample format for links between terms and data assets
{"entryLink":{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entryLinks/el-import-0606e3f2-8206-4f3a-aba9-32c6196f6048","entryLinkType":"projects/dataplex-types/locations/global/entryLinkTypes/definition","entryReferences":[{"name":"projects/PROJECT_NUMBER/locations/global/entryGroups/@dataplex/entries/projects/PROJECT_NUMBER/locations/global/glossaries/import-glossary/terms/term-1"},{"name":"projects/PROJECT_NUMBER/locations/us-central1/entryGroups/entry-group-1/entries/entry-1"}]}}
Import links between terms as synonym or related terms
To initiate a metadata import job that establishes synonym or related
connections between your catalog terms, use the following command:
gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF { "type":"IMPORT", "import_spec":{ "log_level":"DEBUG", "source_storage_uri":"gs://STORAGE_BUCKET/", "entry_sync_mode":"FULL", "aspect_sync_mode":"INCREMENTAL", "scope":{ "entry_groups":[ "projects/PROJECT_ID/locations/LOCATION_ID/entryGroups/@dataplex" ], "entry_link_types":[ "projects/dataplex-types/locations/global/entryLinkTypes/synonym", "projects/dataplex-types/locations/global/entryLinkTypes/related" ], "referenced_entry_scopes":[ "PROJECT_IDS" ] } } } EOF )"
Replace DATAPLEX_API with
dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID.
Import links between terms and specific data assets or columns
To create a link between glossary terms and data assets, run the import job for each entry group to which the entry for the data asset belongs. All definition entry links are created in this entry group.
gcurl https://DATAPLEX_API/metadataJobs?metadata_job_id=JOB_ID -d "$(cat<<EOF { "type":"IMPORT", "import_spec":{ "log_level":"DEBUG", "source_storage_uri":"gs://STORAGE_BUCKET/", "entry_sync_mode":"FULL", "aspect_sync_mode":"INCREMENTAL", "scope":{ "entry_groups":[ "projects/PROJECT_ID/locations/ENTRY_GROUP_LOCATION_ID/entryGroups/@dataplex" ], "entry_link_types":[ "projects/dataplex-types/locations/global/entryLinkTypes/definition" ], "referenced_entry_scopes":[ "PROJECT_IDS" ] } } } EOF )"
Replace DATAPLEX_API with
dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID.
Monitor the import job status
To get the status of the
importoperation, run the following command:gcurl https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/operations/operation-OPERATION_IDReplace
OPERATION_IDwith the ID of the operation.To get the status of the metadata job, run the following command:
gcurl -X GET https://dataplex.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/metadataJobs/JOB_ID
What's next
- Learn how to manage business glossaries in Knowledge Catalog.
- Learn how to import glossaries from a Google Sheet.
- Learn more about metadata management.