Managed Service for Apache Spark optional Hive WebHCat component

You can install additional components like Hive WebHCat when you create a Managed Service for Apache Spark cluster using the Optional components feature. This page describes the Hive WebHCat component.

The Hive WebHCat component provides a REST API for HCatalog. The REST service is available on port 50111 on the cluster's first master node.
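As a minimal sketch of how a client might address the service, the following shell helpers build a WebHCat URL from a host name and the port above. The `/templeton/v1` base path and the `status` resource come from the Apache Hive WebHCat documentation rather than from this page, and the function names `webhcat_url` and `webhcat_status` are hypothetical, chosen only for illustration.

```shell
# Sketch: build WebHCat REST URLs for a cluster's first master node.
# Assumptions: the host name passed in and the "status" resource are
# illustrative; substitute values that match your own cluster.

WEBHCAT_PORT=50111   # port stated on this page

# Print the URL for a WebHCat resource, for example "status".
# $1 = host name, $2 = resource path under /templeton/v1
webhcat_url() {
  printf 'http://%s:%s/templeton/v1/%s\n' "$1" "$WEBHCAT_PORT" "$2"
}

# Query the status resource on the given host. Run this from a machine
# that can reach the master node, for example an SSH session on the node
# itself with "localhost" as the host.
webhcat_status() {
  curl --silent --fail "$(webhcat_url "$1" status)"
}

# Example (not run automatically):
#   webhcat_status localhost
```

Keeping URL construction separate from the `curl` call makes it easy to check the address before sending a request, which helps when firewall rules or an SSH tunnel sit between the client and the master node.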

Install the component

Install the component when you create a Managed Service for Apache Spark cluster. Components can be added to clusters created with Managed Service for Apache Spark version 1.3 and later.

See Supported Managed Service for Apache Spark versions for the component version included in each Managed Service for Apache Spark image release.

Google Cloud console

  1. In the Google Cloud console, open the Create cluster page.
  2. Click Additional configuration to expand that section.
  3. Edit Optional components.
  4. In the panel that opens, select the checkbox for Hive WebHCat, then click Save.

gcloud CLI

To create a Managed Service for Apache Spark cluster that includes the Hive WebHCat component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag.

gcloud dataproc clusters create cluster-name \
    --optional-components=HIVE_WEBHCAT \
    --region=region \
    ... other args

REST API

To install the Hive WebHCat component through the Dataproc API, specify it using SoftwareConfig.Component as part of a clusters.create request.
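The request body might look like the output of the sketch below, which prints a minimal JSON payload with HIVE_WEBHCAT listed under `softwareConfig.optionalComponents`. The helper name `create_body` and the cluster name `example-cluster` are hypothetical, `PROJECT_ID` and `REGION` are placeholders, and a real request usually carries more cluster configuration than shown here.

```shell
# Sketch: print the JSON body of a clusters.create request that enables
# the Hive WebHCat component.
# Assumption: only the fields needed to select the component are shown.
# $1 = cluster name
create_body() {
  cat <<EOF
{
  "clusterName": "$1",
  "config": {
    "softwareConfig": {
      "optionalComponents": ["HIVE_WEBHCAT"]
    }
  }
}
EOF
}

# Sending the request (not run automatically). PROJECT_ID and REGION are
# placeholders to replace with your own values:
#   create_body example-cluster | curl --request POST \
#     --header "Authorization: Bearer $(gcloud auth print-access-token)" \
#     --header "Content-Type: application/json" \
#     --data @- \
#     "https://dataproc.googleapis.com/v1/projects/PROJECT_ID/regions/REGION/clusters"
```

Printing the body from a function lets you inspect or validate the JSON before it is posted.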