You can install additional components like Zeppelin when you create a Managed Service for Apache Spark cluster using the Optional components feature. This page describes the Zeppelin component.
The Zeppelin Notebook component is a web-based notebook for interactive data analytics. The Zeppelin web UI is available on port 8080 on the cluster's first master node.
By default, notebooks are saved in Cloud Storage in the Managed Service for Apache Spark staging bucket, which is either specified by the user or auto-created when the cluster is created. You can change the notebook location when you create the cluster by setting the zeppelin:zeppelin.notebook.gcs.dir property.
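As a minimal sketch of changing the notebook location, the property can be passed through the --properties flag of gcloud dataproc clusters create. The cluster name, region, and bucket path below are hypothetical placeholders, not values from this page. The command is printed with echo so that you can review it first; remove the leading echo to run it.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
REGION="us-central1"
NOTEBOOK_DIR="gs://example-bucket/zeppelin-notebooks"

# Cluster properties use the form prefix:key=value.
ZEPPELIN_PROPERTY="zeppelin:zeppelin.notebook.gcs.dir=${NOTEBOOK_DIR}"

# Dry run: print the command instead of executing it.
echo gcloud dataproc clusters create "${CLUSTER_NAME}" \
    --optional-components=ZEPPELIN \
    --region="${REGION}" \
    --enable-component-gateway \
    --properties="${ZEPPELIN_PROPERTY}"
```

The property can only be set at cluster creation time, so the bucket should already exist and be writable by the cluster's service account before the command is run.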
Install the component
Install the component when you create a Managed Service for Apache Spark cluster. Components can be added to clusters created with Managed Service for Apache Spark version 1.3 and later.
See Supported Dataproc versions for the component version included in each Managed Service for Apache Spark image release.
Google Cloud console
- In the Google Cloud console, open the Create cluster page.
- Click Additional configuration to expand that section.
- Edit Optional components.
- In the panel that opens, select the checkbox for Zeppelin Notebook, then click Save.
gcloud CLI
To create a Managed Service for Apache Spark cluster that includes the Zeppelin component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag.
gcloud dataproc clusters create cluster-name \
    --optional-components=ZEPPELIN \
    --region=region \
    --enable-component-gateway \
    ... other flags
REST API
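A clusters.create request that enables Zeppelin might be assembled as in the following sketch. The project ID, region, and cluster name are hypothetical placeholders, and the request body is deliberately minimal (a real request usually carries more configuration). The script only prints the endpoint and body; the commented curl line shows one possible way to send it, assuming the gcloud CLI is installed and authenticated.

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="example-project"
REGION="us-central1"

# Minimal request body: ZEPPELIN in softwareConfig.optionalComponents,
# plus Component Gateway enabled through endpointConfig.
REQUEST_BODY=$(cat <<'EOF'
{
  "clusterName": "example-cluster",
  "config": {
    "softwareConfig": {
      "optionalComponents": ["ZEPPELIN"]
    },
    "endpointConfig": {
      "enableHttpPortAccess": true
    }
  }
}
EOF
)

ENDPOINT="https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"

# Dry run: show what would be sent.
echo "POST ${ENDPOINT}"
echo "${REQUEST_BODY}"

# One possible way to send the request (not executed here):
# curl -X POST \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     -H "Content-Type: application/json" \
#     -d "${REQUEST_BODY}" "${ENDPOINT}"
```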
The Zeppelin component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
Open the Zeppelin notebook
To open the Zeppelin notebook UI running on the cluster's master node in your local browser, click the Component Gateway link in the Google Cloud console. See Viewing and Accessing Component Gateway URLs for details.
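If you prefer the command line, the Component Gateway URLs can also be read from the cluster description. The following is a sketch that assumes Component Gateway was enabled at creation time; the cluster name and region are hypothetical placeholders. The command is printed with echo; remove the leading echo to run it.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
REGION="us-central1"

# Restrict the output to the map of component names to gateway URLs.
FORMAT="yaml(config.endpointConfig.httpPorts)"

# Dry run: print the command instead of executing it.
echo gcloud dataproc clusters describe "${CLUSTER_NAME}" \
    --region="${REGION}" \
    --format="${FORMAT}"
```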