To run Docker containers on your Dataproc cluster nodes, enable the Docker optional component during cluster creation. This document explains how to install and configure the Docker component on Dataproc.
To learn more about other available optional components in Dataproc, see Available optional components.
How the Docker component works
When you enable the Dataproc Docker component, it installs a Docker daemon on each cluster node. It also sets up a Linux user and group, both named "docker", on each node to run the Docker daemon. Additionally, the component creates a "docker" systemd service to run the dockerd daemon. Use this systemd service to manage the lifecycle of the Docker service.
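As a rough sketch of managing that service from a shell on a cluster node, the helper below wraps standard systemctl verbs. The function name docker_service and the variable DOCKER_UNIT are illustrative and not part of Dataproc. The unit name "docker" comes from the description above, and sudo access on the node is assumed.

```shell
# Name of the systemd unit that the component creates (per the text above).
DOCKER_UNIT="docker"

# Hypothetical helper: run a systemctl verb against the Docker unit.
# $1 is a standard systemctl verb such as status, restart, stop, or start.
docker_service() {
  sudo systemctl "$1" "${DOCKER_UNIT}"
}
```

For example, docker_service status reports whether the daemon is running, and docker_service restart restarts it.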
Install the component
Install the component when you create a Dataproc cluster. The Docker component can be installed on clusters created with Dataproc image version 1.5 or later.
See Supported Dataproc versions for the component version included in each Dataproc image release.
gcloud command
To create a Dataproc cluster that includes the Docker component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag.
gcloud dataproc clusters create cluster-name \
    --optional-components=DOCKER \
    --region=region \
    --image-version=1.5 \
    ... other flags
REST API
The Docker component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
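As an illustration only, a clusters.create request that enables the component might be assembled as follows. The project, region, and cluster values are placeholders, the helper name create_docker_cluster is hypothetical, and the sketch assumes that curl and an authenticated gcloud CLI are available. Check the field names against the clusters.create reference before relying on them.

```shell
# Placeholder values; replace them with your own.
PROJECT_ID="my-project"
REGION="us-central1"
CLUSTER_NAME="my-cluster"

# Request body: DOCKER is listed under softwareConfig.optionalComponents.
REQUEST_BODY=$(cat <<EOF
{
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "imageVersion": "1.5",
      "optionalComponents": ["DOCKER"]
    }
  }
}
EOF
)

# Hypothetical helper: POST the body to the regional clusters endpoint.
create_docker_cluster() {
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "${REQUEST_BODY}" \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
}
```

Defining the helper sends nothing. Calling create_docker_cluster submits the request, and echo "${REQUEST_BODY}" prints the body for inspection first.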
Console
- Enable the component.
  - In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
  - In the Components section:
    - Under Optional components, select Docker and other optional components to install on your cluster.
Enable Docker on YARN
See Customize your Spark job runtime environment with Docker on YARN to use a customized Docker image with YARN.
Docker Logging
By default, the Dataproc Docker component writes logs to Cloud Logging by setting the gcplogs driver. See Viewing your logs.
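To confirm which logging driver a node's Docker daemon uses, one option is to query docker info on a cluster node. This is a sketch: the function name docker_logging_driver is hypothetical, sudo access is assumed, and the expected value gcplogs follows from the default described above.

```shell
# Hypothetical helper: print the Docker daemon's default logging driver.
docker_logging_driver() {
  sudo docker info --format '{{.LoggingDriver}}'
}
```

On a node that keeps the default configuration, docker_logging_driver should print gcplogs.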
Docker Registry
The Dataproc Docker component configures Docker to use Container Registry in addition to the default Docker registries. Docker will use the Docker credential helper to authenticate with Container Registry.
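For instance, pulling an image hosted in Container Registry from a cluster node could look like the sketch below. The image path is a made-up placeholder and pull_gcr_image is a hypothetical helper. The sketch assumes that the cluster's service account has read access to the registry and that the command is run with sudo.

```shell
# Placeholder image path in Container Registry (gcr.io/PROJECT_ID/IMAGE:TAG).
IMAGE="gcr.io/my-project/my-image:latest"

# Hypothetical helper: pull the image given as $1. Authentication is left
# to the Docker credential helper that the component configures.
pull_gcr_image() {
  sudo docker pull "$1"
}
```

Calling pull_gcr_image "${IMAGE}" on a node attempts the pull without a separate docker login step.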
Use the Docker component on a Kerberos cluster
You can install the Docker optional component on a cluster that is being created with Kerberos security enabled.