You specify a Compute Engine region, such as "us-east1" or "europe-west1", when you create a Managed Service for Apache Spark cluster. Managed Service for Apache Spark isolates cluster resources, such as VM instances and cluster metadata storage in Cloud Storage, within a zone in the specified region.
You can optionally specify a zone within the cluster region, such as "us-east1-a" or "europe-west1-b", when you create a cluster. If you don't specify a zone, Managed Service for Apache Spark Auto Zone Placement chooses a zone within the specified region for your cluster's resources.
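For example, the following gcloud CLI sketch (cluster name, region, and zone are placeholder values) pins a cluster to a specific zone; omit the --zone flag to let Auto Zone Placement choose one:

gcloud dataproc clusters create example-cluster \
    --region=us-east1 \
    --zone=us-east1-b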
The regional namespace corresponds to the /regions/REGION segment of Managed Service for Apache Spark resource URIs (see, for example, the cluster networkUri).
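Concretely, a cluster created in us-central1 has a resource name of the form projects/PROJECT_ID/regions/us-central1/clusters/CLUSTER_NAME.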
Region names
Region names follow a standard naming convention based on Compute Engine regions. For example, the name of the Central US region is us-central1, and the name of the Western Europe region is europe-west1. Run the gcloud compute regions list command to see a listing of available regions.
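For example, to confirm that a single region is available before creating a cluster (the region name here is illustrative):

gcloud compute regions list --filter="name=us-east1"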
Locational and regional endpoints
Google Cloud APIs can provide support for locational and regional endpoints:
Locational endpoints ensure that in-transit data remains in the specified location when accessed through private connectivity.
Format: {location}-{service}.googleapis.com
Example: us-central1-dataproc.googleapis.com
Regional endpoints ensure that in-transit data remains in the specified location when accessed through either private connectivity or the public internet.
Format: {service}.{location}.rep.googleapis.com
Example: dataproc.us-central1.rep.googleapis.com
The default Managed Service for Apache Spark endpoint is the locational endpoint. See the Managed Service for Apache Spark release notes for announcements on Managed Service for Apache Spark support of regional endpoints.
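If you need gcloud CLI traffic to use a specific locational endpoint, one approach is an API endpoint override. This is a sketch, and it assumes your gcloud version supports the api_endpoint_overrides/dataproc property:

gcloud config set api_endpoint_overrides/dataproc \
    https://us-central1-dataproc.googleapis.com/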
Create a cluster
gcloud CLI
When you create a cluster, specify a region using the required
--region flag.
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    other args ...
REST API
Use the REGION URL parameter in a clusters.create request to specify the cluster region.
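For example, a clusters.create request for a cluster in the us-central1 region is a POST to a URL of the form:

POST https://dataproc.googleapis.com/v1/projects/PROJECT_ID/regions/us-central1/clusters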
gRPC
Set the client transport address to the locational endpoint using the following pattern:
REGION-dataproc.googleapis.com
Python (google-cloud-python) example:

from google.cloud import dataproc_v1
from google.cloud.dataproc_v1.gapic.transports import cluster_controller_grpc_transport

# Point the gRPC transport at the us-central1 locational endpoint.
transport = cluster_controller_grpc_transport.ClusterControllerGrpcTransport(
    address='us-central1-dataproc.googleapis.com:443')
client = dataproc_v1.ClusterControllerClient(transport)

project_id = 'my-project'
region = 'us-central1'
cluster = {...}

Java (google-cloud-java) example:
ClusterControllerSettings settings =
    ClusterControllerSettings.newBuilder()
        .setEndpoint("us-central1-dataproc.googleapis.com:443")
        .build();
try (ClusterControllerClient clusterControllerClient = ClusterControllerClient.create(settings)) {
  String projectId = "my-project";
  String region = "us-central1";
  Cluster cluster = Cluster.newBuilder().build();
  Cluster response =
      clusterControllerClient.createClusterAsync(projectId, region, cluster).get();
}

Console
Specify a region in the Location section of the Set up cluster panel on the Managed Service for Apache Spark Create a cluster page in the Google Cloud console.
What's next
- Geography and Regions
- Compute Engine→Regions and Zones
- Compute Engine→Global, Regional, and Zonal Resources
- Managed Service for Apache Spark Auto Zone Placement