"Managed Service for Apache Spark" is the new name for the product formerly known as "Dataproc on Compute Engine" (cluster deployment) and "Google Cloud Serverless for Apache Spark" (serverless deployment).

Managed Service for Apache Spark serverless overview

Managed Service for Apache Spark serverless lets you run Spark workloads without requiring you to provision and manage your own cluster. There are two ways to run Managed Service for Apache Spark workloads: batch workloads and interactive sessions.

Batch workloads

Submit a batch workload using the Google Cloud console, Google Cloud CLI, or REST API. Managed Service for Apache Spark runs the workload on managed compute infrastructure and autoscales resources as needed. You are charged only for the time the workload is executing.
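As a rough sketch of the REST API path, the following snippet builds (but does not send) a create-batch request using only the Python standard library. It assumes that the Dataproc v1 `projects.locations.batches.create` endpoint still serves this product under its new name. The project, region, batch ID, bucket path, and the `build_create_batch_request` helper are hypothetical, and a real call also needs a valid OAuth 2.0 access token.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Dataproc v1 REST endpoint still serves batch workloads
# after the product rename.
API_ROOT = "https://dataproc.googleapis.com/v1"


def build_create_batch_request(project, region, batch_id, body, access_token):
    """Build (but do not send) a POST request that creates a batch workload."""
    parent = f"projects/{project}/locations/{region}"
    query = urllib.parse.urlencode({"batchId": batch_id})
    return urllib.request.Request(
        url=f"{API_ROOT}/{parent}/batches?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_create_batch_request(
    project="example-project",
    region="us-central1",
    batch_id="wordcount-001",
    body={"pysparkBatch": {"mainPythonFileUri": "gs://example-bucket/wordcount.py"}},
    access_token="ACCESS_TOKEN",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` after replacing the placeholder token with a real one. The Google Cloud CLI and the console perform the equivalent call for you.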

Batch workload capabilities

You can run the following batch workload types:

- PySpark
- Spark SQL
- Spark R
- Spark (Java or Scala)

You can specify Spark properties when you submit a batch workload.
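The workload types and Spark properties described above can be sketched as a request body. The example assumes the field names of the `Batch` resource in the Dataproc v1 API (`pysparkBatch`, `sparkSqlBatch`, `sparkRBatch`, `sparkBatch`, and `runtimeConfig.properties`). The `build_batch_body` helper, the bucket path, and the property values are hypothetical.

```python
# Assumption: field names follow the Batch resource of the Dataproc v1 API.
# Each entry maps a workload type to (batch field, main file field).
WORKLOAD_FIELDS = {
    "pyspark": ("pysparkBatch", "mainPythonFileUri"),
    "spark_sql": ("sparkSqlBatch", "queryFileUri"),
    "spark_r": ("sparkRBatch", "mainRFileUri"),
    "spark": ("sparkBatch", "mainJarFileUri"),  # Java or Scala
}


def build_batch_body(workload_type, main_uri, spark_properties=None):
    """Return a batch request body for one of the four workload types."""
    if workload_type not in WORKLOAD_FIELDS:
        raise ValueError(f"Unsupported workload type: {workload_type!r}")
    batch_field, uri_field = WORKLOAD_FIELDS[workload_type]
    body = {batch_field: {uri_field: main_uri}}
    if spark_properties:
        # Spark properties are passed as string key-value pairs.
        body["runtimeConfig"] = {"properties": dict(spark_properties)}
    return body


# Hypothetical script path and property values, for illustration only.
body = build_batch_body(
    "pyspark",
    "gs://example-bucket/wordcount.py",
    spark_properties={
        "spark.executor.cores": "4",
        "spark.executor.instances": "2",
    },
)
print(body)
```

Keeping the type-to-field mapping in one table makes it easy to see that only the batch field and its main-file field differ between workload types.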

Schedule batch workloads

You can schedule a Spark batch workload as part of an Airflow or Managed Service for Apache Airflow workflow using an Airflow batch operator. For more information, see Run Managed Service for Apache Spark serverless workloads with Managed Airflow.
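A scheduled batch workload might look like the following sketch. It assumes Airflow 2.4 or later with the Google provider package installed, and it assumes that `DataprocCreateBatchOperator` is the batch operator referred to above and that its `batch_id` field is templated. The DAG ID, project, region, bucket path, and both helper functions are hypothetical.

```python
from datetime import datetime


def batch_operator_kwargs(project_id, region, batch_id, main_python_file_uri):
    """Return keyword arguments for a task that creates a PySpark batch."""
    return {
        "task_id": "run_spark_batch",
        "project_id": project_id,
        "region": region,
        "batch_id": batch_id,
        "batch": {"pyspark_batch": {"main_python_file_uri": main_python_file_uri}},
    }


def build_dag():
    """Build a daily DAG that submits one batch workload per run."""
    # Imported here so the helper above can be used without Airflow installed.
    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocCreateBatchOperator,
    )

    with DAG(
        dag_id="spark_batch_daily",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        DataprocCreateBatchOperator(
            **batch_operator_kwargs(
                project_id="example-project",
                region="us-central1",
                # Assumption: batch_id is templated, so each run gets a unique ID.
                batch_id="wordcount-{{ ds_nodash }}",
                main_python_file_uri="gs://example-bucket/wordcount.py",
            )
        )
    return dag
```

In a DAG file, assign the result at module level (for example, `dag = build_dag()`) so that the Airflow scheduler can discover it.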

Get started

To get started, see Run an Apache Spark batch workload.

Interactive sessions

Write and run code in Jupyter notebooks during an interactive session. You can create a notebook session in the following ways:

- Run PySpark code in BigQuery Studio notebooks. Open a BigQuery Python notebook to create a Spark-Connect-based interactive session. Each BigQuery notebook can have only one active session associated with it.
- Use the JupyterLab plugin to create multiple Jupyter notebook sessions from templates that you create and manage. When you install the plugin on a local machine or Compute Engine VM, different cards that correspond to different Spark kernel configurations appear on the JupyterLab launcher page. Click a card to create a Managed Service for Apache Spark notebook session, then start writing and testing your code in the notebook.

The JupyterLab plugin also lets you use the JupyterLab launcher page to take the following actions:
- Create Managed Service for Apache Spark clusters.
- Submit jobs to clusters.
- View Google Cloud and Spark logs.

Security compliance

Serverless deployments of Managed Service for Apache Spark adhere to all data residency, CMEK, VPC-SC, and other security requirements that cluster deployments comply with.
