This page explains how to create a Cloud Composer environment.
- For more information about environments, see Environment architecture.
- For more information about creating an environment with Terraform, see Create environments (Terraform).
Before you begin
Enable the Cloud Composer API. For the full list of services used by Cloud Composer, see Services required by Cloud Composer.
The approximate time to create an environment is 25 minutes.
If you create an environment with Terraform, the service account used by Terraform must have a role with the composer.environments.create permission enabled. For more information about the service account for Terraform, see Google Provider Configuration Reference.
For more information about using Terraform to create a Cloud Composer environment, see Terraform documentation.
For more information about additional parameters, see Terraform Argument Reference.
Private IP: There are specific network and peering requirements to create a Private IP environment. For more information, see Configuring private IP.
Shared VPC: There are specific network requirements to use Shared VPC with Cloud Composer. For information, see Configuring shared VPC.
VPC SC: To deploy Cloud Composer environments inside a security perimeter, see Configuring VPC SC. When used with Cloud Composer, VPC Service Controls have several known limitations.
You might notice Compute Engine metadata entries for the GKE clusters that belong to environments in your project. During GKE cluster creation and upgrades, Google Kubernetes Engine automatically adds project-level metadata entries (google_compute_project_metadata) to track secondary IP address range usage. Don't modify or remove these entries. Google Kubernetes Engine manages them automatically.
Step 1. Create or choose an environment's service account
When you create an environment, you specify a service account. This service account is called the environment's service account. Your environment uses this service account to perform most of its operations.
The service account for your environment is not a user account. A service account is a special kind of account used by an application or a virtual machine (VM) instance, not a person.
You can't change the service account of your environment later.
If you don't have a service account for Cloud Composer environments in your project yet, create it.
See Create environments (Terraform) for an extended example of creating a service account for your environment in Terraform.
To create a new service account for your environment:
Create a new service account as described in the Identity and Access Management documentation.
Grant a role to it, as described in the Identity and Access Management documentation. The required role is Composer Worker (composer.worker).
If your environment uses resource location restrictions, or installs PyPI packages from an Artifact Registry repository or from a private repository, then grant the user-managed service account that runs your environment the Service Account User (iam.serviceAccountUser) role on itself (both the principal and the resource are the same service account).
To access other resources in your Google Cloud project, grant extra permissions to access those resources to this service account. The Composer Worker (composer.worker) role provides this required set of permissions in most cases. Add extra permissions to this service account only when it's necessary for the operation of your DAGs.
Step 2. Basic setup
This step creates a Cloud Composer environment with default parameters in the specified location.
Console
In the Google Cloud console, go to the Create environment page.
In the Name field, enter a name for your environment.
The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.
In the Location drop-down list, choose a location for your environment.
A location is the region where the environment is located.
In the Image version drop-down list, select a Cloud Composer image with the required version of Airflow.
In the Service account drop-down list, select a service account for your environment.
If you don't have a service account for your environment yet, see Create or choose an environment's service account.
gcloud
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version IMAGE_VERSION \
--service-account "SERVICE_ACCOUNT"
Replace:
- ENVIRONMENT_NAME with the name of the environment.
  The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.
- LOCATION with the region for the environment.
  A location is the region where the environment is located.
- IMAGE_VERSION with the name of a Cloud Composer image.
- SERVICE_ACCOUNT with the service account for your environment.
Example:
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "example-account@example-project.iam.gserviceaccount.com"
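The naming rule above can be checked before you run the command. The following is a minimal sketch in Python; the helper name `is_valid_environment_name` is hypothetical, and the pattern encodes only the rule stated on this page, not the full set of Cloud Storage bucket naming restrictions.

```python
import re

# One lowercase letter, then up to 62 more characters (lowercase letters,
# digits, or hyphens), where the last character is not a hyphen.
# This is an assumption-level encoding of the rule stated in the text.
_ENV_NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_environment_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rule."""
    return _ENV_NAME_RE.fullmatch(name) is not None


# Print the verdict for a few candidate names.
for candidate in ("example-environment", "Example", "env-"):
    print(candidate, is_valid_environment_name(candidate))
```

A name that passes this check can still be rejected if it violates another bucket naming restriction, so treat the check as a first filter only.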
API
Construct an environments.create API request. Specify the
configuration in the Environment resource.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"softwareConfig": {
"imageVersion": "IMAGE_VERSION"
},
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace:
- PROJECT_ID with the project ID.
- LOCATION with the region for the environment.
  A location is the region where the environment is located.
- ENVIRONMENT_NAME with the environment name.
  The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.
- IMAGE_VERSION with the name of a Cloud Composer image.
- SERVICE_ACCOUNT with the service account for your environment.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"softwareConfig": {
"imageVersion": "composer-2.14.4-airflow-2.10.5"
},
"nodeConfig": {
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
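If you assemble the request body in code, a small helper keeps the resource name and the nested fields consistent. The following Python sketch builds the same JSON structure as the example above; `build_create_request` is a hypothetical helper, and sending the request (authentication, HTTP client) is out of scope here.

```python
import json


def build_create_request(project_id, location, environment_name,
                         image_version, service_account):
    """Return the Environment resource body shown in the example above."""
    return {
        "name": (
            f"projects/{project_id}/locations/{location}"
            f"/environments/{environment_name}"
        ),
        "config": {
            "softwareConfig": {"imageVersion": image_version},
            "nodeConfig": {"serviceAccount": service_account},
        },
    }


body = build_create_request(
    "example-project",
    "us-central1",
    "example-environment",
    "composer-2.14.4-airflow-2.10.5",
    "example-account@example-project.iam.gserviceaccount.com",
)
# Serialize the body; the result is the payload of the POST request.
print(json.dumps(body, indent=2))
```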
Terraform
To create an environment with default parameters in a specified location, add the following resource block to your Terraform configuration and run terraform apply.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
software_config {
image_version = "IMAGE_VERSION"
}
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
  The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.
- LOCATION with the region for the environment.
  A location is the region where the environment is located.
- IMAGE_VERSION with the name of a Cloud Composer image.
- SERVICE_ACCOUNT with the service account for your environment.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
software_config {
image_version = "composer-2.14.4-airflow-2.10.5"
}
node_config {
service_account = "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Step 3. Grant required permissions to Cloud Composer service account
When you enable the Cloud Composer API in your project, the Composer Service Agent account is created in your project. Cloud Composer uses this account to perform operations in your Google Cloud project.
The Cloud Composer v2 API Service Agent Extension role provides additional permissions to the Cloud Composer Service Agent account. This role is not granted automatically. You must grant it manually.
Console
When you create an environment in your project, if the Cloud Composer Service Agent does not have the required permissions on the environment's service account, then a Grant required permissions to Cloud Composer service account section appears.
You add the Cloud Composer Service Agent account as a new principal on your environment's service account and grant the Cloud Composer v2 API Service Agent Extension role to it.
Confirm that you use the intended service account for your environment, and click Grant.
gcloud
You add the Cloud Composer Service Agent account as a new principal on your environment's service account and grant the Cloud Composer v2 API Service Agent Extension role to it.
gcloud iam service-accounts add-iam-policy-binding \
SERVICE_ACCOUNT \
--member serviceAccount:service-PROJECT_NUMBER@cloudcomposer-accounts.iam.gserviceaccount.com \
--role roles/composer.ServiceAgentV2Ext
Replace:
- SERVICE_ACCOUNT with the service account for your environment.
- PROJECT_NUMBER with the project number.
Example:
gcloud iam service-accounts add-iam-policy-binding \
example-account@example-project.iam.gserviceaccount.com \
--member serviceAccount:service-00000000000@cloudcomposer-accounts.iam.gserviceaccount.com \
--role roles/composer.ServiceAgentV2Ext
API
To grant the role, you must modify the existing allow policy using the read-modify-write pattern:
- Read the existing allow policy for the service account of your environment.
- Modify it to include the roles/composer.ServiceAgentV2Ext role for the Cloud Composer Service Agent.
- Rewrite the existing allow policy.
For more information, see Controlling access programmatically.
{
"role": "roles/composer.ServiceAgentV2Ext",
"members": [
"serviceAccount:service-PROJECT_NUMBER@cloudcomposer-accounts.iam.gserviceaccount.com"
]
}
Replace:
- PROJECT_NUMBER with the project number.
Example:
{
"role": "roles/composer.ServiceAgentV2Ext",
"members": [
"serviceAccount:service-00000000000@cloudcomposer-accounts.iam.gserviceaccount.com"
]
}
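The "modify" step of the read-modify-write pattern can be sketched as a pure function over the policy document. The following Python sketch assumes the policy is a dictionary with the standard `bindings` list and `etag` field of an IAM allow policy; `add_member_to_role` and `composer_service_agent` are hypothetical helpers, and the read and write calls themselves are not shown.

```python
import copy


def composer_service_agent(project_number: str) -> str:
    """Build the Cloud Composer Service Agent member string used above."""
    return (
        f"serviceAccount:service-{project_number}"
        "@cloudcomposer-accounts.iam.gserviceaccount.com"
    )


def add_member_to_role(policy: dict, role: str, member: str) -> dict:
    """Return a copy of `policy` in which `member` holds `role`.

    The input policy is not mutated, and fields such as `etag` are carried
    over unchanged so that the write step can detect concurrent changes.
    """
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the same role.
        if binding.get("role") == role and "condition" not in binding:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No matching binding exists yet, so add a new one.
    bindings.append({"role": role, "members": [member]})
    return updated
```

Keeping the modification separate from the read and write calls makes it easy to verify that existing bindings survive the update.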
Terraform
Add a new role binding to your environment's service account allow policy.
You add the Cloud Composer Service Agent account as a new principal on your environment's service account and grant the Cloud Composer v2 API Service Agent Extension role to it.
If you are not using Terraform to define your environment's service account allow policy, do not use the following example. Instead, add this binding using other methods.
resource "google_service_account_iam_member" "custom_service_account" {
provider = google-beta
service_account_id = "SERVICE_ACCOUNT"
role = "roles/composer.ServiceAgentV2Ext"
member = "serviceAccount:service-PROJECT_NUMBER@cloudcomposer-accounts.iam.gserviceaccount.com"
}
Replace:
- SERVICE_ACCOUNT with the service account for your environment.
- PROJECT_NUMBER with the project number.
Example:
resource "google_service_account_iam_member" "custom_service_account" {
provider = google-beta
service_account_id = "example-account@example-project.iam.gserviceaccount.com"
role = "roles/composer.ServiceAgentV2Ext"
member = "serviceAccount:service-00000000000@cloudcomposer-accounts.iam.gserviceaccount.com"
}
Step 4. (Optional) Configure environment scale and performance parameters
To specify the scale and performance configuration for your environment, select the environment size and workloads configuration.
You can change all performance and scale parameters after you create an environment.
The following parameters control the scale and performance:
Environment size. Controls the performance parameters of the managed Cloud Composer infrastructure that includes the Airflow database. Consider selecting a larger environment size if you want to run a large number of DAGs and tasks with higher infrastructure performance. For example, a larger environment size increases the amount of Airflow task log entries that your environment can process with minimal delay.
Workloads configuration. Controls the scale and performance of Airflow components that run in a GKE cluster of your environment.
Airflow scheduler. Parses DAG definition files, schedules DAG runs based on the schedule interval, and queues tasks for execution by Airflow workers.
Your environment can run more than one Airflow scheduler at the same time. Use multiple schedulers to distribute load between several scheduler instances for better performance and reliability.
Increasing the number of schedulers does not always improve Airflow performance. For example, having only one scheduler might provide better performance than having two. This might happen when the extra scheduler is not utilized, and thus consumes resources of your environment without contributing to overall performance. The actual scheduler performance depends on the number of Airflow workers, the number of DAGs and tasks that run in your environment, and the configuration of both Airflow and the environment.
We recommend starting with two schedulers and then monitoring the performance of your environment. If you change the number of schedulers, you can always scale your environment back to the original number of schedulers.
For more information about configuring multiple schedulers, see Airflow documentation.
Airflow triggerer. Asynchronously monitors all deferred tasks in your environment. If you have at least one triggerer instance in your environment (or at least two in highly resilient environments), you can use deferrable operators in your DAGs.
In Cloud Composer 2, the Airflow triggerer is disabled by default. If you want to create an environment with a triggerer, set the number of triggerers to one or more.
Airflow web server. Runs the Airflow web interface where you can monitor, manage, and visualize your DAGs.
Airflow workers. Execute tasks that are scheduled by Airflow schedulers. The number of workers in your environment changes dynamically between the configured minimum and maximum, depending on the number of tasks in the queue.
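The effect of the minimum and maximum worker limits can be illustrated with a short Python sketch. It only shows how the two limits bound the worker count; it is not the autoscaling algorithm itself, which this page does not describe, and `effective_worker_count` is a hypothetical name.

```python
def effective_worker_count(required: int, min_workers: int, max_workers: int) -> int:
    """Clamp the number of workers that the load calls for to the limits.

    `required` stands for the worker count that the current task queue
    would need; how that number is derived is outside this sketch.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return max(min_workers, min(required, max_workers))


# With limits of 2 and 4, an idle environment keeps 2 workers and a
# heavily loaded one stops at 4.
print(effective_worker_count(0, 2, 4))
print(effective_worker_count(10, 2, 4))
```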
Console
You can select a preset for your environment. When you select a preset, the scale and performance parameters for that preset are automatically selected. You also have an option to select a custom preset and specify all scale and performance parameters for your environment.
To select the scale and performance configuration for your environment, on the Create environment page:
To use predefined values, in the Environment resources section, click Small, Medium, or Large.
To specify custom values for the scale and performance parameters:
In the Environment resources section, click Custom.
In the Scheduler section, set the number of schedulers you want to use, and the resource allocation for their CPU, memory, and storage.
In the Triggerer section, use the Number of triggerers field to enter the number of triggerers in your environment.
If you don't want to use deferrable operators in your DAGs, set the number of triggerers to zero. By default, the triggerer is disabled in Cloud Composer 2.
If you set at least one triggerer for your environment, use the CPU and Memory fields to configure resource allocation for your triggerers.
In the DAG processor section, specify the number of DAG processors in your environment and the amount of CPUs, memory, and storage for each DAG processor.
Highly resilient environments require at least two DAG processors.
In the Web server section, specify the amount of CPUs, memory, and storage for the web server.
In the Worker section, specify:
- The minimum and maximum number of workers for autoscaling limits in your environment.
- The CPU, memory, and storage allocation for your workers.
In the Core infrastructure section, in the Environment size drop-down list, select the environment size.
gcloud
When you create an environment, the following arguments control the scale and performance parameters of your environment.
- --environment-size specifies the environment size.
- --scheduler-count specifies the number of schedulers.
- --scheduler-cpu specifies the number of CPUs for an Airflow scheduler.
- --scheduler-memory specifies the amount of memory for an Airflow scheduler.
- --scheduler-storage specifies the amount of disk space for an Airflow scheduler.
- --triggerer-count specifies the number of Airflow triggerers in your environment. The default value for this flag is 0. You need triggerers if you want to use deferrable operators in your DAGs.
  - For standard resilience environments, use a value between 0 and 10.
  - For highly resilient environments, use 0 or a value between 2 and 10.
- --triggerer-cpu specifies the number of CPUs for an Airflow triggerer, in vCPU units. Allowed values: 0.5, 0.75, 1. The default value is 0.5.
- --triggerer-memory specifies the amount of memory for an Airflow triggerer, in GB. The default value is 0.5.
  The minimum required memory is equal to the number of CPUs allocated for the triggerers. The maximum allowed value is equal to the number of triggerer CPUs multiplied by 6.5.
  For example, if you set the --triggerer-cpu flag to 1, the minimum value for --triggerer-memory is 1 and the maximum value is 6.5.
- --web-server-cpu specifies the number of CPUs for the Airflow web server.
- --web-server-memory specifies the amount of memory for the Airflow web server.
- --web-server-storage specifies the amount of disk space for the Airflow web server.
- --worker-cpu specifies the number of CPUs for an Airflow worker.
- --worker-memory specifies the amount of memory for an Airflow worker.
- --worker-storage specifies the amount of disk space for an Airflow worker.
- --min-workers specifies the minimum number of Airflow workers. Your environment's cluster runs at least this number of workers.
- --max-workers specifies the maximum number of Airflow workers. Your environment's cluster runs at most this number of workers.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--environment-size ENVIRONMENT_SIZE \
--scheduler-count SCHEDULER_COUNT \
--scheduler-cpu SCHEDULER_CPU \
--scheduler-memory SCHEDULER_MEMORY \
--scheduler-storage SCHEDULER_STORAGE \
--triggerer-count TRIGGERER_COUNT \
--triggerer-cpu TRIGGERER_CPU \
--triggerer-memory TRIGGERER_MEMORY \
--web-server-cpu WEB_SERVER_CPU \
--web-server-memory WEB_SERVER_MEMORY \
--web-server-storage WEB_SERVER_STORAGE \
--worker-cpu WORKER_CPU \
--worker-memory WORKER_MEMORY \
--worker-storage WORKER_STORAGE \
--min-workers WORKERS_MIN \
--max-workers WORKERS_MAX
Replace:
- ENVIRONMENT_SIZE with small, medium, or large.
- SCHEDULER_COUNT with the number of schedulers.
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler.
- SCHEDULER_STORAGE with the disk size for a scheduler.
- TRIGGERER_COUNT with the number of triggerers.
- TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY with the amount of memory for a triggerer, in GB.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server.
- WEB_SERVER_STORAGE with the disk size for the web server.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker.
- WORKER_STORAGE with the disk size for a worker.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
Example:
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "example-account@example-project.iam.gserviceaccount.com" \
--environment-size small \
--scheduler-count 1 \
--scheduler-cpu 0.5 \
--scheduler-memory 2.5GB \
--scheduler-storage 2GB \
--triggerer-count 1 \
--triggerer-cpu 0.5 \
--triggerer-memory 0.5GB \
--web-server-cpu 1 \
--web-server-memory 2.5GB \
--web-server-storage 2GB \
--worker-cpu 1 \
--worker-memory 2GB \
--worker-storage 2GB \
--min-workers 2 \
--max-workers 4
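The triggerer memory limits described above follow from the CPU value: the minimum equals the CPU count and the maximum is the CPU count multiplied by 6.5. The following Python sketch computes and checks these bounds; the helper names are hypothetical and the allowed CPU values are the ones listed on this page.

```python
# Allowed triggerer CPU values, in vCPU units, as listed in the text.
ALLOWED_TRIGGERER_CPU = (0.5, 0.75, 1)


def triggerer_memory_bounds(cpu: float) -> tuple:
    """Return the (minimum, maximum) triggerer memory in GB for `cpu`."""
    if cpu not in ALLOWED_TRIGGERER_CPU:
        raise ValueError(f"unsupported triggerer CPU value: {cpu}")
    return (cpu, cpu * 6.5)


def is_valid_triggerer_memory(cpu: float, memory_gb: float) -> bool:
    """Check that `memory_gb` lies within the bounds for `cpu`."""
    low, high = triggerer_memory_bounds(cpu)
    return low <= memory_gb <= high


# Bounds for one vCPU, matching the example in the text.
print(triggerer_memory_bounds(1))
```

The example command above uses 0.5 vCPU with 0.5 GB of memory, which sits exactly on the lower bound.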
API
When you create an environment, in the Environment > EnvironmentConfig > WorkloadsConfig resource, specify environment scale and performance parameters.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"workloadsConfig": {
"scheduler": {
"cpu": SCHEDULER_CPU,
"memoryGb": SCHEDULER_MEMORY,
"storageGb": SCHEDULER_STORAGE,
"count": SCHEDULER_COUNT
},
"triggerer": {
"count": TRIGGERER_COUNT,
"cpu": TRIGGERER_CPU,
"memoryGb": TRIGGERER_MEMORY
},
"webServer": {
"cpu": WEB_SERVER_CPU,
"memoryGb": WEB_SERVER_MEMORY,
"storageGb": WEB_SERVER_STORAGE
},
"worker": {
"cpu": WORKER_CPU,
"memoryGb": WORKER_MEMORY,
"storageGb": WORKER_STORAGE,
"minCount": WORKERS_MIN,
"maxCount": WORKERS_MAX
}
},
"environmentSize": "ENVIRONMENT_SIZE",
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace:
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
- SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
- SCHEDULER_COUNT with the number of schedulers.
- TRIGGERER_COUNT with the number of triggerers. The default value is 0. You need triggerers if you want to use deferrable operators in your DAGs.
  - For standard resilience environments, use a value between 0 and 10.
  - For highly resilient environments, use 0 or a value between 2 and 10.
  If you use at least one triggerer, you must also specify the TRIGGERER_CPU and TRIGGERER_MEMORY parameters:
- TRIGGERER_CPU specifies the number of CPUs for a triggerer, in vCPU units. Allowed values: 0.5, 0.75, 1.
- TRIGGERER_MEMORY configures the amount of memory for a triggerer. The minimum required memory is equal to the number of CPUs allocated for the triggerers. The maximum allowed value is equal to the number of triggerer CPUs multiplied by 6.5.
  For example, if you set TRIGGERER_CPU to 1, the minimum value for TRIGGERER_MEMORY is 1 and the maximum value is 6.5.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE with the disk size for the web server, in GB.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker, in GB.
- WORKER_STORAGE with the disk size for a worker, in GB.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
- ENVIRONMENT_SIZE with the environment size: ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"workloadsConfig": {
"scheduler": {
"cpu": 2.5,
"memoryGb": 2.5,
"storageGb": 2,
"count": 1
},
"triggerer": {
"cpu": 0.5,
"memoryGb": 0.5,
"count": 1
},
"webServer": {
"cpu": 1,
"memoryGb": 2.5,
"storageGb": 2
},
"worker": {
"cpu": 1,
"memoryGb": 2,
"storageGb": 2,
"minCount": 2,
"maxCount": 4
}
},
"environmentSize": "ENVIRONMENT_SIZE_SMALL",
"nodeConfig": {
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
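The gcloud flag takes small, medium, or large, while the API expects the ENVIRONMENT_SIZE_* values. The following Python sketch assembles the config object from the example above and translates the size name; `build_scale_config` is a hypothetical helper, and pairing the gcloud names with the API values by name is an assumption based on the two lists on this page.

```python
# Assumed pairing of gcloud size names with the API enum values.
_SIZE_TO_ENUM = {
    "small": "ENVIRONMENT_SIZE_SMALL",
    "medium": "ENVIRONMENT_SIZE_MEDIUM",
    "large": "ENVIRONMENT_SIZE_LARGE",
}


def build_scale_config(size, scheduler, triggerer, web_server, worker,
                       service_account):
    """Assemble the `config` object with scale and performance parameters.

    Each component argument is a dict that already uses the API field
    names (cpu, memoryGb, storageGb, count, minCount, maxCount).
    """
    if worker["minCount"] > worker["maxCount"]:
        raise ValueError("worker minCount must not exceed maxCount")
    return {
        "workloadsConfig": {
            "scheduler": scheduler,
            "triggerer": triggerer,
            "webServer": web_server,
            "worker": worker,
        },
        "environmentSize": _SIZE_TO_ENUM[size],
        "nodeConfig": {"serviceAccount": service_account},
    }


# Values copied from the API example above.
config = build_scale_config(
    "small",
    scheduler={"cpu": 2.5, "memoryGb": 2.5, "storageGb": 2, "count": 1},
    triggerer={"cpu": 0.5, "memoryGb": 0.5, "count": 1},
    web_server={"cpu": 1, "memoryGb": 2.5, "storageGb": 2},
    worker={"cpu": 1, "memoryGb": 2, "storageGb": 2, "minCount": 2, "maxCount": 4},
    service_account="example-account@example-project.iam.gserviceaccount.com",
)
```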
Terraform
When you create an environment, the following arguments control the scale and performance parameters of your environment.
In the config block:
- The environment_size field controls the environment size.
In the workloads_config block:
- The scheduler.cpu field specifies the number of CPUs for an Airflow scheduler.
- The scheduler.memory_gb field specifies the amount of memory for an Airflow scheduler.
- The scheduler.storage_gb field specifies the amount of disk space for a scheduler.
- The scheduler.count field specifies the number of schedulers in your environment.
- The triggerer.cpu field specifies the number of CPUs for an Airflow triggerer.
- The triggerer.memory_gb field specifies the amount of memory for an Airflow triggerer.
- The triggerer.count field specifies the number of triggerers in your environment.
- The web_server.cpu field specifies the number of CPUs for the Airflow web server.
- The web_server.memory_gb field specifies the amount of memory for the Airflow web server.
- The web_server.storage_gb field specifies the amount of disk space for the Airflow web server.
- The worker.cpu field specifies the number of CPUs for an Airflow worker.
- The worker.memory_gb field specifies the amount of memory for an Airflow worker.
- The worker.storage_gb field specifies the amount of disk space for an Airflow worker.
- The worker.min_count field specifies the minimum number of workers in your environment.
- The worker.max_count field specifies the maximum number of workers in your environment.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
workloads_config {
scheduler {
cpu = SCHEDULER_CPU
memory_gb = SCHEDULER_MEMORY
storage_gb = SCHEDULER_STORAGE
count = SCHEDULER_COUNT
}
triggerer {
count = TRIGGERER_COUNT
cpu = TRIGGERER_CPU
memory_gb = TRIGGERER_MEMORY
}
web_server {
cpu = WEB_SERVER_CPU
memory_gb = WEB_SERVER_MEMORY
storage_gb = WEB_SERVER_STORAGE
}
worker {
cpu = WORKER_CPU
memory_gb = WORKER_MEMORY
storage_gb = WORKER_STORAGE
min_count = WORKERS_MIN
max_count = WORKERS_MAX
}
}
environment_size = "ENVIRONMENT_SIZE"
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
- SERVICE_ACCOUNT with the service account for your environment.
- SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
- SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
- SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
- SCHEDULER_COUNT with the number of schedulers.
- TRIGGERER_COUNT with the number of triggerers.
- TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
- TRIGGERER_MEMORY with the amount of memory for a triggerer, in GB.
- WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.
- WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.
- WEB_SERVER_STORAGE with the disk size for the web server, in GB.
- WORKER_CPU with the number of CPUs for a worker, in vCPU units.
- WORKER_MEMORY with the amount of memory for a worker, in GB.
- WORKER_STORAGE with the disk size for a worker, in GB.
- WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.
- WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.
- ENVIRONMENT_SIZE with the environment size: ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
workloads_config {
scheduler {
cpu = 2.5
memory_gb = 2.5
storage_gb = 2
count = 1
}
triggerer {
count = 1
cpu = 0.5
memory_gb = 0.5
}
dag_processor {
cpu = 1
memory_gb = 2
storage_gb = 1
count = 1
}
web_server {
cpu = 1
memory_gb = 2.5
storage_gb = 2
}
worker {
cpu = 1
memory_gb = 2
storage_gb = 2
min_count = 2
max_count = 4
}
}
environment_size = "ENVIRONMENT_SIZE_SMALL"
node_config {
service_account = "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Step 5. (Optional) Enable high resilience mode
Highly resilient (Highly Available) Cloud Composer environments are environments that use built-in redundancy and failover mechanisms that reduce the environment's susceptibility to zonal failures and single point of failure outages.
A highly resilient environment is multi-zonal and runs across at least two zones of a selected region. The following components run in separate zones:
- Exactly two Airflow schedulers
- At least two triggerers (if the number of triggerers isn't set to 0)
- Two web servers
The minimum number of workers is set to two, and your environment's cluster distributes worker instances between zones. In case of a zonal outage, affected worker instances are rescheduled in a different zone. The Cloud SQL component of a highly resilient environment has a primary instance and a standby instance that are distributed between zones.
Console
On the Create environment page:
In the Resilience mode section, select High resilience.
In the Environment resources section, select scale parameters for a highly resilient environment. Highly resilient environments require exactly two schedulers, zero or between two and ten triggerers, and at least two workers:
- Click Custom.
- In the Number of schedulers drop-down list, select 2.
- In the Number of triggerers drop-down list, select 0, or a value between 2 and 10. Configure the CPU and Memory allocation for your triggerers.
- In the Minimum number of workers drop-down list, select 2 or more, depending on the required number of workers.
In the Network configuration section:
In the Networking type, select Private IP environment.
If required, specify other networking parameters.
gcloud
When you create an environment, the --enable-high-resilience argument
enables the high resilience mode.
Set the following arguments:
- --enable-high-resilience
- --enable-private-environment, and other networking parameters for a Private IP environment, if required
- --scheduler-count to 2
- --triggerer-count to 0 or a value between 2 and 10. If you use triggerers, the --triggerer-cpu and --triggerer-memory flags are also required for environment creation.
  For more information about the --triggerer-count, --triggerer-cpu, and --triggerer-memory flags, see Configure environment scale and performance parameters.
- --min-workers to 2 or more
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--enable-high-resilience \
--enable-private-environment \
--scheduler-count 2 \
--triggerer-count 2 \
--triggerer-cpu 0.5 \
--triggerer-memory 0.5 \
--min-workers 2
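The scale constraints for a highly resilient environment can be checked before you submit the command. The following Python sketch encodes the three rules stated in this step: exactly two schedulers, zero or between two and ten triggerers, and at least two workers. `high_resilience_errors` is a hypothetical helper, not part of any Google Cloud tool.

```python
def high_resilience_errors(scheduler_count: int, triggerer_count: int,
                           min_workers: int) -> list:
    """Return a list of rule violations; an empty list means all rules pass."""
    errors = []
    if scheduler_count != 2:
        errors.append("scheduler count must be exactly 2")
    if triggerer_count != 0 and not 2 <= triggerer_count <= 10:
        errors.append("triggerer count must be 0 or between 2 and 10")
    if min_workers < 2:
        errors.append("minimum number of workers must be 2 or more")
    return errors


# The values from the command above satisfy all three rules.
print(high_resilience_errors(scheduler_count=2, triggerer_count=2, min_workers=2))
```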
API
When you create an environment, in the Environment > EnvironmentConfig resource, enable the high resilience mode.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"resilience_mode": "HIGH_RESILIENCE",
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"resilience_mode": "HIGH_RESILIENCE",
"nodeConfig": {
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Terraform
When you create an environment, the resilience_mode field in the config
block enables the high resilience mode.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
resilience_mode = "HIGH_RESILIENCE"
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
resilience_mode = "HIGH_RESILIENCE"
node_config {
service_account = "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Step 6. (Optional) Specify a zone for the environment's database
You can specify a preferred Cloud SQL zone when creating a standard resilience environment.
Console
On the Create environment page:
In the Advanced configuration section, expand the Show advanced configuration item.
In the Airflow database zone list, select a preferred Cloud SQL zone.
gcloud
When you create an environment, the --cloud-sql-preferred-zone argument
specifies a preferred Cloud SQL zone.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--cloud-sql-preferred-zone SQL_ZONE
Replace the following:
SQL_ZONE: preferred Cloud SQL zone. This zone must be located in the region where the environment is located.
Example:
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "example-account@example-project.iam.gserviceaccount.com" \
--cloud-sql-preferred-zone us-central1-a
API
When you create an environment, in the Environment > DatabaseConfig resource, specify the preferred Cloud SQL zone.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"databaseConfig": {
"zone": "SQL_ZONE"
},
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace the following:
SQL_ZONE: preferred Cloud SQL zone. This zone must be located in the region where the environment is located.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"databaseConfig": {
"zone": "us-central1-a"
},
"nodeConfig": {
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Terraform
When you create an environment, the zone field in the database_config
block specifies the preferred Cloud SQL zone.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
database_config {
zone = "SQL_ZONE"
}
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace the following:
SQL_ZONE: preferred Cloud SQL zone. This zone must be located in the region where the environment is located.
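Because the preferred zone must be in the environment's region, a quick check can catch a mismatch before the request is sent. This is a minimal sketch that assumes zone names follow the REGION-SUFFIX pattern seen in the example (us-central1-a in us-central1). The helper name zone_in_region is hypothetical.

```python
def zone_in_region(zone, region):
    """Return True if a zone name such as "us-central1-a" belongs to the region."""
    # Assumption: a zone name is the region name plus "-" and a short suffix.
    zone_region, separator, suffix = zone.rpartition("-")
    return bool(separator) and bool(suffix) and zone_region == region


print(zone_in_region("us-central1-a", "us-central1"))
print(zone_in_region("europe-west1-b", "us-central1"))
```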
Step 7. (Optional) Configure your environment's networking
Networking parameters depend on the type of environment that you want to create:
Public IP environment. Use the default networking parameters.
Private IP environment (with PSC). In this configuration, your environment uses Private Service Connect for connectivity.
Configure your Private IP environment:
- Configure your project's networking for Private IP environments.
- Configure Private Service Connect when you create your environment.
- Specify other parameters for your Private IP environment, as described further in this section.
For a Private IP environment with PSC, you need to know:
- Your VPC network ID
- Your VPC subnetwork ID
Two secondary IP ranges in your VPC subnetwork:
- Secondary IP range for pods
- Secondary IP range for services
IP ranges for the components of the environment:
- GKE Control Plane IP range. IP range for the GKE control plane.
- Cloud Composer connection subnetwork. IP range for the Cloud Composer connection subnetwork.
Private IP environment (VPC peerings). In this configuration, your environment uses VPC peerings for connectivity.
Configure your Private IP environment:
- Configure your project's networking for Private IP environments.
- Specify other parameters for your Private IP environment, as described further in this section.
For a Private IP environment with VPC peerings, you need to know:
- Your VPC network ID
- Your VPC subnetwork ID
Two secondary IP ranges in your VPC subnetwork:
- Secondary IP range for pods
- Secondary IP range for services
IP ranges for the components of the environment:
- IP range for the GKE control plane.
- IP range for VPC peering to export from the internal Cloud Composer network to the selected network. Cloud Composer infrastructure components use IP addresses from this range.
- IP range for the Cloud SQL instance.
For a Shared VPC environment, you must do additional networking setup for the host project, then create a Public or a Private IP environment in a service project. Follow the instructions on the Configuring Shared VPC page.
For a Shared VPC environment, you need to know:
- Your host project VPC network ID
- Your host project VPC subnetwork ID
Two secondary IP ranges in your host project VPC subnetwork:
- Secondary IP range for pods
- Secondary IP range for services
When creating a Public IP Shared VPC environment, you still need to specify your host project VPC network, subnetwork, and secondary IP ranges for pods and services.
To create a VPC SC environment, you must create a service perimeter and then create Private IP environments inside this perimeter. Follow the instructions outlined in Configuring VPC Service Controls.
Additional networking options for environments are:
- Privately used public IP addresses. If you want to use more IP addresses, your environment can privately use certain public IP address ranges as internal, subnet IP address ranges for pods and services.
- Authorized networks. If you want to access the control plane of your Private IP environment using HTTPS, you can use authorized networks to specify CIDR ranges that can do so.
- IP Masquerade agent. By using environments with the IP Masquerade agent, you can use many-to-one IP address translations in your environment's networking configurations. For more information about creating environments with IP Masquerade agent, see Enable the IP Masquerade agent.
Console
To create a Private IP environment:
Make sure that your networking is configured for the type of environment that you want to create.
In the Network configuration section, expand the Show network configuration item.
In the Network drop-down list, select your VPC network ID.
In the Subnetwork drop-down list, select your VPC subnetwork ID.
In the Secondary IP range for pods section, select or specify the secondary IP range for pods. You can use an existing secondary range in your VPC network, or choose to use an auto-created range.
In the Secondary IP range for services section, select or specify the secondary IP range for services. You can use an existing secondary range in your VPC network, or choose to use an auto-created range.
In the Networking type section, select the Private IP environment option to create a Private IP environment.
In the Composer connectivity section, select the networking type for your environment and specify IP ranges for environment components:
For an environment that uses Private Service Connect:
Select Private Service Connect for an environment that uses Private Service Connect.
In the Composer connection subnetwork section specify an IP range for the Cloud Composer connection subnetwork. The address for the PSC endpoint is selected from this range. You can specify a custom range, or choose to use the default one.
For an environment that uses VPC peerings:
Select VPC peerings for an environment that uses VPC peerings.
In the IP range for Composer tenant network section specify an IP range for the Cloud Composer tenant network. This network hosts the SQL proxy component of your environment. You can specify a custom range, or choose to use the default one.
In the IP range for Cloud SQL network section, specify an IP range for the Cloud SQL instance. You can specify a custom range, or choose to use the default one.
In the IP range for GKE control plane network section, specify an IP range for the GKE control plane:
To use the default IP range for the region where your environment is located, select Default IP range.
To specify a custom IP range, select Custom IP range and enter a range in the CIDR notation in the GKE cluster master private IP field.
Select the level of access for the GKE control plane. The control plane has two endpoints. One endpoint is private, for use by cluster nodes and VMs. The other endpoint is public. You can specify the level of access for the public endpoint:
To enable access to the public endpoint from authorized networks, select the Access cluster control plane endpoint using its external IP address checkbox.
Using this option sets the level of access for the control plane to "Public endpoint access enabled, authorized networks enabled". This provides restricted access to the control plane from authorized networks. By default, no source IP addresses are specified. You can add authorized networks to the cluster.
To disable access to the public endpoint from authorized networks, clear the Access cluster control plane endpoint using its external IP address checkbox.
Using this option sets the level of access for the control plane to "Public endpoint access disabled". This prevents all internet access to the control plane.
gcloud
Make sure that your networking is configured for the type of environment that you want to create.
When you create an environment, the following arguments control the networking parameters. If you omit a parameter, the default value is used.
- --enable-private-environment enables a Private IP environment.
- --network specifies your VPC network ID.
- --subnetwork specifies your VPC subnetwork ID.
- --cluster-secondary-range-name or --cluster-ipv4-cidr configures the secondary range for pods.
- --services-secondary-range-name or --services-ipv4-cidr configures the secondary range for services.
- --master-ipv4-cidr specifies a range for the GKE control plane.
- (Environments with PSC) --connection-subnetwork specifies a range for the Cloud Composer connection subnetwork, which hosts the PSC endpoint.
- (Environments with VPC peerings) --connection-type=VPC_PEERING specifies that an environment must use VPC peerings.
- (Environments with VPC peerings) --composer-network-ipv4-cidr specifies a range for the Cloud Composer tenant network. This network hosts the SQL proxy component of your environment.
- (Environments with VPC peerings) --cloud-sql-ipv4-cidr specifies a range for the Cloud SQL instance.
- --enable-private-endpoint controls the level of access for the GKE control plane. The control plane has two endpoints. One endpoint is private, for use by cluster nodes and VMs. The other endpoint is public. You can specify the level of access for the public endpoint:
  - To enable access to the public endpoint from authorized networks, omit the --enable-private-endpoint argument. Using this option sets the level of access for the control plane to "Public endpoint access enabled, authorized networks enabled". This provides restricted access to the control plane from authorized networks. By default, no source IP addresses are specified. You can add authorized networks to the cluster.
  - To disable access to the public endpoint from authorized networks, specify the --enable-private-endpoint argument. Using this option sets the level of access for the control plane to "Public endpoint access disabled". This prevents all internet access to the control plane.
- The --enable-master-authorized-networks and --master-authorized-networks arguments configure authorized networks for your environment.
- --enable-privately-used-public-ips configures privately used public IP addresses for your environment.
- --enable-ip-masq-agent enables the IP Masquerade agent.
Example (Private IP environment):
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--enable-private-environment \
--network NETWORK_ID \
--subnetwork SUBNETWORK_ID \
--cluster-ipv4-cidr PODS_RANGE \
--services-ipv4-cidr SERVICES_RANGE \
--master-ipv4-cidr CONTROL_PLANE_RANGE \
--connection-subnetwork COMPOSER_PSC_RANGE
Replace:
- NETWORK_ID with your VPC network ID.
- SUBNETWORK_ID with your VPC subnetwork ID.
- PODS_RANGE with the secondary range for pods.
- SERVICES_RANGE with the secondary range for services.
- CONTROL_PLANE_RANGE with the IP range for the GKE control plane.
- COMPOSER_PSC_RANGE with the range for the Cloud Composer connection subnetwork.
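When the command is generated by a script, the networking arguments described in this step can be assembled from a few inputs. The sketch below is a hypothetical helper (private_ip_args is not part of gcloud) that uses only the flags listed above. It switches between Private Service Connect and VPC peerings with a single parameter, and the placeholder values are the ones used in this step.

```python
import shlex


def private_ip_args(network, subnetwork, pods_range, services_range,
                    control_plane_range, connection_subnetwork=None,
                    use_vpc_peering=False):
    """Assemble networking arguments for `gcloud composer environments create`."""
    args = [
        "--enable-private-environment",
        "--network", network,
        "--subnetwork", subnetwork,
        "--cluster-ipv4-cidr", pods_range,
        "--services-ipv4-cidr", services_range,
        "--master-ipv4-cidr", control_plane_range,
    ]
    if use_vpc_peering:
        # Environments with VPC peerings are selected with --connection-type.
        args.append("--connection-type=VPC_PEERING")
    elif connection_subnetwork:
        # Environments with PSC can set the connection subnetwork range.
        args += ["--connection-subnetwork", connection_subnetwork]
    return args


psc_args = private_ip_args(
    "NETWORK_ID", "SUBNETWORK_ID", "PODS_RANGE", "SERVICES_RANGE",
    "CONTROL_PLANE_RANGE", connection_subnetwork="COMPOSER_PSC_RANGE",
)
# shlex.join quotes each argument so the result can be pasted into a shell.
print(shlex.join(psc_args))
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting mistakes if the list is later passed to a process launcher.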
Step 8. (Optional) Add network tags
Network tags are applied to all node VMs in your environment's cluster. Tags are used to identify valid sources or targets for network firewalls. Each tag within the list must comply with RFC 1035.
For example, you might want to add network tags if you plan to restrict traffic for a Private IP environment with firewall rules.
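Tag names can be checked against the RFC 1035 label format before you create the environment. The following sketch assumes the lowercase form of the label rule (starts with a letter, contains letters, digits, or hyphens, ends with a letter or digit, and is at most 63 characters long). The helper name is_valid_network_tag is hypothetical.

```python
import re

# RFC 1035 label, lowercase form: a letter, then up to 61 letters, digits,
# or hyphens, then a letter or digit (63 characters at most in total).
NETWORK_TAG_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,61}[a-z0-9])?")


def is_valid_network_tag(tag):
    """Return True if the tag matches the assumed RFC 1035 label pattern."""
    return NETWORK_TAG_PATTERN.fullmatch(tag) is not None


for tag in ["group1", "production", "1group", "trailing-"]:
    print(tag, is_valid_network_tag(tag))
```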
Console
On the Create environment page:
- Locate the Network configuration section.
- In the Network tags field, enter network tags for your environment.
gcloud
When you create an environment, the following arguments control network tags:
- --tags specifies a comma-separated list of network tags applied to all node VMs.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--tags TAGS
Replace:
- TAGS with a comma-separated list of network tags.
Example:
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--tags group1,production
API
When you create an environment, in the Environment > EnvironmentConfig resource, specify network tags for your environment.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"nodeConfig": {
"tags": [
"TAG"
],
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace:
- TAG with a network tag.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"nodeConfig": {
"tags": [
"group1",
"production"
],
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Terraform
When you create an environment, the following fields define network tags for your environment:
- The tags field in the node_config block specifies a comma-separated list of network tags applied to all node VMs.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
node_config {
tags = ["TAGS"]
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace:
- TAGS with a comma-separated list of network tags.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
node_config {
tags = ["group1","production"]
service_account = "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Step 9. (Optional) Configure web server network access
The Airflow web server access parameters do not depend on the type of your environment. Instead, you can configure web server access separately. For example, a Private IP environment can still have the Airflow UI accessible from the internet.
You can't configure the allowed IP ranges using private IP addresses.
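A script that prepares the allowed ranges can reject private addresses up front. The sketch below assumes that "private IP addresses" refers to the RFC 1918 blocks (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16). This page does not list the exact ranges, so treat the list as an assumption. The helper name is_rfc1918_range is hypothetical.

```python
import ipaddress

# Assumption: the restriction applies to the RFC 1918 private address blocks.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918_range(cidr):
    """Return True if the CIDR range overlaps an RFC 1918 private block."""
    network = ipaddress.ip_network(cidr)  # raises ValueError for invalid CIDR
    return network.version == 4 and any(
        network.overlaps(block) for block in RFC1918_BLOCKS
    )


print(is_rfc1918_range("192.168.1.0/24"))
print(is_rfc1918_range("192.0.2.0/24"))
```

The blocks are compared explicitly instead of using the is_private attribute of the ipaddress module, because that attribute also covers other special-purpose ranges, including the 192.0.2.0/24 documentation range used in the examples on this page.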
Console
On the Create environment page:
In the Network configuration section, expand the Show network configuration item.
In the Web server network access control section:
To provide access to the Airflow web server from all IP addresses, select Allow access from all IP addresses.
To restrict access only to specific IP ranges, select Allow access only from specific IP addresses. In the IP range field, specify an IP range in the CIDR notation. In the Description field, specify an optional description for this range. If you want to specify more than one range, click Add IP range.
To forbid access for all IP addresses, select Allow access only from specific IP addresses and click Delete item next to the empty range entry.
gcloud
When you create an environment, the following arguments control the web server access level:
- --web-server-allow-all provides access to Airflow from all IP addresses. This is the default option.
- --web-server-allow-ip restricts access only to specific source IP ranges. To specify several IP ranges, use this argument multiple times.
- --web-server-deny-all forbids access for all IP addresses.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--web-server-allow-ip ip_range=WS_IP_RANGE,description=WS_RANGE_DESCRIPTION
Replace:
- WS_IP_RANGE with the IP range, in the CIDR notation, that can access Airflow UI.
- WS_RANGE_DESCRIPTION with the description of the IP range.
Example:
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "example-account@example-project.iam.gserviceaccount.com" \
--web-server-allow-ip ip_range=192.0.2.0/24,description="office net 1" \
--web-server-allow-ip ip_range=192.0.4.0/24,description="office net 3"
API
When you create an environment, in the Environment > EnvironmentConfig resource, specify web server access parameters.
- To provide access to the Airflow web server from all IP addresses, omit webServerNetworkAccessControl.
- To restrict access only to specific IP ranges, specify one or more ranges in allowedIpRanges.
- To forbid access for all IP addresses, add allowedIpRanges and make it an empty list. Do not specify IP ranges in it.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"webServerNetworkAccessControl": {
"allowedIpRanges": [
{
"value": "WS_IP_RANGE",
"description": "WS_RANGE_DESCRIPTION"
}
]
},
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace:
- WS_IP_RANGE with the IP range, in the CIDR notation, that can access Airflow UI.
- WS_RANGE_DESCRIPTION with the description of the IP range.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"webServerNetworkAccessControl": {
"allowedIpRanges": [
{
"value": "192.0.2.0/24",
"description": "office net 1"
},
{
"value": "192.0.4.0/24",
"description": "office net 3"
}
]
},
"nodeConfig": {
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
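The three access levels map to three shapes of the request body: the webServerNetworkAccessControl field omitted, a populated allowedIpRanges list, or an empty allowedIpRanges list. The following sketch builds the matching fragment of the config object. The function name and the mode strings are hypothetical, and only the field names come from this page.

```python
import ipaddress
import json


def web_server_access_config(mode, ranges=()):
    """Build the web server access fragment of the environment config.

    mode is "allow_all", "allow_ranges", or "deny_all" (hypothetical names);
    ranges is a sequence of (cidr, description) pairs for "allow_ranges".
    """
    if mode == "allow_all":
        # Omitting webServerNetworkAccessControl allows all IP addresses.
        return {}
    if mode == "deny_all":
        # An empty allowedIpRanges list forbids access for all IP addresses.
        return {"webServerNetworkAccessControl": {"allowedIpRanges": []}}
    if mode == "allow_ranges":
        if not ranges:
            raise ValueError("allow_ranges needs at least one IP range")
        allowed = []
        for value, description in ranges:
            ipaddress.ip_network(value)  # raises ValueError for invalid CIDR
            allowed.append({"value": value, "description": description})
        return {"webServerNetworkAccessControl": {"allowedIpRanges": allowed}}
    raise ValueError(f"unknown mode: {mode}")


fragment = web_server_access_config(
    "allow_ranges",
    [("192.0.2.0/24", "office net 1"), ("192.0.4.0/24", "office net 3")],
)
print(json.dumps(fragment, indent=2))
```

Keeping "allow all" and "deny all" as separate modes makes the difference between an omitted field and an empty list explicit, which is easy to get wrong when the body is written by hand.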
Terraform
When you create an environment, the allowed_ip_range block in the
web_server_network_access_control block contains IP ranges that can access
the web server.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
web_server_network_access_control {
allowed_ip_range {
value = "WS_IP_RANGE"
description = "WS_RANGE_DESCRIPTION"
}
}
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace:
- WS_IP_RANGE with the IP range, in the CIDR notation, that can access Airflow UI.
- WS_RANGE_DESCRIPTION with the description of the IP range.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
web_server_network_access_control {
allowed_ip_range {
value = "192.0.2.0/24"
description = "office net 1"
}
allowed_ip_range {
value = "192.0.4.0/24"
description = "office net 3"
}
}
node_config {
service_account = "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Step 10. (Optional) Specify Airflow configuration overrides and environment variables
You can set up Airflow configuration overrides and environment variables when you create an environment. As an alternative, you can do it later, after your environment is created.
Some Airflow configuration options are blocked and you can't override them.
For the list of available Airflow configuration options, see Configuration reference for Airflow 2 and Airflow 1.10.*
To specify Airflow configuration overrides and environment variables:
Console
On the Create environment page:
In the Environment variables section, click Add environment variable.
Enter the Name and Value for the environment variable.
In the Airflow configuration overrides section, click Add Airflow configuration override.
Enter the Section, Key, and Value for the configuration option override.
For example:
Section    Key              Value
webserver  dag_orientation  TB
gcloud
When you create an environment, the following arguments control environment variables and Airflow configuration overrides:
- --env-variables specifies a comma-separated list of environment variables. Variable names may contain upper and lowercase letters, digits, and underscores, but they may not begin with a digit.
- --airflow-configs specifies a comma-separated list of keys and values for Airflow configuration overrides.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--env-variables ENV_VARS \
--airflow-configs CONFIG_OVERRIDES
Replace:
- ENV_VARS with a list of comma-separated NAME=VALUE pairs for environment variables.
- CONFIG_OVERRIDES with a list of comma-separated SECTION-KEY=VALUE pairs for configuration overrides. Separate the name of the configuration section from the key name with a - symbol. For example: core-dags_are_paused_at_creation.
Example:
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "example-account@example-project.iam.gserviceaccount.com" \
--env-variables SENDGRID_MAIL_FROM=user@example.com,SENDGRID_API_KEY=example-key \
--airflow-configs core-dags_are_paused_at_creation=True,webserver-dag_orientation=TB
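The NAME=VALUE and SECTION-KEY=VALUE strings can be generated from dictionaries, which also gives a place to enforce the naming rule for environment variables. This is a minimal sketch with hypothetical helper names. It does not handle values that themselves contain commas, which would need extra escaping.

```python
import re

# Letters, digits, and underscores; the first character is not a digit.
ENV_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_env_variables(variables):
    """Format a dict of environment variables as NAME=VALUE pairs."""
    for name in variables:
        if not ENV_NAME_PATTERN.fullmatch(name):
            raise ValueError(f"invalid environment variable name: {name!r}")
    return ",".join(f"{name}={value}" for name, value in variables.items())


def format_airflow_configs(overrides):
    """Format a dict keyed by (section, key) tuples as SECTION-KEY=VALUE pairs."""
    return ",".join(
        f"{section}-{key}={value}" for (section, key), value in overrides.items()
    )


print(format_env_variables({
    "SENDGRID_MAIL_FROM": "user@example.com",
    "SENDGRID_API_KEY": "example-key",
}))
print(format_airflow_configs({
    ("core", "dags_are_paused_at_creation"): "True",
    ("webserver", "dag_orientation"): "TB",
}))
```

Keying the overrides by a (section, key) tuple keeps the section and option name separate until the last step, so an option name that contains a hyphen cannot be confused with the separator.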
API
When you create an environment, in the Environment > EnvironmentConfig resource, specify environment variables and Airflow configuration overrides.
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"softwareConfig": {
"airflowConfigOverrides": {
"SECTION-KEY": "OVERRIDE_VALUE"
},
"envVariables": {
"VAR_NAME": "VAR_VALUE"
}
},
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace:
- SECTION with the section in the configuration file where the Airflow configuration option is located.
- KEY with the name of the Airflow configuration option.
- OVERRIDE_VALUE with a value of the Airflow configuration option.
- VAR_NAME with the name of the environment variable.
- VAR_VALUE with the value of the environment variable.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"softwareConfig": {
"airflowConfigOverrides": {
"core-dags_are_paused_at_creation": "True",
"webserver-dag_orientation": "TB"
},
"envVariables": {
"SENDGRID_MAIL_FROM": "user@example.com",
"SENDGRID_API_KEY": "example-key"
}
},
"nodeConfig": {
"serviceAccount": "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Terraform
When you create an environment, the following blocks control environment variables and Airflow configuration overrides:
- The env_variables block in the software_config block specifies environment variables. Variable names may contain upper and lowercase letters, digits, and underscores, but they may not begin with a digit.
- The airflow_config_overrides block in the software_config block specifies Airflow configuration overrides.
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
software_config {
airflow_config_overrides = {
SECTION-KEY = "OVERRIDE_VALUE"
}
env_variables = {
VAR_NAME = "VAR_VALUE"
}
}
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace:
- SECTION with the section in the configuration file where the Airflow configuration option is located.
- KEY with the name of the Airflow configuration option.
- OVERRIDE_VALUE with a value of the Airflow configuration option.
- VAR_NAME with the name of the environment variable.
- VAR_VALUE with the value of the environment variable.
Example:
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
software_config {
airflow_config_overrides = {
core-dags_are_paused_at_creation = "True"
webserver-dag_orientation = "TB"
}
env_variables = {
SENDGRID_MAIL_FROM = "user@example.com"
SENDGRID_API_KEY = "example-key"
}
}
node_config {
service_account = "example-account@example-project.iam.gserviceaccount.com"
}
}
}
Step 11. (Optional) Specify maintenance windows
Default maintenance windows in Cloud Composer 2 are from 00:00:00 to 04:00:00 (GMT) on Friday, Saturday, and Sunday every week.
To specify custom maintenance windows for your environment:
Console
On the Create environment page:
Locate the Maintenance windows section.
In the Timezone drop-down list, choose a time zone for maintenance windows.
Set Start time, Days, and Length, so that:
At least 12 hours of time is allocated in a single week.
You can use several time slots, but each slot duration must be at least 4 hours.
For example, a period of 4 hours every Monday, Wednesday, and Friday provides the required amount of time.
gcloud
The following arguments define maintenance windows parameters:
- --maintenance-window-start sets the start time of a maintenance window.
- --maintenance-window-end sets the end time of a maintenance window.
- --maintenance-window-recurrence sets the maintenance window recurrence.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--maintenance-window-start 'DATETIME_START' \
--maintenance-window-end 'DATETIME_END' \
--maintenance-window-recurrence 'MAINTENANCE_RECURRENCE'
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- DATETIME_START with the start date and time in the date/time input format. Only the specified time of the day is used; the specified date is ignored.
- DATETIME_END with the end date and time in the date/time input format. Only the specified time of the day is used; the specified date is ignored. The specified date and time must be after the start date.
- MAINTENANCE_RECURRENCE with an RFC 5545 RRULE for maintenance windows recurrence. Cloud Composer supports two formats:
  - The FREQ=DAILY format specifies a daily recurrence.
  - The FREQ=WEEKLY;BYDAY=SU,MO,TU,WE,TH,FR,SA format specifies a recurrence on selected days of the week.
The following example specifies a 6-hour maintenance window between 01:00 and 07:00 (UTC) on Wednesdays, Saturdays, and Sundays. The 1 January, 2023 date is ignored.
gcloud composer environments create example-environment \
--location us-central1 \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "example-account@example-project.iam.gserviceaccount.com" \
--maintenance-window-start '2023-01-01T01:00:00Z' \
--maintenance-window-end '2023-01-01T07:00:00Z' \
--maintenance-window-recurrence 'FREQ=WEEKLY;BYDAY=SU,WE,SA'
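The time limits listed for the console (at least 12 hours per week, and at least 4 hours per slot) can be checked for a start time, end time, and recurrence before you create the environment. The sketch below assumes that the same limits apply when the window is set through gcloud, the API, or Terraform, and it handles only the two recurrence formats described in this step. The helper names are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def maintenance_hours(start, end, recurrence):
    """Return (hours per slot, hours per week) for a maintenance window."""
    slot = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    slot_hours = slot.total_seconds() / 3600
    # Split "FREQ=WEEKLY;BYDAY=SU,WE,SA" into {"FREQ": "WEEKLY", "BYDAY": "SU,WE,SA"}.
    rule = dict(part.split("=", 1) for part in recurrence.split(";"))
    if rule.get("FREQ") == "DAILY":
        days_per_week = 7
    elif rule.get("FREQ") == "WEEKLY" and "BYDAY" in rule:
        days_per_week = len(set(rule["BYDAY"].split(",")))
    else:
        raise ValueError(f"unsupported recurrence: {recurrence}")
    return slot_hours, slot_hours * days_per_week


def window_is_long_enough(start, end, recurrence):
    """Check the window against the 4-hour slot and 12-hour weekly minimums."""
    slot_hours, weekly_hours = maintenance_hours(start, end, recurrence)
    return slot_hours >= 4 and weekly_hours >= 12


# Values from the example in this step: 6-hour slots on three days of the week.
print(maintenance_hours(
    "2023-01-01T01:00:00Z", "2023-01-01T07:00:00Z", "FREQ=WEEKLY;BYDAY=SU,WE,SA"
))
```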
API
When you create an environment, in the Environment > EnvironmentConfig resource, specify maintenance windows parameters:
{
"name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
"config": {
"maintenanceWindow": {
"startTime": "DATETIME_START",
"endTime": "DATETIME_END",
"recurrence": "MAINTENANCE_RECURRENCE"
},
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Replace:
- DATETIME_START with the start date and time in the date/time input format. Only the specified time of the day is used; the specified date is ignored.
- DATETIME_END with the end date and time in the date/time input format. Only the specified time of the day is used; the specified date is ignored. The specified date and time must be after the start date.
- MAINTENANCE_RECURRENCE with an RFC 5545 RRULE for maintenance windows recurrence. Cloud Composer supports two formats:
  - The FREQ=DAILY format specifies a daily recurrence.
  - The FREQ=WEEKLY;BYDAY=SU,MO,TU,WE,TH,FR,SA format specifies a recurrence on selected days of the week.
The following example specifies a 6-hour maintenance window between 01:00 and 07:00 (UTC) on Wednesdays, Saturdays, and Sundays. The 1 January, 2023 date is ignored.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
"name": "projects/example-project/locations/us-central1/environments/example-environment",
"config": {
"maintenanceWindow": {
"startTime": "2023-01-01T01:00:00Z",
"endTime": "2023-01-01T07:00:00Z",
"recurrence": "FREQ=WEEKLY;BYDAY=SU,WE,SA"
},
"nodeConfig": {
"serviceAccount": "SERVICE_ACCOUNT"
}
}
}
Terraform
The maintenance_window block specifies the maintenance windows for your
environment:
resource "google_composer_environment" "example" {
provider = google-beta
name = "ENVIRONMENT_NAME"
region = "LOCATION"
config {
maintenance_window {
start_time = "DATETIME_START"
end_time = "DATETIME_END"
recurrence = "MAINTENANCE_RECURRENCE"
}
node_config {
service_account = "SERVICE_ACCOUNT"
}
}
}
Replace:
- DATETIME_START with the start date and time in the date/time input format. Only the specified time of the day is used; the specified date is ignored.
- DATETIME_END with the end date and time in the date/time input format. Only the specified time of the day is used; the specified date is ignored. The specified date and time must be after the start date.
- MAINTENANCE_RECURRENCE with an RFC 5545 RRULE for maintenance windows recurrence. Cloud Composer supports two formats:
  - The FREQ=DAILY format specifies a daily recurrence.
  - The FREQ=WEEKLY;BYDAY=SU,MO,TU,WE,TH,FR,SA format specifies a recurrence on selected days of the week.
The following example specifies a 6-hour maintenance window between 01:00 and 07:00 (UTC) on Wednesdays, Saturdays, and Sundays. The 1 January, 2023 date is ignored.
resource "google_composer_environment" "example" {
provider = google-beta
name = "example-environment"
region = "us-central1"
config {
maintenance_window {
start_time = "2023-01-01T01:00:00Z"
end_time = "2023-01-01T07:00:00Z"
recurrence = "FREQ=WEEKLY;BYDAY=SU,WE,SA"
}
}
}
Step 12. (Optional) Data lineage integration
Data lineage is a Dataplex Universal Catalog feature that lets you track data movement.
Data lineage integration is available in Cloud Composer 2 versions 2.1.2 and later with Airflow versions 2.2.5 and later.
Data lineage integration is automatically enabled in a new Cloud Composer environment if the following conditions are met:
Data Lineage API is enabled in your project. For more information, see Enabling Data Lineage API in Dataplex Universal Catalog documentation.
A custom Lineage Backend isn't configured in Airflow.
You can disable data lineage integration when you create an environment, for example, if you want to override the automatic behavior or enable data lineage later, after the environment is created.
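The conditions for automatic enablement can be expressed as a small predicate. The sketch below is illustrative only: the function names are hypothetical, and the two boolean inputs (whether the Data Lineage API is enabled and whether a custom lineage backend is configured) are assumed to be determined elsewhere.

```python
def parse_version(version):
    """Turn a dotted version such as "2.10.5" into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def lineage_enabled_by_default(composer_version, airflow_version,
                               lineage_api_enabled, custom_backend_configured):
    """Return True if a new environment would meet the stated conditions."""
    supported = (
        parse_version(composer_version) >= (2, 1, 2)
        and parse_version(airflow_version) >= (2, 2, 5)
    )
    return supported and lineage_api_enabled and not custom_backend_configured


# Versions from the image used in the examples: composer-2.14.4-airflow-2.10.5.
print(lineage_enabled_by_default("2.14.4", "2.10.5", True, False))
```

Versions are compared as integer tuples because a plain string comparison would order "2.10.5" before "2.2.5".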
Console
To disable Data lineage integration, on the Create environment page:
In the Advanced configuration section, expand the Show advanced configuration item.
In the Dataplex data lineage integration section, select Disable integration with Dataplex data lineage.
gcloud
When you create an environment, the --disable-cloud-data-lineage-integration
argument disables the data lineage integration.
gcloud composer environments create ENVIRONMENT_NAME \
--location LOCATION \
--image-version composer-2.14.4-airflow-2.10.5 \
--service-account "SERVICE_ACCOUNT" \
--disable-cloud-data-lineage-integration
Replace:
- ENVIRONMENT_NAME with the name of the environment.
- LOCATION with the region where the environment is located.
Example:
gcloud composer environments create example-environment \
    --location us-central1 \
    --image-version composer-2.14.4-airflow-2.10.5 \
    --service-account "example-account@example-project.iam.gserviceaccount.com" \
    --disable-cloud-data-lineage-integration
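If you generate this command from a script, building it as an argument list keeps each value, such as the service account email, a single argument. The following Python sketch only assembles and prints the command. It does not run it. The helper name `build_create_command` is hypothetical, and it uses only the flags shown in this step.

```python
import shlex


def build_create_command(
    name: str,
    location: str,
    image_version: str,
    service_account: str,
    disable_lineage: bool = False,
) -> list[str]:
    """Assemble the gcloud arguments for creating an environment."""
    argv = [
        "gcloud", "composer", "environments", "create", name,
        "--location", location,
        "--image-version", image_version,
        "--service-account", service_account,
    ]
    if disable_lineage:
        # Boolean flag: it takes no value.
        argv.append("--disable-cloud-data-lineage-integration")
    return argv


command = build_create_command(
    "example-environment",
    "us-central1",
    "composer-2.14.4-airflow-2.10.5",
    "example-account@example-project.iam.gserviceaccount.com",
    disable_lineage=True,
)
# shlex.join() quotes arguments where needed, so the output can be pasted into a shell.
print(shlex.join(command))
```

To run the command from Python, you could pass the same list to `subprocess.run(command, check=True)`, assuming the gcloud CLI is installed and authenticated.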
Step 13. (Optional) Configure data encryption (CMEK)
By default, data in your environment is encrypted with a key provided by Google.
To use customer-managed encryption keys (CMEK) to encrypt data in your environment, follow the instructions outlined in Using customer-managed encryption keys.
Step 14. (Optional) Use a custom environment's bucket
When you create an environment, Cloud Composer creates a bucket for your environment automatically.
As an alternative, you can specify a custom Cloud Storage bucket from your project. Your environment uses this bucket in the same way as the automatically created bucket.
To use a custom environment bucket, follow the instructions outlined in Use a custom environment's bucket.
Step 15. (Optional) Specify environment labels
You can assign labels to your environments to break down billing costs based on these labels.
Console
On the Create environment page, in the Labels section:
Click Add label.
In the Key and Value fields, specify key-value pairs for the environment labels.
gcloud
When you create an environment, the --labels argument specifies a comma-separated list of keys and values with environment labels.
gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --image-version composer-2.14.4-airflow-2.10.5 \
    --service-account "SERVICE_ACCOUNT" \
    --labels LABELS
Replace:
- LABELS with a list of comma-separated KEY=VALUE pairs for environment labels.
Example:
gcloud composer environments create example-environment \
    --location us-central1 \
    --image-version composer-2.14.4-airflow-2.10.5 \
    --service-account "example-account@example-project.iam.gserviceaccount.com" \
    --labels owner=engineering-team,env=production
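If your labels live in a script or configuration file, you can render them into the comma-separated `KEY=VALUE` form that `--labels` expects. The following Python sketch shows one way to do this. The `format_labels` helper is hypothetical, and its separator check is a conservative guard for this sketch, not a statement of the gcloud CLI's parsing rules.

```python
def format_labels(labels: dict[str, str]) -> str:
    """Render a dict as a comma-separated list of KEY=VALUE pairs."""
    for key, value in labels.items():
        if not key:
            raise ValueError("label keys must not be empty")
        # Reject characters that would be ambiguous in KEY=VALUE,KEY=VALUE.
        if "," in key or "=" in key or "," in value:
            raise ValueError(f"label {key!r} contains a separator character")
    return ",".join(f"{key}={value}" for key, value in labels.items())


labels = {"owner": "engineering-team", "env": "production"}
print(format_labels(labels))  # owner=engineering-team,env=production
```

Python dictionaries preserve insertion order, so the output lists the labels in the order you defined them.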
API
When you create an environment, in the Environment resource, specify labels for your environment.
{
  "name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
  "labels": {
    "LABEL_KEY": "LABEL_VALUE"
  }
}
Replace:
- LABEL_KEY with a key of the environment label.
- LABEL_VALUE with a value of the environment label.
Example:
// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments
{
  "name": "projects/example-project/locations/us-central1/environments/example-environment",
  "labels": {
    "owner": "engineering-team",
    "env": "production"
  }
}
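The request body above can also be assembled programmatically. The following Python sketch builds the same JSON structure from its parts. The `build_environment_body` helper is hypothetical, and the sketch only constructs and prints the body. Sending the request and authenticating are out of scope here.

```python
import json


def build_environment_body(
    project_id: str,
    location: str,
    environment_name: str,
    labels: dict[str, str],
) -> dict:
    """Build an Environment resource body that contains a name and labels."""
    return {
        "name": (
            f"projects/{project_id}/locations/{location}"
            f"/environments/{environment_name}"
        ),
        # Copy the mapping so later changes to the caller's dict don't leak in.
        "labels": dict(labels),
    }


body = build_environment_body(
    "example-project",
    "us-central1",
    "example-environment",
    {"owner": "engineering-team", "env": "production"},
)
print(json.dumps(body, indent=2))
```

A real request would typically include other fields of the Environment resource, such as the environment configuration, alongside the labels.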
Terraform
When you create an environment, specify labels in the labels block (outside of the config block).
resource "google_composer_environment" "example" {
  provider = google-beta
  name     = "ENVIRONMENT_NAME"
  region   = "LOCATION"

  labels = {
    LABEL_KEY = "LABEL_VALUE"
  }
}
Replace:
- LABEL_KEY with a key of the environment label.
- LABEL_VALUE with a value of the environment label.
Example:
resource "google_composer_environment" "example" {
  provider = google-beta
  name     = "example-environment"
  region   = "us-central1"

  labels = {
    owner = "engineering-team"
    env   = "production"
  }
}
What's next
- Troubleshooting environment creation
- Configuring Shared VPC
- Configuring VPC Service Controls
- Adding and updating DAGs
- Accessing Airflow UI
- Updating and deleting environments
- About Cloud Composer versions