Configure scheduling

This page describes scheduler options and how to configure default pod scheduling constraints in Google Distributed Cloud (software only) for bare metal clusters.

Google Distributed Cloud provides a number of standard Kubernetes features that you can use to control pod scheduling, such as pod topology spread constraints.

For more information about pod topology spread constraints, see Kubernetes Scheduler in the Kubernetes documentation.

Before you begin

Before you configure default pod spread constraints, make sure that each node in your cluster has the correct topology labels. You can use the Nodepool.Spec.TaintsAndLabels API to apply labels, or you can label nodes manually with kubectl label. Manual labeling offers more flexibility, but you must remember to label each new node that you add to the cluster.
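For example, to label a node manually, a command like the following works (the node name node-rack1-01 is illustrative):

```shell
# Apply a rack topology label to a single node (node name is hypothetical).
kubectl label node node-rack1-01 topology.k8s.io/rack=rack1

# Confirm that the label is present.
kubectl get node node-rack1-01 --show-labels
```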

Configure default custom scheduler {#config-default}

Label nodes

  1. Add topology labels to your cluster and node pool YAML files. The following example assumes two worker node pools in different racks, with the control plane nodes in rack1.

    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
      name: abm-cluster
      namespace: cluster-abm-cluster
    spec:
      controlPlane:
        nodePoolSpec:
          labels:
            topology.k8s.io/rack: rack1
    ---
    apiVersion: baremetal.cluster.gke.io/v1
    kind: NodePool
    metadata:
      name: nodepool-rack1
      namespace: cluster-abm-cluster
    spec:
      labels:
        topology.k8s.io/rack: rack1
    ---
    apiVersion: baremetal.cluster.gke.io/v1
    kind: NodePool
    metadata:
      name: nodepool-rack2
      namespace: cluster-abm-cluster
    spec:
      labels:
        topology.k8s.io/rack: rack2
    
  2. Apply the updated cluster configuration.

    bmctl update cluster -c CLUSTER_NAME
    

    Replace CLUSTER_NAME with the name of your cluster.

  3. Wait for the topology.k8s.io/rack label to propagate to all nodes in the cluster.
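One way to check propagation is to list nodes with the rack label displayed as a column:

```shell
# Show every node with the value of its rack label as an extra column.
kubectl get nodes -L topology.k8s.io/rack
```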

Enable default pod spread constraints

  1. Add the preview.baremetal.cluster.gke.io/custom-scheduler-configuration:enable annotation to your cluster YAML file.

  2. Add the schedulerConfiguration section under cluster.spec.controlPlane in your cluster YAML file.

    apiVersion: baremetal.cluster.gke.io/v1
    kind: Cluster
    metadata:
      name: abm-cluster
      namespace: cluster-abm-cluster
      annotations:
        preview.baremetal.cluster.gke.io/custom-scheduler-configuration: enable
    spec:
      controlPlane:
        schedulerConfiguration:
          defaultTopologySpreadConstraint:
            defaultConstraints:
            - topologyKey: topology.k8s.io/rack
              whenUnsatisfiable: DoNotSchedule
              maxSkew: 1
            defaultingType: List
    
  3. Apply the updated cluster configuration.

    bmctl update cluster -c CLUSTER_NAME
    

    Replace CLUSTER_NAME with the name of your cluster.

  4. Wait for the cluster reconciliation to complete. Monitor cluster.status.clusterState until it shows Running. A control-plane-update job runs for each control plane node during this process.
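To monitor the state, you can poll the Cluster resource. The following sketch assumes the cluster name and namespace from the earlier examples; the field path may differ by version:

```shell
# Print the current cluster state (expected to reach Running).
kubectl get clusters.baremetal.cluster.gke.io abm-cluster \
    -n cluster-abm-cluster -o jsonpath='{.status.clusterState}'
```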

Verify pod spread configuration

  1. Create a test deployment with five replicas.
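As a minimal sketch, you can create a throwaway deployment (the deployment name and image are arbitrary):

```shell
# Create a test deployment with five replicas.
kubectl create deployment spread-test --image=nginx --replicas=5
```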

  2. Observe the pod distribution. The difference in the number of pods on nodepool-rack1 and nodepool-rack2 should be exactly one.
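To see where each replica landed, list the pods with their node assignments. The label selector below matches the app label that kubectl create deployment derives from the deployment name, assuming the spread-test deployment from the previous step:

```shell
# Show the node that each test pod was scheduled onto.
kubectl get pods -l app=spread-test -o wide
```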

  3. Verify the kube-scheduler-profile.config file on each control plane node. The file, located at /etc/kubernetes/kube-scheduler-profile.config, must contain the topology spread configuration from cluster.spec.
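For example, on a control plane node you might check that the rack topology key appears in the profile:

```shell
# On a control plane node, look for the rack topology key in the profile.
grep -A 3 'topology.k8s.io/rack' /etc/kubernetes/kube-scheduler-profile.config
```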

Troubleshoot

To diagnose and resolve issues with default pod spread, check the following:

  1. Review the BareMetalMachine.Status.ControlPlaneComponents field for the status of the feature.
  2. Examine logs from the cluster-operator and cap-controller-manager for relevant events.
  3. If kube-scheduler static pods crash, check that the scheduler configuration is correct in the cluster YAML file.
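If the scheduler static pods are crashing, their logs can point at the misconfigured field. Kubeadm-style static pods typically carry the component=kube-scheduler label, although the exact labels in your installation may differ:

```shell
# Tail the kube-scheduler static pod logs (label selector is an assumption).
kubectl logs -n kube-system -l component=kube-scheduler --tail=50
```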

What's next