Start learning about Kubernetes

This document describes the fundamentals of the open source container orchestration platform Kubernetes. Many components of Google Distributed Cloud (GDC) air-gapped are based on Kubernetes, and much of the documentation assumes that you're already familiar with basic Kubernetes concepts and terminology. If you're not familiar with Kubernetes, use this document as a reference for recommended reading to get you started.

Learning Kubernetes basics is essential for designing, deploying, and managing GDC applications and underlying system components. The following technical professionals must understand Kubernetes to successfully operate GDC:

  • Operators within the infrastructure operator group responsible for installing and maintaining the underlying GDC software and hardware infrastructure.
  • Administrators within the platform administrator group responsible for designing resilient infrastructure and application architectures on GDC.
  • Developers within the application operator group responsible for building applications.

For more information, see Audiences for GDC air-gapped documentation.

About Kubernetes

Kubernetes is an open source container orchestration platform. At its core, a Kubernetes cluster is a set of worker machines called nodes that run containerized applications. The entire cluster is managed by the control plane, which includes components like the API server, scheduler, and etcd database, and is responsible for maintaining the state of the cluster.

Applications are packaged into pods, the smallest deployable units in Kubernetes, which can contain one or more containers and run on the nodes. To organize resources within a cluster, often for different teams or environments, Kubernetes uses namespaces.

The lifecycle and state of pods are managed by controllers. For example, Deployment objects manage rolling updates and ReplicaSet objects ensure a specific number of pod replicas are running. To provide stable network endpoints, such as IP addresses and DNS names for accessing pods, Kubernetes uses services.

Because container storage is ephemeral by default, Kubernetes offers storage abstractions, such as volumes and persistent volumes, to manage data.

To secure access to cluster resources and the Kubernetes API, Kubernetes uses role-based access control (RBAC). With RBAC, you define roles and cluster roles, and then use bindings to grant those permissions to specific users and service accounts.

Key concepts

The following are some key concepts that we use throughout the GDC documentation. This is not an exhaustive list of Kubernetes concepts. You can find much more to read and explore in the provided topics from the Kubernetes documentation and our recommended reading.

Nodes and clusters

All Kubernetes workloads run on nodes. In GDC, a node is a virtual machine (VM). On other Kubernetes platforms, a node could be either a physical or virtual machine. Each node is managed by the Kubernetes control plane and has all the necessary components to run pods. A cluster is a set of nodes that can be treated together as a single entity, on which you deploy a containerized application.

Learn more in the Kubernetes documentation:

Kubernetes control plane

The Kubernetes control plane is a set of system components that manage the overall state of your cluster. These components include the Kubernetes API server, which lets you interact with your clusters and applications by using the kubectl CLI and other tools, a scheduler that assigns pods to available nodes, and the controllers that track and manage cluster state. The control plane is provided and managed by GDC.

For more information, see the Kubernetes Control plane components documentation.

Pods

In Kubernetes, containerized applications run inside a pod. A pod is the smallest deployable unit of computing that you can create and manage in Kubernetes. A pod has one or more containers. When a pod runs multiple containers, such as an application server and a proxy server, the containers are managed as a single entity and share the pod's resources.
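
As a minimal sketch of the multi-container case, the following Python snippet builds a pod manifest as a plain dictionary and prints it as JSON, a format that kubectl accepts alongside YAML. The pod name, container names, image references, and ports are hypothetical placeholders, not values from GDC.

```python
import json

# Hypothetical names, images, and ports, used only for illustration.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            # The application server container.
            {
                "name": "app-server",
                "image": "registry.example/app-server:1.0",
                "ports": [{"containerPort": 8080}],
            },
            # A proxy container in the same pod. Both containers are
            # scheduled together and share the pod's network namespace.
            {
                "name": "proxy",
                "image": "registry.example/proxy:1.0",
                "ports": [{"containerPort": 8443}],
            },
        ]
    },
}

print(json.dumps(pod_manifest, indent=2))
```

Both containers are listed under the same spec, which is what makes Kubernetes treat them as a single schedulable unit.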

Learn more in the Kubernetes documentation:

Namespaces

Kubernetes namespaces provide a mechanism for further grouping and selecting resources, such as pods and services, within a cluster. For example, if you have multiple application teams running workloads on a single cluster, you can give each team its own namespace.
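
The following sketch illustrates one way to express that grouping: it builds a Namespace manifest and places a namespaced resource in it by setting metadata.namespace. The helper functions, the team-a namespace, and the pod are hypothetical and exist only for this example.

```python
import json


def namespace_manifest(name):
    """Build a manifest for a Namespace with the given name."""
    return {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": name}}


def in_namespace(manifest, namespace):
    """Return a copy of a namespaced manifest with metadata.namespace set."""
    scoped = dict(manifest)
    scoped["metadata"] = {**manifest.get("metadata", {}), "namespace": namespace}
    return scoped


# Hypothetical namespace and pod, used only for illustration.
team_a = namespace_manifest("team-a")
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {"containers": [{"name": "app", "image": "registry.example/app:1.0"}]},
}
team_a_pod = in_namespace(pod, "team-a")

print(json.dumps(team_a, indent=2))
print(json.dumps(team_a_pod, indent=2))
```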

For more information, see the Kubernetes Namespaces documentation.

Controllers

Kubernetes controllers track and manage the state of your clusters and workloads, based on the state that you specify. For example, you could specify that a cluster runs three replicas of a pod, with a designated container in each pod. Different controllers track different Kubernetes resource types, including the following:

  • Deployment: A Deployment resource is a Kubernetes object that represents one or more identical pods, called replicas. A Deployment runs multiple replicas of the pods distributed among the nodes of a cluster and automatically replaces any pods that fail or become unresponsive.
  • StatefulSet: A StatefulSet resource is like a Deployment but maintains a persistent, unique identity for each of its pods. StatefulSet resources are useful for applications that keep persistent state, such as databases.
  • DaemonSet: A DaemonSet resource ensures that a copy of a pod runs on some or all of your nodes. These pods are often helper services for your workloads, such as a log collection daemon or a monitoring daemon.
  • ReplicaSet: A ReplicaSet resource maintains a set of identical pods. A ReplicaSet is usually managed as part of a Deployment resource.
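
To make the idea of declared state concrete, here is a minimal sketch of a Deployment manifest that asks for three replicas, built as a Python dictionary and printed as JSON. The names and image are hypothetical. The pods_to_change function is only a toy model of the comparison a controller makes between desired and observed state; it is not how Kubernetes controllers are implemented.

```python
import json

# Hypothetical label, name, and image, used only for illustration.
labels = {"app": "web"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        # Desired state: three identical pods.
        "replicas": 3,
        # The selector must match the labels on the pod template.
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                "containers": [
                    {"name": "app", "image": "registry.example/app:1.0"}
                ]
            },
        },
    },
}


def pods_to_change(desired_replicas, running_pods):
    """Toy model: positive means pods to create, negative means pods to delete."""
    return desired_replicas - running_pods


print(json.dumps(deployment, indent=2))
# If only one pod is currently running, two more are needed.
print(pods_to_change(deployment["spec"]["replicas"], 1))
```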

Learn more in the Kubernetes documentation:

Kubernetes Service

By default, you can't control which cluster node a pod is running on, so pods don't have stable IP addresses. To get a stable IP address for an application running in Kubernetes, you must define a networking abstraction on top of its pods called a Kubernetes Service. A Kubernetes Service provides a stable networking endpoint for a set of pods. There are several types of services. One of them, the LoadBalancer service, exposes an external IP address so that you can reach applications from outside the cluster.

Kubernetes also has a built-in DNS system for internal address resolution, which assigns a DNS name, such as helloserver.default.svc.cluster.local, to services. This lets pods within the cluster reach other pods in the cluster using a stable address. You can't use this DNS name outside the cluster, such as from the gdcloud CLI.
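
The sketch below shows what a LoadBalancer Service for the helloserver example might look like, along with a hypothetical helper that assembles the in-cluster DNS name. It assumes the conventional Kubernetes form of service, namespace, svc, and cluster domain, with the default cluster.local domain. The selector label and port numbers are placeholders.

```python
import json


def service_dns_name(service, namespace="default", cluster_domain="cluster.local"):
    """Assemble the conventional in-cluster DNS name for a Service.

    Assumes the standard <service>.<namespace>.svc.<cluster domain> layout.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical selector and ports, used only for illustration.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "helloserver"},
    "spec": {
        # LoadBalancer services expose an external IP address.
        "type": "LoadBalancer",
        # Traffic is routed to pods whose labels match this selector.
        "selector": {"app": "helloserver"},
        # Port 80 on the Service forwards to port 8080 on the pods.
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

print(json.dumps(service, indent=2))
print(service_dns_name(service["metadata"]["name"]))
```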

For more information, see the Kubernetes Services documentation.

Storage

If your application needs to save data that persists beyond the lifetime of its pod, such as in a stateful application, you can use a Kubernetes PersistentVolume object to provision this storage. You can also choose to use ephemeral storage, which is destroyed when the corresponding pod terminates.
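
As an illustration of the two options, the following sketch builds a PersistentVolumeClaim, which is the standard Kubernetes object a workload uses to request a persistent volume, and a pod that mounts both that claim and an ephemeral emptyDir volume. The names, image, mount paths, access mode, and the 1Gi size are assumptions chosen for the example. No storage class is set, so the cluster default would apply.

```python
import json

# Hypothetical claim; the size and access mode are placeholders.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "data-claim"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# Hypothetical pod that mounts one persistent and one ephemeral volume.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "db"},
    "spec": {
        "containers": [
            {
                "name": "db",
                "image": "registry.example/db:1.0",
                "volumeMounts": [
                    {"name": "data", "mountPath": "/var/lib/data"},
                    {"name": "scratch", "mountPath": "/tmp/scratch"},
                ],
            }
        ],
        "volumes": [
            # Backed by the claim above; the data outlives the pod.
            {"name": "data", "persistentVolumeClaim": {"claimName": "data-claim"}},
            # Ephemeral; removed when the pod is deleted.
            {"name": "scratch", "emptyDir": {}},
        ],
    },
}

print(json.dumps(pvc, indent=2))
print(json.dumps(pod, indent=2))
```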

Learn more in the Kubernetes documentation:

Role-based access control

Kubernetes includes a role-based access control (RBAC) mechanism that lets you create authorization policies for accessing your clusters and their resources. When using GDC, you'll often use a combination of Kubernetes RBAC and GDC's Identity and Access Management (IAM) to secure your applications.
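
A generic Kubernetes RBAC policy can be sketched as a Role that lists permitted actions and a RoleBinding that grants the role to a subject. The example below uses hypothetical names (pod-reader, read-pods, ci-bot, team-a) and covers only Kubernetes RBAC. It doesn't describe GDC's predefined roles or its IAM layer.

```python
import json

RBAC_API_GROUP = "rbac.authorization.k8s.io"

# Hypothetical Role: read-only access to pods in the team-a namespace.
role = {
    "apiVersion": f"{RBAC_API_GROUP}/v1",
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": "team-a"},
    "rules": [
        {
            # The empty string selects the core API group, where pods live.
            "apiGroups": [""],
            "resources": ["pods"],
            "verbs": ["get", "list", "watch"],
        }
    ],
}

# Hypothetical RoleBinding: grants the Role to one service account.
role_binding = {
    "apiVersion": f"{RBAC_API_GROUP}/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "read-pods", "namespace": "team-a"},
    "subjects": [
        {"kind": "ServiceAccount", "name": "ci-bot", "namespace": "team-a"}
    ],
    "roleRef": {
        "apiGroup": RBAC_API_GROUP,
        "kind": "Role",
        "name": role["metadata"]["name"],
    },
}

print(json.dumps(role, indent=2))
print(json.dumps(role_binding, indent=2))
```

A ClusterRole and ClusterRoleBinding follow the same shape but aren't scoped to a single namespace.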

For more information, see the Kubernetes Role-based access control documentation.

Recommended reading

This section provides links to recommended resources for learning more about Kubernetes. In particular, Kubernetes.io, the official Kubernetes website, has comprehensive, reliable material about all things Kubernetes.

External guides and tutorials

Reference documentation

  • Kubernetes glossary: A comprehensive, standardized list of Kubernetes terminology. If you're not sure about a Kubernetes term, reference the glossary.

What's next