Pod Admission and Scheduling

Introduction

This tutorial explores configuring pod admission and scheduling in Kubernetes. We will cover setting resource limits, utilizing node affinity, and other techniques to control where pods are scheduled and how they consume resources. A working Kubernetes cluster is required. This tutorial is geared towards users preparing for the CKA (Certified Kubernetes Administrator) exam.

Task 1: Setting Resource Limits

Resource limits ensure that pods don’t consume excessive resources (CPU and memory), preventing resource starvation for other pods. We will create a LimitRange to define default and maximum resource requests and limits.

  1. Create a file named limit-range.yaml with the following content:
NODE_TYPE // yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: pod-resource-limits
spec:
  limits:
  - default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    max:
      cpu: "1"
      memory: 1Gi
    min:
      cpu: 100m
      memory: 128Mi
    type: Container
This LimitRange defines the minimum, maximum, default request, and default limit for CPU and memory for containers within a namespace. m stands for millicores (1/1000th of a core), and Mi stands for mebibytes (2^20 bytes).
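These units convert straightforwardly; as a quick plain-shell sketch (no cluster needed), the LimitRange's default values in base units:

```shell
# Mi (mebibyte) = 2^20 bytes; m (millicore) = 1/1000 of a CPU core.
mem_limit_bytes=$((512 * 1024 * 1024))
echo "512Mi = ${mem_limit_bytes} bytes"    # prints 536870912
cpu_limit_cores=$(awk 'BEGIN { print 500 / 1000 }')
echo "500m  = ${cpu_limit_cores} cores"    # prints 0.5
```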
  2. Apply the LimitRange to your desired namespace (e.g., default):
NODE_TYPE // bash
kubectl apply -f limit-range.yaml -n default
NODE_TYPE // output
limitrange/pod-resource-limits created
  3. Verify the LimitRange:
NODE_TYPE // bash
kubectl get limitrange pod-resource-limits -n default -o yaml

This will output the YAML definition of the LimitRange, confirming its creation.

  4. Create a pod without specifying resource requests and limits in pod-no-limits.yaml:
NODE_TYPE // yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-no-limits
spec:
  containers:
  - name: nginx
    image: nginx
  5. Apply the pod:
NODE_TYPE // bash
kubectl apply -f pod-no-limits.yaml -n default
  6. Check the pod's resources. Kubernetes automatically applies the defaults defined by the LimitRange:
NODE_TYPE // bash
kubectl get pod pod-no-limits -n default -o yaml

Look for the resources section within the container definition. It will show the default request and limit values set by the LimitRange.

NODE_TYPE // output
...
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 256Mi
...
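The max values in the LimitRange are enforced at admission time. As a sketch (the name pod-too-big is illustrative), a pod whose limits exceed the 1 CPU / 1Gi maximum would be rejected when applied, with an error stating that the maximum per-container usage has been exceeded:

```yaml
# Hypothetical pod that exceeds the LimitRange max; admission should reject it.
apiVersion: v1
kind: Pod
metadata:
  name: pod-too-big
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        cpu: "2"        # exceeds max cpu "1"
        memory: 2Gi     # exceeds max memory 1Gi
```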

Task 2: Node Affinity

Node affinity allows you to constrain which nodes your pods can be scheduled on, based on node labels. We’ll create a node label and then use node affinity to ensure a pod is scheduled on that node.

  1. Label a node:
NODE_TYPE // bash
kubectl label nodes <node-name> disktype=ssd

Replace <node-name> with the actual name of your node. You can find the node name using kubectl get nodes.

Node labels are key-value pairs attached to nodes, providing metadata that can be used for scheduling decisions.
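To inspect the labels already on your nodes (built-in labels such as kubernetes.io/hostname are always present), or to confirm the new label took effect, you can run:

```shell
# List nodes with all of their labels.
kubectl get nodes --show-labels

# Filter nodes by the label set above.
kubectl get nodes -l disktype=ssd
```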
  2. Create a pod definition with node affinity in pod-with-affinity.yaml:
NODE_TYPE // yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx
requiredDuringSchedulingIgnoredDuringExecution means the scheduler must satisfy the rule before the pod can be scheduled; if no node matches, the pod stays in a Pending state. The "IgnoredDuringExecution" part means that if the node's labels change after the pod is scheduled, the pod keeps running on that node. The softer alternative, preferredDuringSchedulingIgnoredDuringExecution, tells the scheduler to favor matching nodes but still schedule the pod elsewhere if none match.
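As a sketch of the soft variant, the same constraint expressed as a preference would be a fragment of the pod spec like the following (the weight, 1-100, ranks competing preferences):

```yaml
# Soft preference: schedule on disktype=ssd nodes when possible,
# but fall back to other nodes instead of staying Pending.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
```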
  3. Apply the pod definition:
NODE_TYPE // bash
kubectl apply -f pod-with-affinity.yaml -n default
  4. Verify that the pod is running on the labeled node:
NODE_TYPE // bash
kubectl get pod pod-with-affinity -o wide -n default

The NODE column in the output should display the name of the node you labeled.

NODE_TYPE // output
NAME                READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
pod-with-affinity   1/1     Running   0          10s   10.1.2.34   <node-name>   <none>           <none>

Task 3: Pod Disruption Budgets (PDBs)

Pod Disruption Budgets (PDBs) limit the number of concurrent disruptions to your applications. This ensures that a certain number of replicas are always available, even during voluntary disruptions like node maintenance or upgrades.

  1. Create a deployment in deployment.yaml:
NODE_TYPE // yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx
NODE_TYPE // bash
kubectl apply -f deployment.yaml -n default
  2. Create a PDB in pdb.yaml:
NODE_TYPE // yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
minAvailable: 2 means that at least 2 pods with the label app: my-app must be available at all times.
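A PDB can alternatively be expressed with maxUnavailable (the two fields are mutually exclusive). A sketch equivalent in spirit to the PDB above, with the name my-app-pdb-alt chosen for illustration:

```yaml
# Alternative form: allow at most 1 pod to be voluntarily disrupted at a time.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb-alt
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
```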
  3. Apply the PDB:
NODE_TYPE // bash
kubectl apply -f pdb.yaml -n default
  4. Simulate a voluntary disruption by draining a node that runs one of the deployment's pods:
NODE_TYPE // bash
kubectl drain <node-name> --ignore-daemonsets

If evicting a pod would leave fewer than minAvailable healthy replicas, the API server refuses the eviction and kubectl drain keeps retrying it, reporting that evicting the pod would violate its disruption budget.

  5. Check the status of the PDB:
NODE_TYPE // bash
kubectl get pdb my-app-pdb -n default -o yaml
NODE_TYPE // output
...
  status:
    currentHealthy: 3
    desiredHealthy: 2
    disruptionsAllowed: 1
    expectedPods: 3
...
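One follow-up worth noting: a drain (whether it completes or is aborted) leaves the node cordoned, i.e. marked unschedulable. Once maintenance is done, re-enable scheduling:

```shell
# Re-enable scheduling on the drained node.
kubectl uncordon <node-name>
```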

Task 4: Taints and Tolerations

Taints allow you to repel pods from specific nodes, while tolerations allow pods to schedule onto those tainted nodes. This is often used to dedicate nodes to specific workloads.

  1. Taint a node:
NODE_TYPE // bash
kubectl taint nodes <node-name> dedicated=special-workload:NoSchedule

This taints the node <node-name> with the key dedicated, value special-workload, and effect NoSchedule. NoSchedule means that no pods will be scheduled on this node unless they have a matching toleration. Other effects are PreferNoSchedule (Kubernetes will try to avoid scheduling pods without the toleration) and NoExecute (pods without the toleration will be evicted).
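A taint can later be removed with the same command by appending a trailing hyphen to the taint specification:

```shell
# Remove the taint added above (note the trailing "-").
kubectl taint nodes <node-name> dedicated=special-workload:NoSchedule-
```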

  2. Create a pod with a toleration in pod-with-toleration.yaml:
NODE_TYPE // yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-toleration
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "special-workload"
    effect: "NoSchedule"
  containers:
  - name: nginx
    image: nginx
  3. Apply the pod:
NODE_TYPE // bash
kubectl apply -f pod-with-toleration.yaml -n default
  4. Verify that the pod is running on the tainted node:
NODE_TYPE // bash
kubectl get pod pod-with-toleration -o wide -n default

The NODE column in the output should display the name of the tainted node. Note that a toleration only permits scheduling on the tainted node; it does not require it. To guarantee placement on that node, combine the toleration with node affinity.

NODE_TYPE // output
NAME                  READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
pod-with-toleration   1/1     Running   0          10s   10.1.2.35   <node-name>   <none>           <none>
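You can also confirm which taints are applied to the node directly:

```shell
# Show the taints applied to the node.
kubectl describe node <node-name> | grep Taints
```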

Conclusion

In this tutorial, you learned how to configure pod admission and scheduling in Kubernetes using LimitRanges, node affinity, Pod Disruption Budgets, and taints and tolerations. These techniques provide fine-grained control over resource consumption, pod placement, and application availability, essential for managing Kubernetes clusters effectively. These concepts are crucial for the CKA exam and for real-world Kubernetes deployments.