Pod Admission and Scheduling

Introduction

This tutorial explores configuring pod admission and scheduling in Kubernetes. We will cover setting resource limits, utilizing node affinity, and other techniques to control where pods are scheduled and how they consume resources. A working Kubernetes cluster is required. This tutorial is geared towards users preparing for the CKA (Certified Kubernetes Administrator) exam.

Task 1: Setting Resource Limits

Resource limits ensure that pods don’t consume excessive resources (CPU and memory), preventing resource starvation for other pods. We will create a LimitRange to define default and maximum resource requests and limits.

  1. Create a file named limit-range.yaml with the following content:
NODE_TYPE // yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: pod-resource-limits
spec:
  limits:
  - default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    max:
      cpu: "1"
      memory: 1Gi
    min:
      cpu: 100m
      memory: 128Mi
    type: Container
This LimitRange defines the minimum, maximum, default request, and default limit for CPU and memory for containers within a namespace. m stands for millicores (1/1000th of a core), and Mi stands for mebibytes (2^20 bytes).
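These units convert straightforwardly; as a quick plain-shell sketch (no cluster needed), the LimitRange's default values in base units:

```shell
# Mi (mebibyte) = 2^20 bytes; m (millicore) = 1/1000 of a CPU core.
mem_limit_bytes=$((512 * 1024 * 1024))
echo "512Mi = ${mem_limit_bytes} bytes"    # prints 536870912
cpu_limit_cores=$(awk 'BEGIN { print 500 / 1000 }')
echo "500m  = ${cpu_limit_cores} cores"    # prints 0.5
```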
  2. Apply the LimitRange to your desired namespace (e.g., default):
NODE_TYPE // bash
kubectl apply -f limit-range.yaml -n default
NODE_TYPE // output
limitrange/pod-resource-limits created
  3. Verify the LimitRange:
NODE_TYPE // bash
kubectl get limitrange pod-resource-limits -n default -o yaml

This will output the YAML definition of the LimitRange, confirming its creation.

  4. Create a pod without specifying resource requests and limits in pod-no-limits.yaml:
NODE_TYPE // yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-no-limits
spec:
  containers:
  - name: nginx
    image: nginx
  5. Apply the pod:
NODE_TYPE // bash
kubectl apply -f pod-no-limits.yaml -n default
  6. Check the pod's resources. Kubernetes automatically applies the defaults defined by the LimitRange:
NODE_TYPE // bash
kubectl get pod pod-no-limits -n default -o yaml

Look for the resources section within the container definition. It will show the default request and limit values set by the LimitRange.

NODE_TYPE // output
...
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 250m
        memory: 256Mi
...
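The max values in the LimitRange are enforced at admission time. As a sketch (the name pod-too-big is illustrative), a pod whose limits exceed the 1 CPU / 1Gi maximum would be rejected when applied, with an error stating that the maximum per-container usage has been exceeded:

```yaml
# Hypothetical pod that exceeds the LimitRange max; admission should reject it.
apiVersion: v1
kind: Pod
metadata:
  name: pod-too-big
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        cpu: "2"        # exceeds max cpu "1"
        memory: 2Gi     # exceeds max memory 1Gi
```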

Task 2: Node Affinity

Node affinity allows you to constrain which nodes your pods can be scheduled on, based on node labels. We’ll create a node label and then use node affinity to ensure a pod is scheduled on that node.

  1. Label a node:
NODE_TYPE // bash
kubectl label nodes <node-name> disktype=ssd

Replace <node-name> with the actual name of your node. You can find the node name using kubectl get nodes.

Node labels are key-value pairs attached to nodes, providing metadata that can be used for scheduling decisions.
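To inspect the labels already on your nodes (built-in labels such as kubernetes.io/hostname are always present), or to confirm the new label took effect, you can run:

```shell
# List nodes with all of their labels.
kubectl get nodes --show-labels

# Filter nodes by the label set above.
kubectl get nodes -l disktype=ssd
```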
  2. Create a pod definition with node affinity in pod-with-affinity.yaml:
NODE_TYPE // yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx
requiredDuringSchedulingIgnoredDuringExecution means the scheduler must satisfy the rule before the pod can be scheduled; if no node matches, the pod stays in a Pending state. The "IgnoredDuringExecution" part means that if the node's labels change after the pod is scheduled, the pod keeps running on that node. The softer alternative, preferredDuringSchedulingIgnoredDuringExecution, tells the scheduler to favor matching nodes but still schedule the pod elsewhere if none match.
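As a sketch of the soft variant, the same constraint expressed as a preference would be a fragment of the pod spec like the following (the weight, 1-100, ranks competing preferences):

```yaml
# Soft preference: schedule on disktype=ssd nodes when possible,
# but fall back to other nodes instead of staying Pending.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
```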
  3. Apply the pod definition:
NODE_TYPE // bash
kubectl apply -f pod-with-affinity.yaml -n default
  4. Verify that the pod is running on the labeled node:
NODE_TYPE // bash
kubectl get pod pod-with-affinity -o wide -n default

The NODE column in the output should display the name of the node you labeled.

NODE_TYPE // output
NAME                READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
pod-with-affinity   1/1     Running   0          10s   10.1.2.34   <node-name>   <none>           <none>

Task 3: Pod Disruption Budgets (PDBs)

Pod Disruption Budgets (PDBs) limit the number of concurrent disruptions to your applications. This ensures that a certain number of replicas are always available, even during voluntary disruptions like node maintenance or upgrades.

  1. Create a deployment in deployment.yaml:
NODE_TYPE // yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx
NODE_TYPE // bash
kubectl apply -f deployment.yaml -n default
  2. Create a PDB in pdb.yaml:
NODE_TYPE // yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
minAvailable: 2 means that at least 2 pods with the label app: my-app must be available at all times.
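A PDB can alternatively be expressed with maxUnavailable (the two fields are mutually exclusive). A sketch equivalent in spirit to the PDB above, with the name my-app-pdb-alt chosen for illustration:

```yaml
# Alternative form: allow at most 1 pod to be voluntarily disrupted at a time.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb-alt
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
```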
  3. Apply the PDB:
NODE_TYPE // bash
kubectl apply -f pdb.yaml -n default
  4. Simulate a voluntary disruption by draining a node that runs one of the deployment's pods:
NODE_TYPE // bash
kubectl drain <node-name> --ignore-daemonsets

If evicting a pod would leave fewer than minAvailable healthy replicas, the API server refuses the eviction and kubectl drain keeps retrying it, reporting that evicting the pod would violate its disruption budget.

  5. Check the status of the PDB:
NODE_TYPE // bash
kubectl get pdb my-app-pdb -n default -o yaml
NODE_TYPE // output
...
  status:
    currentHealthy: 3
    desiredHealthy: 2
    disruptionsAllowed: 1
    expectedPods: 3
...
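One follow-up worth noting: a drain (whether it completes or is aborted) leaves the node cordoned, i.e. marked unschedulable. Once maintenance is done, re-enable scheduling:

```shell
# Re-enable scheduling on the drained node.
kubectl uncordon <node-name>
```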

Task 4: Taints and Tolerations

Taints allow you to repel pods from specific nodes, while tolerations allow pods to schedule onto those tainted nodes. This is often used to dedicate nodes to specific workloads.

  1. Taint a node:
NODE_TYPE // bash
kubectl taint nodes <node-name> dedicated=special-workload:NoSchedule

This taints the node <node-name> with the key dedicated, value special-workload, and effect NoSchedule. NoSchedule means that no pods will be scheduled on this node unless they have a matching toleration. Other effects are PreferNoSchedule (Kubernetes will try to avoid scheduling pods without the toleration) and NoExecute (pods without the toleration will be evicted).
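A taint can later be removed with the same command by appending a trailing hyphen to the taint specification:

```shell
# Remove the taint added above (note the trailing "-").
kubectl taint nodes <node-name> dedicated=special-workload:NoSchedule-
```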

  2. Create a pod with a toleration in pod-with-toleration.yaml:
NODE_TYPE // yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-toleration
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "special-workload"
    effect: "NoSchedule"
  containers:
  - name: nginx
    image: nginx
  3. Apply the pod:
NODE_TYPE // bash
kubectl apply -f pod-with-toleration.yaml -n default
  4. Verify that the pod is running on the tainted node:
NODE_TYPE // bash
kubectl get pod pod-with-toleration -o wide -n default

The NODE column in the output should display the name of the tainted node. Note that a toleration only permits scheduling on the tainted node; it does not require it. To guarantee placement on that node, combine the toleration with node affinity.

NODE_TYPE // output
NAME                  READY   STATUS    RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
pod-with-toleration   1/1     Running   0          10s   10.1.2.35   <node-name>   <none>           <none>
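You can also confirm which taints are applied to the node directly:

```shell
# Show the taints applied to the node.
kubectl describe node <node-name> | grep Taints
```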

Conclusion

In this tutorial, you learned how to configure pod admission and scheduling in Kubernetes using LimitRanges, node affinity, Pod Disruption Budgets, and taints and tolerations. These techniques provide fine-grained control over resource consumption, pod placement, and application availability, essential for managing Kubernetes clusters effectively. These concepts are crucial for the CKA exam and for real-world Kubernetes deployments.