Workload Autoscaling
Introduction
This tutorial provides a hands-on guide to configuring workload autoscaling in Kubernetes, focusing on concepts and practical skills relevant to the Certified Kubernetes Administrator (CKA) exam. We will cover Horizontal Pod Autoscaling (HPA), which automatically scales the number of pods in a deployment, replication controller, replica set, or stateful set based on observed CPU utilization, memory utilization, or custom metrics.
Prerequisites:
- A running Kubernetes cluster (minikube, kind, or a cloud-based cluster).
- kubectl configured to connect to your cluster.
- Basic understanding of Kubernetes deployments and services.
Task 1: Deploying a Sample Application
First, we’ll deploy a simple application that we can scale. We’ll use a basic HTTP server.
- Create a deployment configuration file named `php-apache.yaml`:

  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: php-apache
  spec:
    selector:
      matchLabels:
        run: php-apache
    replicas: 1
    template:
      metadata:
        labels:
          run: php-apache
      spec:
        containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example
          ports:
          - containerPort: 80
          resources:
            requests:
              cpu: 200m
            limits:
              cpu: 400m
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: php-apache
    labels:
      run: php-apache
  spec:
    ports:
    - port: 80
      protocol: TCP
    selector:
      run: php-apache
    type: LoadBalancer
  ```

  The `registry.k8s.io/hpa-example` image is a simple PHP-based web server designed for demonstrating autoscaling. The CPU request is essential for the HPA to function: utilization is calculated as a percentage of each pod's requested CPU, so a pod without a request has no utilization figure to scale on. The limit only caps how much CPU a single pod can consume.

- Apply the deployment and service:

  ```bash
  kubectl apply -f php-apache.yaml
  ```

  ```output
  deployment.apps/php-apache created
  service/php-apache created
  ```

- Verify the deployment and service are running:

  ```bash
  kubectl get deployment php-apache
  kubectl get service php-apache
  ```

  ```output
  NAME         READY   UP-TO-DATE   AVAILABLE   AGE
  php-apache   1/1     1            1           <age>

  NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
  php-apache   LoadBalancer   <cluster-ip>   <pending>     80:30713/TCP   <age>
  ```

  It may take a few minutes for the service to obtain an EXTERNAL-IP, especially in cloud environments. On minikube the EXTERNAL-IP stays `<pending>` unless `minikube tunnel` is running; you can use `minikube service php-apache` to access the service instead.
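The role of the CPU request in the manifest above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of any Kubernetes tooling: the helper name `cpu_utilization_percent` and the usage figures are made up for illustration, while the 200m request comes from `php-apache.yaml`.

```shell
# Utilization as the HPA reports it: observed CPU as a percentage of the
# pod's CPU request. Both arguments are in millicores.
# (Hypothetical helper; the usage values below are illustrative inputs.)
cpu_utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

# A pod using 100m against the 200m request is at 50%.
cpu_utilization_percent 100 200

# A pod using 300m against the same request is at 150%.
cpu_utilization_percent 300 200
```

Because the limit (400m) is twice the request (200m), a single pod in this deployment can report up to 200% utilization before it is throttled.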
Task 2: Creating a Horizontal Pod Autoscaler (HPA)
Now, we’ll create an HPA that automatically scales the php-apache deployment based on CPU utilization.
- Create an HPA using `kubectl autoscale`:

  ```bash
  kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
  ```

  This command creates an HPA that targets the `php-apache` deployment. The HPA adds or removes replicas, between a minimum of 1 and a maximum of 10, to keep the average CPU utilization across all pods close to 50% of the requested CPU.

- Verify the HPA:

  ```bash
  kubectl get hpa php-apache
  ```

  ```output
  NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
  php-apache   Deployment/php-apache   0%/50%    1         10        1          <age>
  ```

  The `TARGETS` column might show `<unknown>/50%` initially, because the metrics server needs time to collect CPU utilization data. If the value stays `<unknown>`, check that the metrics server is installed in your cluster (on minikube, `minikube addons enable metrics-server`).
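The same autoscaler can also be written declaratively, which is useful when the HPA should live in version control next to the deployment. The sketch below assumes a cluster that serves the `autoscaling/v2` API; the file name `php-apache-hpa.yaml` is an arbitrary choice, and the values mirror the `kubectl autoscale` flags used in this task.

```shell
# Write an HPA manifest equivalent to:
#   kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# The file name is a hypothetical choice for this example.
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF

# To use it in place of the imperative command (requires a cluster):
#   kubectl apply -f php-apache-hpa.yaml
```

Use one approach or the other, not both: applying this manifest after running `kubectl autoscale` would target the HPA object that already exists under the same name.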
Task 3: Generating Load and Observing Autoscaling
To trigger the autoscaling, we need to generate load on the php-apache service.
- Run a load generator in a separate terminal:

  ```bash
  kubectl run -i --tty load-generator --image=busybox --restart=Never -- /bin/sh
  ```

- Inside the `load-generator` pod, use `wget` to generate traffic:

  ```bash
  while true; do wget -q -O- <php-apache-external-ip>; done
  ```

  Replace `<php-apache-external-ip>` with the external IP address of your `php-apache` service. Because the load generator runs inside the cluster, the service's DNS name also works as the target (`http://php-apache`, assuming both are in the same namespace), and it is the simplest choice on minikube. Alternatively, `minikube service php-apache --url` prints a URL; use both the address and the port it reports.

- Observe the HPA scaling the deployment:

  ```bash
  kubectl get hpa php-apache -w
  ```

  The `-w` flag watches for changes. You should see the `REPLICAS` column increase as the CPU utilization rises above 50%.

  ```output
  NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
  php-apache   Deployment/php-apache   23%/50%   1         10        1          <age>
  php-apache   Deployment/php-apache   67%/50%   1         10        2          <age>
  php-apache   Deployment/php-apache   81%/50%   1         10        3          <age>
  ```

  Autoscaling may take a few minutes to kick in. Be patient and observe the `TARGETS` and `REPLICAS` columns.

- Verify the number of pods:

  ```bash
  kubectl get pods -l run=php-apache
  ```

  You should see the number of pods increasing.

- Stop the load generator: press Ctrl+C to end the loop, then type `exit` in the shell to leave the pod. Once the load stops, the HPA scales the deployment back down, but only after a stabilization window (five minutes by default).
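The replica counts observed in this task follow from the rule the HPA applies on each evaluation: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The sketch below reproduces that arithmetic with integer math. The function name `desired_replicas` is made up for illustration, and the sketch deliberately ignores the controller's tolerance band (10% by default) and stabilization behavior, so treat it as an approximation of the real controller.

```shell
# desired = ceil(current_replicas * current_util / target_util),
# clamped to [min, max]. Utilization values are whole percentages.
# (Hypothetical helper; tolerance and stabilization are not modeled.)
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# 1 replica at 67% with a 50% target: ceil(1.34) = 2
desired_replicas 1 67 50 1 10

# 3 replicas at 81%: ceil(4.86) = 5
desired_replicas 3 81 50 1 10

# 4 replicas at 250%: ceil(20) = 20, capped at the maximum of 10
desired_replicas 4 250 50 1 10
```

The `REPLICAS` column in `kubectl get hpa -w` shows the current count at each snapshot, so it can lag behind the value this formula yields while new pods are still starting.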
Task 4: Cleaning Up
After you’re done experimenting, clean up the resources:
- Delete the HPA:

  ```bash
  kubectl delete hpa php-apache
  ```

- Delete the deployment and service:

  ```bash
  kubectl delete -f php-apache.yaml
  ```

- Delete the load generator pod:

  ```bash
  kubectl delete pod load-generator
  ```
Conclusion
In this tutorial, you learned how to configure workload autoscaling in Kubernetes using Horizontal Pod Autoscaling (HPA). You deployed a sample application, created an HPA based on CPU utilization, generated load to trigger autoscaling, and observed the scaling process. This hands-on experience provides a solid foundation for understanding and implementing autoscaling in Kubernetes, a crucial skill for the CKA certification. You should now be able to apply these principles to scale your own applications based on various metrics and resource requirements.