Kubernetes CustomResourceDefinitions and Operators

Introduction

This tutorial demonstrates how to extend Kubernetes using CustomResourceDefinitions (CRDs) and Operators. CRDs allow you to define new resource types within Kubernetes, while Operators provide a way to automate the management of these resources.

Prerequisites:

  • A running Kubernetes cluster (minikube, kind, or a cloud provider).
  • kubectl configured to connect to your cluster.
  • Basic understanding of Kubernetes concepts like Pods, Deployments, and Services.
  • Go programming environment (for building the operator).
  • Operator SDK installed (brew install operator-sdk or follow instructions at https://sdk.operatorframework.io/docs/installation/).

Task 1: Defining a CustomResourceDefinition (CRD)

We’ll start by creating a CRD that defines a new resource type called AppService. This resource will represent an application running in our cluster.

  1. Create a file named appservice.yaml with the following content:
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: appservices.example.com
spec:
  group: example.com
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                name:
                  type: string
                replicas:
                  type: integer
                  minimum: 1
                image:
                  type: string
  scope: Namespaced
  names:
    plural: appservices
    singular: appservice
    kind: AppService
    shortNames:
    - app
```

This YAML defines the structure of our AppService custom resource. It includes the name, replicas, and image fields within the spec. The group and names sections define the API group and the resource names used to access the resource.
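The openAPIV3Schema section is enforced by the API server at admission time. For example, a manifest that violates the minimum: 1 constraint on replicas (a hypothetical appservice-invalid.yaml) would be rejected when applied:

```yaml
apiVersion: example.com/v1alpha1
kind: AppService
metadata:
  name: bad-app
spec:
  name: "bad-app"
  replicas: 0          # violates the "minimum: 1" constraint in the CRD schema
  image: "nginx:latest"
```

Running kubectl apply -f appservice-invalid.yaml fails with a validation error stating that spec.replicas must be greater than or equal to 1.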
  2. Apply the CRD to your cluster:

```bash
kubectl apply -f appservice.yaml
```

Expected output:

```text
customresourcedefinition.apiextensions.k8s.io/appservices.example.com created
```
  3. Verify the CRD has been created:

```bash
kubectl get crds
```

Expected output:

```text
NAME                      CREATED AT
appservices.example.com   2024-10-27T12:00:00Z
```

Task 2: Creating an Instance of the Custom Resource

Now that we have defined the AppService CRD, let’s create an instance of it.

  1. Create a file named my-app.yaml with the following content:
```yaml
apiVersion: example.com/v1alpha1
kind: AppService
metadata:
  name: my-app
spec:
  name: "my-app"
  replicas: 3
  image: "nginx:latest"
```

This YAML defines an AppService resource named my-app, specifying that it should run 3 replicas of the nginx:latest image.
  2. Apply the AppService instance to your cluster:

```bash
kubectl apply -f my-app.yaml
```

Expected output:

```text
appservice.example.com/my-app created
```
  3. Verify the resource has been created:

```bash
kubectl get appservices
```

Expected output:

```text
NAME     AGE
my-app   10s
```
  4. Describe the resource to see its details:

```bash
kubectl describe appservice my-app
```

Expected output:

```text
Name:         my-app
Namespace:    default
...
Spec:
  Image:      nginx:latest
  Name:       my-app
  Replicas:   3
...
```

Task 3: Building an Operator

Now, let’s build an operator to manage our AppService resources. We’ll use the Operator SDK to scaffold the operator project.

  1. Create a new directory for the operator project:
```bash
mkdir appservice-operator && cd appservice-operator
```
  2. Initialize the operator project using the Operator SDK:

```bash
operator-sdk init --domain=example.com --repo=github.com/example/appservice-operator
```

Replace github.com/example/appservice-operator with your actual repository path. This is important for Go module management.
  3. Create the AppService API using the Operator SDK:

```bash
operator-sdk create api --group=example.com --version=v1alpha1 --kind=AppService --resource=true --controller=true
```

Note that the scaffolding forms the full API group by combining --group with the domain passed to init, so the generated group may not match the hand-written CRD from Task 1 exactly; inspect the manifests generated under config/crd (via make manifests) and keep the group consistent throughout the project.
  4. Update the config/samples/example_v1alpha1_appservice.yaml file with the same content as our original my-app.yaml file (from Task 2):

```yaml
apiVersion: example.com/v1alpha1
kind: AppService
metadata:
  name: my-app
spec:
  name: "my-app"
  replicas: 3
  image: "nginx:latest"
```
  5. Implement the operator logic. Edit the controllers/appservice_controller.go file to define how the operator should manage AppService resources. This simplified example shows the basics of creating a Deployment based on the custom resource.

```go
package controllers

import (
	"context"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"

	examplev1alpha1 "github.com/example/appservice-operator/api/v1alpha1"
)

// AppServiceReconciler reconciles an AppService object
type AppServiceReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

//+kubebuilder:rbac:groups=example.com,resources=appservices,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=example.com,resources=appservices/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=example.com,resources=appservices/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch

// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
//
// For more details, check Reconcile and its Result here:
// - https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/reconcile
func (r *AppServiceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	// Fetch the AppService resource
	var appService examplev1alpha1.AppService
	if err := r.Get(ctx, req.NamespacedName, &appService); err != nil {
		log.Error(err, "unable to fetch AppService")
		// we'll ignore not-found errors, since they can't be fixed by an immediate
		// requeue (we'll need to wait for a new notification), and we can get them
		// on deleted requests.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Define a new Deployment object
	deployment := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      appService.Name + "-deployment",
			Namespace: appService.Namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &appService.Spec.Replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{
					"app": appService.Name,
				},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: map[string]string{
						"app": appService.Name,
					},
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{
							Name:  appService.Spec.Name,
							Image: appService.Spec.Image,
						},
					},
				},
			},
		},
	}

	// Set the owner reference so the deployment is deleted when the AppService is deleted
	if err := ctrl.SetControllerReference(&appService, deployment, r.Scheme); err != nil {
		return ctrl.Result{}, fmt.Errorf("failed to set owner reference: %w", err)
	}

	// Check if the Deployment already exists
	existingDeployment := &appsv1.Deployment{}
	err := r.Get(ctx, client.ObjectKey{Name: deployment.Name, Namespace: deployment.Namespace}, existingDeployment)
	if err != nil && client.IgnoreNotFound(err) != nil {
		log.Error(err, "failed to get Deployment")
		return ctrl.Result{}, err
	} else if err == nil {
		// Deployment already exists, update it
		existingDeployment.Spec = deployment.Spec
		if err := r.Update(ctx, existingDeployment); err != nil {
			log.Error(err, "failed to update Deployment")
			return ctrl.Result{}, err
		}
		log.Info("Deployment updated", "name", deployment.Name)
		return ctrl.Result{}, nil
	}

	// Create the Deployment
	log.Info("Creating a new Deployment", "name", deployment.Name)
	if err := r.Create(ctx, deployment); err != nil {
		log.Error(err, "failed to create Deployment")
		return ctrl.Result{}, err
	}

	// Deployment created successfully
	return ctrl.Result{}, nil
}

// SetupWithManager sets up the controller with the Manager.
func (r *AppServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&examplev1alpha1.AppService{}).
		Owns(&appsv1.Deployment{}).
		Complete(r)
}
```

This is a simplified example. A production-ready operator would handle updates, scaling, and error conditions more robustly. It would also implement finalizers to ensure resources are cleaned up properly on deletion.
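Stripped of the Kubernetes machinery, the reconcile loop above follows a simple pattern: read the desired state from the custom resource, read the observed state from the cluster, then create or update to converge the two. A minimal stdlib sketch of that pattern (the types and the map-based "cluster" here are illustrative stand-ins, not controller-runtime APIs):

```go
package main

import "fmt"

// DeploymentState is a stand-in for the parts of a Deployment spec we manage.
type DeploymentState struct {
	Replicas int32
	Image    string
}

// Cluster is a stand-in for the API server's view of existing Deployments.
type Cluster map[string]DeploymentState

// reconcile brings the named deployment in line with the desired state,
// mirroring the create/update branches in AppServiceReconciler.Reconcile.
func reconcile(c Cluster, name string, desired DeploymentState) string {
	observed, exists := c[name]
	switch {
	case !exists:
		c[name] = desired // create the missing object
		return "created"
	case observed != desired:
		c[name] = desired // update the drifted object
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	c := Cluster{}
	desired := DeploymentState{Replicas: 3, Image: "nginx:latest"}
	fmt.Println(reconcile(c, "my-app-deployment", desired)) // created
	fmt.Println(reconcile(c, "my-app-deployment", desired)) // unchanged
	desired.Replicas = 5
	fmt.Println(reconcile(c, "my-app-deployment", desired)) // updated
}
```

Note that running reconcile twice with the same desired state is a no-op. That idempotence is exactly the property a real Reconcile needs: it may be called at any time, for any reason, and must be safe to repeat.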
  6. Build and push the operator image:

```bash
make docker-build docker-push IMG=<your-image-name>
```

Replace <your-image-name> with the full image name, including the registry (e.g., docker.io/your-username/appservice-operator:v1). You must have a Docker Hub account or access to another container registry.
  7. Deploy the operator to your cluster. The make deploy target sets the container image via kustomize when you pass IMG=, so you normally do not need to edit config/manager/manager.yaml by hand; for reference, the image is set in this section of the file:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
  namespace: system
spec:
  ...
  template:
    spec:
      containers:
      - command:
        - /manager
        image: <your-image-name>  # set by `make deploy IMG=...`
        imagePullPolicy: Always
        name: manager
        ...
```
  8. Deploy the operator manifests (CRDs, RBAC, and the manager Deployment):

```bash
make deploy IMG=<your-image-name>
```

Task 4: Verifying the Operator

  1. Check the operator logs to ensure it’s running without errors. By default, the Operator SDK’s kustomize configuration prefixes resources with the project name, so the manager runs in the appservice-operator-system namespace:

```bash
kubectl logs -n appservice-operator-system deployment/appservice-operator-controller-manager -f
```
  2. Create the AppService instance (if you haven’t already) by running kubectl apply -f my-app.yaml.

  3. Verify that the operator creates a Deployment for the AppService:

```bash
kubectl get deployments
```

Expected output:

```text
NAME                READY   UP-TO-DATE   AVAILABLE   AGE
my-app-deployment   3/3     3            3           1m
```
  4. Check the details of the deployment to ensure it matches the specifications in the AppService resource:

```bash
kubectl describe deployment my-app-deployment
```
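Because the controller calls ctrl.SetControllerReference, the Deployment it creates carries an ownerReference back to the AppService; this is what lets Kubernetes garbage-collect the Deployment when the AppService is deleted. In the output of kubectl get deployment my-app-deployment -o yaml, the metadata includes a block roughly like this (the uid shown is a placeholder assigned by your API server):

```yaml
metadata:
  name: my-app-deployment
  ownerReferences:
  - apiVersion: example.com/v1alpha1
    kind: AppService
    name: my-app
    uid: "<set-by-api-server>"
    controller: true
    blockOwnerDeletion: true
```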

Conclusion

You have successfully created a CustomResourceDefinition, deployed an instance of the custom resource, and built an operator to manage the resource. This provides a foundation for extending Kubernetes with your own domain-specific resources and automation logic. Key learnings:

  • CRDs: Allow you to define new resource types in Kubernetes.
  • Operators: Automate the management of CRDs and other Kubernetes resources.
  • Operator SDK: Simplifies the process of building Kubernetes operators.
