
Deploy Cluster Autoscaler

Introduction

This guide explains how to deploy the Cluster Autoscaler. Cluster Autoscaler is a tool that automatically adjusts the number of worker nodes in a Kubernetes cluster by:

  • Scaling up when Pods cannot be scheduled due to insufficient resources.
  • Scaling down when worker nodes have been underutilized for a prolonged period of time and their Pods can be moved to other worker nodes.

Switch Cloud Kubernetes (SCK) leverages the Cluster API to manage clusters. Therefore, you will use the Cluster API as the cloud provider when deploying the Cluster Autoscaler.

Step 0: Prerequisites

Helm

Ensure that Helm is installed locally. If it isn't, follow the official Helm installation guide.

Download Templates

Create the following two files and store them in a dedicated folder:

deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: kubernetes.io/hostname
            namespaces:
            - default
values.yaml
autoDiscovery:
  namespace: kube-system
cloudProvider: clusterapi
extraEnv:
  CAPI_GROUP: "cluster.k8s.io"

Step 1: Deploy Cluster Autoscaler

Install Helm Chart

You can deploy the Cluster Autoscaler using the official Helm chart with minimal configuration. To install or upgrade the deployment, run the following command using the provided values.yaml from step 0:

helm upgrade --install cluster-autoscaler cluster-autoscaler \
  --repo https://kubernetes.github.io/autoscaler \
  --namespace kube-system \
  --post-renderer=sed \
  --post-renderer-args="-e s/x-k8s/k8s/" \
  --values values.yaml

This will:

  • Install or upgrade the Cluster Autoscaler release using the official Helm chart.
  • Use the kube-system Namespace for installation.
  • Apply your values.yaml configuration for provider, autodiscovery, and environment variables.
  • Patch the rendered Kubernetes manifests using sed to replace all occurrences of the x-k8s.io API group with k8s.io. This is required for compatibility with SCK clusters.
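If you want to see what the post-renderer changes, you can run the same sed expression on a sample manifest line by hand. This is only an illustration of the substitution, not part of the deployment; the sample `apiVersion` line stands in for any rendered manifest that references the x-k8s.io API group:

```shell
# Simulate the post-renderer: Helm pipes each rendered manifest through sed,
# which rewrites the cluster.x-k8s.io API group to cluster.k8s.io.
echo "apiVersion: cluster.x-k8s.io/v1beta1" | sed -e "s/x-k8s/k8s/"
# Prints: apiVersion: cluster.k8s.io/v1beta1
```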

Warning

Whenever you update or redeploy the Helm chart, don't forget to use the post-renderer with its arguments.

Example output
Release "cluster-autoscaler" does not exist. Installing it now.
NAME: cluster-autoscaler
LAST DEPLOYED: Mon May  5 16:12:41 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To verify that cluster-autoscaler has started, run:

  kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=clusterapi-cluster-autoscaler,app.kubernetes.io/instance=cluster-autoscaler"       

As mentioned in the example output above, you can check the status of your installation by running:

kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=clusterapi-cluster-autoscaler,app.kubernetes.io/instance=cluster-autoscaler"
Example output
NAME                                                              READY   STATUS    RESTARTS   AGE
cluster-autoscaler-clusterapi-cluster-autoscaler-754c64c75djvjl   1/1     Running   0          117s

Enable Autoscaling

To enable automatic scaling of worker nodes, you need to annotate your MachineDeployment.

First, use the following command to find the name of your MachineDeployment:

kubectl get machinedeployments --namespace kube-system
Example output
NAME                 REPLICAS   AVAILABLE-REPLICAS   PROVIDER    OS       KUBELET   AGE
infallible-volhard   2          2                    openstack   ubuntu   1.32.4    9d

Then, annotate your MachineDeployment with the desired minimum and maximum number of worker nodes:

kubectl annotate machinedeployment <md_name> cluster.k8s.io/cluster-api-autoscaler-node-group-min-size="2" --namespace kube-system

This sets the minimum number of worker nodes for the specified MachineDeployment. The Cluster Autoscaler will not scale the MachineDeployment below this number.

kubectl annotate machinedeployment <md_name> cluster.k8s.io/cluster-api-autoscaler-node-group-max-size="5" --namespace kube-system

This sets the maximum number of worker nodes for the specified MachineDeployment. The Cluster Autoscaler will not scale the MachineDeployment above this number.
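After running both commands, the metadata of your MachineDeployment should carry annotations along these lines. This is a sketch for orientation only; the name `infallible-volhard` is taken from the example output above, and the `apiVersion` assumes the cluster.k8s.io API group configured in values.yaml:

```yaml
# Excerpt of an annotated MachineDeployment (sketch; names may differ)
apiVersion: cluster.k8s.io/v1alpha1
kind: MachineDeployment
metadata:
  name: infallible-volhard
  namespace: kube-system
  annotations:
    cluster.k8s.io/cluster-api-autoscaler-node-group-min-size: "2"
    cluster.k8s.io/cluster-api-autoscaler-node-group-max-size: "5"
```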

Step 2: Create Deployment to Verify Autoscaling

You will now deploy an nginx Deployment with multiple replicas. To trigger the Cluster Autoscaler, you will make use of inter-pod anti-affinity rules.

excerpt from deployment.yaml

...
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
        topologyKey: kubernetes.io/hostname
        namespaces:
        - default

By specifying podAntiAffinity, you make sure that the scheduler will avoid scheduling a Pod onto a node if there is already a Pod with the label app=nginx on that node.

Important

To see the Cluster Autoscaler in action, set the number of Deployment replicas so that it exceeds the number of worker nodes in your cluster. This guide assumes your cluster initially has two worker nodes, so you will deploy three replicas to trigger scaling up.

Apply the Deployment:

kubectl apply --filename deployment.yaml

If you check the status of the deployed Pods:

kubectl get pods

You will notice that one of the Pods is in the Pending state:

Example output
NAME                           READY   STATUS    RESTARTS   AGE
nginx-855f88d87-8lpvc          1/1     Running   0          12s
nginx-855f88d87-6n9mk          1/1     Running   0          12s
nginx-855f88d87-9hcbn          0/1     Pending   0          12s

This is expected: with two worker nodes and the anti-affinity rule, there is no node left on which the third Pod can be scheduled.

The Cluster Autoscaler will check the state of the Pods, discover that some are in the Pending state, and try to provision new worker nodes in the cluster. You can observe this behavior in the logs of the Cluster Autoscaler Pod:

kubectl logs --namespace kube-system deployments/cluster-autoscaler-clusterapi-cluster-autoscaler
Example logs
...
I0505 14:34:53.609496       1 klogx.go:87] Pod default/nginx-855f88d87-9hcbn is unschedulable
I0505 14:34:54.215635       1 orchestrator.go:185] Best option to resize: MachineDeployment/kube-system/infallible-volhard
I0505 14:34:54.215836       1 orchestrator.go:189] Estimated 1 nodes needed in MachineDeployment/kube-system/infallible-volhard
I0505 14:34:54.216025       1 orchestrator.go:254] Final scale-up plan: [{MachineDeployment/kube-system/infallible-volhard 2->3 (max: 5)}]
I0505 14:34:54.216129       1 executor.go:166] Scale-up: setting group MachineDeployment/kube-system/infallible-volhard size to 3

To observe scaling down, reduce the replicas in deployment.yaml to two, apply the file, and wait approximately ten minutes for the Cluster Autoscaler to remove the node that is no longer needed.

Conclusion

You have now verified that the Cluster Autoscaler is working as expected. It will increase the size of a cluster if any Pods fail to schedule and decrease the size when some worker nodes are consistently unneeded for a significant amount of time.