Deploy Cluster Autoscaler
Introduction
This guide explains how to deploy the Cluster Autoscaler. Cluster Autoscaler is a tool that automatically adjusts the number of worker nodes in a Kubernetes cluster by:
- Scaling up when Pods cannot be scheduled due to insufficient resources.
- Scaling down when worker nodes have been underutilized for a prolonged period of time and their Pods can be moved to other worker nodes.
Switch Cloud Kubernetes (SCK) leverages the Cluster API to manage clusters. Therefore, you will use the Cluster API as the cloud provider when deploying the Cluster Autoscaler.
Step 0: Prerequisites
Helm
Ensure that Helm is installed locally. If it isn't, follow the official Helm installation guide.
Download Templates
Download the following templates and store them in a dedicated folder, or copy and paste them from below:
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - nginx
              topologyKey: kubernetes.io/hostname
              namespaces:
                - default
values.yaml
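The values.yaml is provided for download above; its exact contents depend on your cluster. Purely as an illustration, a Cluster API configuration for this chart typically sets the provider, autodiscovery, and environment variables along these lines (the cluster name and exact keys here are assumptions, not the provided file):

```yaml
# Illustrative sketch only - use the values.yaml provided above.
cloudProvider: clusterapi              # use the Cluster API cloud provider
autoDiscovery:
  clusterName: <your-cluster-name>     # Cluster resource the autoscaler manages
extraEnv:
  CAPI_GROUP: cluster.x-k8s.io         # Cluster API group; the sed post-renderer
                                       # below rewrites x-k8s.io to k8s.io for SCK
```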
Step 1: Deploy Cluster Autoscaler
Install Helm Chart
You can deploy the Cluster Autoscaler using the official Helm chart with minimal configuration. To install or upgrade the deployment, run the following command, using the values.yaml provided in step 0:
helm upgrade --install cluster-autoscaler cluster-autoscaler \
--repo https://kubernetes.github.io/autoscaler \
--namespace kube-system \
--post-renderer=sed \
--post-renderer-args="-e s/x-k8s/k8s/" \
--values values.yaml
This will:
- Install or upgrade the Cluster Autoscaler release using the official Helm chart.
- Use the kube-system Namespace for installation.
- Apply your values.yaml configuration for provider, autodiscovery, and environment variables.
- Patch the rendered Kubernetes manifests using sed to replace all occurrences of the x-k8s.io API group with k8s.io. This is required for compatibility with SCK clusters.
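To see what the post-renderer does, you can run the same sed expression over a sample manifest line (the input line here is only an example of what the chart renders):

```shell
# The post-renderer pipes every rendered manifest through this sed expression,
# rewriting the Cluster API group from the x-k8s.io form to k8s.io.
echo "apiVersion: cluster.x-k8s.io/v1alpha3" | sed -e 's/x-k8s/k8s/'
# prints: apiVersion: cluster.k8s.io/v1alpha3
```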
Warning
Whenever you update or redeploy the Helm chart, don't forget to use the post-renderer with its arguments.
Example output
Release "cluster-autoscaler" does not exist. Installing it now.
NAME: cluster-autoscaler
LAST DEPLOYED: Mon May 5 16:12:41 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To verify that cluster-autoscaler has started, run:
kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=clusterapi-cluster-autoscaler,app.kubernetes.io/instance=cluster-autoscaler"
As mentioned in the example output above, you can check the status of your installation by running:
kubectl --namespace=kube-system get pods -l "app.kubernetes.io/name=clusterapi-cluster-autoscaler,app.kubernetes.io/instance=cluster-autoscaler"
Example output
Enable Autoscaling
To enable automatic scaling of worker nodes, you need to annotate your MachineDeployment.
First, use the following command to find the name of your MachineDeployment:
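A standard way to list MachineDeployments (the annotate commands below assume they live in the kube-system Namespace):

```shell
kubectl get machinedeployments --namespace kube-system
```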
Example output
Then, annotate your MachineDeployment with the desired minimum and maximum number of worker nodes:
kubectl annotate machinedeployment <md_name> cluster.k8s.io/cluster-api-autoscaler-node-group-min-size="2" --namespace kube-system
This sets the minimum number of worker nodes for the specified MachineDeployment. The Cluster Autoscaler will not scale the MachineDeployment below this number.
kubectl annotate machinedeployment <md_name> cluster.k8s.io/cluster-api-autoscaler-node-group-max-size="5" --namespace kube-system
This sets the maximum number of worker nodes for the specified MachineDeployment. The Cluster Autoscaler will not scale the MachineDeployment above this number.
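To double-check the configured bounds, you can print the annotations on the MachineDeployment (the jsonpath query is illustrative; `<md_name>` is the name found above):

```shell
kubectl get machinedeployment <md_name> --namespace kube-system \
  -o jsonpath='{.metadata.annotations}'
```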
Step 2: Create Deployment to Verify Autoscaling
You will now deploy an nginx Deployment with multiple replicas. To trigger the Cluster Autoscaler, you will make use of inter-pod anti-affinity rules.
excerpt from deployment.yaml
By specifying podAntiAffinity with requiredDuringSchedulingIgnoredDuringExecution, you ensure that the scheduler will not place a Pod onto a node that already runs a Pod with the label app=nginx.
Important
To see the Cluster Autoscaler in action, set the number of Deployment replicas to one more than the number of worker nodes in your cluster. This guide assumes your cluster initially has two worker nodes, so you will deploy three replicas to trigger a scale-up.
Apply the Deployment:
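Assuming the manifest is saved as deployment.yaml, as in step 0:

```shell
kubectl apply -f deployment.yaml
```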
If you check the status of the deployed Pods:
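For example, by listing the Pods in the default Namespace:

```shell
kubectl get pods --namespace default
```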
You will notice that one of the Pods is in the Pending
state:
Example output
This is expected since no node exists on which the Pod can be scheduled.
The Cluster Autoscaler will check the state of the Pods, discover that some are in the Pending
state, and try to provision new worker nodes in the cluster.
You can observe this behavior in the logs of the Cluster Autoscaler Pod:
Example logs
...
I0505 14:34:53.609496 1 klogx.go:87] Pod default/nginx-84bb4654bc-6n9mk is unschedulable
I0505 14:34:54.215635 1 orchestrator.go:185] Best option to resize: MachineDeployment/kube-system/infallible-volhard
I0505 14:34:54.215836 1 orchestrator.go:189] Estimated 1 nodes needed in MachineDeployment/kube-system/infallible-volhard
I0505 14:34:54.216025 1 orchestrator.go:254] Final scale-up plan: [{MachineDeployment/kube-system/infallible-volhard 2->3 (max: 5)}]
I0505 14:34:54.216129 1 executor.go:166] Scale-up: setting group MachineDeployment/kube-system/infallible-volhard size to 3
To observe scaling down, modify deployment.yaml to have two replicas, apply the file again, and wait approximately ten minutes for the Cluster Autoscaler to remove the node that is no longer needed.
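Alternatively, you can scale the Deployment directly without editing the file (same effect, using the Deployment name nginx from step 0):

```shell
kubectl scale deployment nginx --replicas=2 --namespace default
```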
Conclusion
You have now verified that the Cluster Autoscaler is working as expected. It will increase the size of a cluster if any Pods fail to schedule and decrease the size when some worker nodes are consistently unneeded for a significant amount of time.