Killercoda Deployment Scaling Issue

I am planning to take the CKA exam in the near future. I work with Kubernetes daily at my job, but I am mostly self-taught, so it is realistic to say that I have some knowledge gaps. This blog is part of my preparation, where I go through all the scenarios on Killercoda.

Deployment Scaling Issue

https://killercoda.com/course-cka/scenario/domain4-deployment-scaling

Create HPA for Deployment

A Deployment api-gateway is running in Namespace team-api with 3 replicas. Create an HPA named api-gateway for this Deployment that scales between 10 and 20 replicas based on 60% average CPU utilisation.

First, let's again check that the specified Deployment exists:

$ k get deployments api-gateway -n team-api
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
api-gateway   3/3     3            3           3m26s

In the documentation for the HorizontalPodAutoscaler (HPA) we learn that it can be used to automatically scale a workload resource based on demand. Horizontal here means that more pods are created to spread out the load. In contrast, a VerticalPodAutoscaler (VPA) would assign more resources to the existing pods to handle the load (see here & see here). The HPA documentation also mentions a dedicated kubectl autoscale command that simplifies the creation of HPAs. Using this command (see here) we can specify that the deployment api-gateway should scale between a minimum of 10 and a maximum of 20 pods (--min=10 & --max=20) to keep the average CPU utilization at 60% (--cpu=60%):

$ k autoscale -n team-api deployment api-gateway --min=10 --max=20 --cpu=60%

This creates a new HPA, which looks as follows:

$ k get hpa -n team-api
NAME          REFERENCE                TARGETS              MINPODS   MAXPODS   REPLICAS   AGE
api-gateway   Deployment/api-gateway   cpu: <unknown>/60%   10        20        3          17s


$ k get hpa -n team-api api-gateway -oyaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway
  namespace: team-api
spec:
  maxReplicas: 20
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 60
        type: Utilization
    type: Resource
  minReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway

And if we look at the deployment, we can see that the number of desired replicas shot up to 10:

$ k get deployments api-gateway -n team-api
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
api-gateway   5/10    5            5           11m

Fix the Scaling Issue

The HPA should have scaled the Deployment to at least 10 replicas, but not all Pods are running. Investigate why the Deployment cannot reach the desired replica count. Fix the issue so that the Namespace can run up to the HPA’s maximum of 20 Pods.

Looking at the deployment, we see that never more than 5 pods end up in the ready state:

$ k get deployments api-gateway -n team-api
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
api-gateway   5/10    5            5           11m

Neither the Deployment's nor the HPA's events give us any indication of why this could be the case:

$ k describe deployments api-gateway -n team-api
...
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  11m   deployment-controller  Scaled up replica set api-gateway-bdb7449d9 from 0 to 3
  Normal  ScalingReplicaSet  76s   deployment-controller  Scaled up replica set api-gateway-bdb7449d9 from 3 to 10

$ k describe hpa api-gateway -n team-api
...
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  4m    horizontal-pod-autoscaler  New size: 10; reason: Current number of replicas below Spec.MinReplicas

However, if we look at the events of the ReplicaSet that belongs to the Deployment, we can see the error Error creating: pods "[...]" is forbidden: exceeded quota: ns-quota:

$ k get rs -n team-api
NAME                    DESIRED   CURRENT   READY   AGE
api-gateway-bdb7449d9   10        5         5       17m

$ k describe rs api-gateway-bdb7449d9 -n team-api
...
Events:
  Type     Reason            Age                   From                   Message
  ----     ------            ----                  ----                   -------
  Normal   SuccessfulCreate  18m                   replicaset-controller  Created pod: api-gateway-bdb7449d9-dpk44
  Normal   SuccessfulCreate  18m                   replicaset-controller  Created pod: api-gateway-bdb7449d9-4zp5h
  Normal   SuccessfulCreate  18m                   replicaset-controller  Created pod: api-gateway-bdb7449d9-729f8
  Normal   SuccessfulCreate  8m                    replicaset-controller  Created pod: api-gateway-bdb7449d9-kzd9v
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-dkvbh" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Normal   SuccessfulCreate  8m                    replicaset-controller  Created pod: api-gateway-bdb7449d9-ffqw9
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-lc4lw" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-hkdnm" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-ln8cb" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-6qmft" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-bbqk8" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      8m                    replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-wg9nd" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      7m59s                 replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-v4ccw" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      7m58s                 replicaset-controller  Error creating: pods "api-gateway-bdb7449d9-r9cwg" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
  Warning  FailedCreate      3m8s (x8 over 7m57s)  replicaset-controller  (combined from similar events): Error creating: pods "api-gateway-bdb7449d9-wl8l6" is forbidden: exceeded quota: ns-quota, requested: pods=1, used: pods=5, limited: pods=5
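
When the failure is buried in a long describe output, it can help to list only the warning events of the namespace. A minimal sketch, assuming the plain kubectl binary instead of the k alias; the function name warning_events is made up for this post:

```shell
# List only Warning events in a namespace, oldest first.
# warning_events is a hypothetical helper name, not part of kubectl.
warning_events() {
  kubectl get events -n "$1" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}
```

Calling warning_events team-api should then surface the FailedCreate entries directly, without paging through the successful pod creations.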

Reading up on the documentation again, we find ResourceQuotas, which define hard resource limits at the namespace level. These can cap CPU or memory resources as well as the number of pods that may be started (see here & see here). Checking for existing ResourceQuotas, we find one named ns-quota which limits the number of pods to 5:

$ k get resourcequota -A
NAMESPACE   NAME       REQUEST     LIMIT                 AGE
team-api    ns-quota   pods: 5/5   limits.cpu: 250m/10   22m

$ k get resourcequota -n team-api ns-quota -oyaml
apiVersion: v1
kind: ResourceQuota
metadata:
  creationTimestamp: "2026-05-16T09:52:32Z"
  name: ns-quota
  namespace: team-api
  resourceVersion: "6701"
  uid: 45220fd6-3aad-4ba7-af3e-9e93c2f8a3ef
spec:
  hard:
    limits.cpu: "10"
    pods: "5"
status:
  hard:
    limits.cpu: "10"
    pods: "5"
  used:
    limits.cpu: 250m
    pods: "5"
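
The quota also contains a limits.cpu entry, so it is worth checking that it does not become the next bottleneck once more pods are allowed. A rough calculation, assuming every pod keeps the CPU limit implied by the status above (250m used by 5 pods):

```shell
# Values taken from the ns-quota status shown above.
used_millicores=250      # status.used limits.cpu
pods_running=5           # status.used pods
quota_millicores=10000   # limits.cpu: "10" equals 10000m
max_pods=20              # the HPA's maxReplicas

per_pod=$(( used_millicores / pods_running ))
needed=$(( per_pod * max_pods ))
echo "per pod: ${per_pod}m, needed for ${max_pods} pods: ${needed}m of ${quota_millicores}m"
```

With 50m per pod, 20 pods need 1000m, which is well below the 10 CPUs the quota allows, so only the pods entry has to change.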

To fix this, we can simply raise the pods limit to 20:

$ k edit resourcequota -n team-api ns-quota -oyaml
apiVersion: v1
kind: ResourceQuota
metadata:
  creationTimestamp: "2026-05-16T09:52:32Z"
  name: ns-quota
  namespace: team-api
  resourceVersion: "8700"
  uid: 45220fd6-3aad-4ba7-af3e-9e93c2f8a3ef
spec:
  hard:
    limits.cpu: "10"
    pods: "20"
status:
  hard:
    limits.cpu: "10"
    pods: "20"
  used:
    limits.cpu: 250m
    pods: "5"
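
The same change can also be made without opening an editor, which is handy under exam time pressure. A sketch using kubectl patch with a merge patch; the wrapper function raise_pod_quota is a hypothetical name and uses the plain kubectl binary rather than the k alias:

```shell
# Set spec.hard.pods on a ResourceQuota via a JSON merge patch.
# Arguments: namespace, quota name, new pod limit.
raise_pod_quota() {
  kubectl patch resourcequota "$2" -n "$1" --type merge \
    -p "{\"spec\":{\"hard\":{\"pods\":\"$3\"}}}"
}
```

For this scenario the call would be raise_pod_quota team-api ns-quota 20. A merge patch only touches the pods key, so limits.cpu stays as it is.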

After deleting the ReplicaSet, the Deployment recreates it and we can see that the expected 10 replicas are started:

$ k delete rs api-gateway-bdb7449d9 -n team-api
replicaset.apps "api-gateway-bdb7449d9" deleted from team-api namespace
$ k get rs -n team-api
NAME                    DESIRED   CURRENT   READY   AGE
api-gateway-bdb7449d9   10        10        0       8s
$ k get rs -n team-api
NAME                    DESIRED   CURRENT   READY   AGE
api-gateway-bdb7449d9   10        10        6       11s
$ k get rs -n team-api
NAME                    DESIRED   CURRENT   READY   AGE
api-gateway-bdb7449d9   10        10        10      12s

Trigger HPA Scale Up

Cause enough CPU load on the Pods so that the HPA scales the Deployment up by at least one replica. The Pods run image polinux/stress which means you can run stress --cpu 1 in them to cause CPU load.

We can do this by running the stress binary in the background in a few of the containers. If we then check the resource consumption of the pods using kubectl top, we can see that they hit their CPU limit of 50m and that new pods are already being started:

$ (k exec -n team-api api-gateway-bdb7449d9-dgj2n -- stress --cpu 1) &
$ (k exec -n team-api api-gateway-bdb7449d9-bqtms -- stress --cpu 1) &
$ (k exec -n team-api api-gateway-bdb7449d9-dgj2n -- stress --cpu 1) &
$ (k exec -n team-api api-gateway-bdb7449d9-j78fk -- stress --cpu 1) &

$ k top pod -n team-api
NAME                          CPU(cores)   MEMORY(bytes)   
api-gateway-bdb7449d9-8qcjj   51m          0Mi             
api-gateway-bdb7449d9-bqtms   50m          0Mi             
api-gateway-bdb7449d9-dgj2n   46m          0Mi             
api-gateway-bdb7449d9-j78fk   21m          0Mi             
api-gateway-bdb7449d9-k85nc   0m           0Mi             
api-gateway-bdb7449d9-mgww4   0m           0Mi             
api-gateway-bdb7449d9-pwt5j   0m           0Mi             
api-gateway-bdb7449d9-qqrlt   0m           0Mi             
api-gateway-bdb7449d9-rmbkj   0m           0Mi             
api-gateway-bdb7449d9-s5txg   2m           0Mi             
api-gateway-bdb7449d9-w8f2m   0m           0Mi             
api-gateway-bdb7449d9-z9qnp   0m           0Mi             
api-gateway-bdb7449d9-zxgqv   0m           0Mi   
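
Starting the load pod by pod gets tedious, so the exec calls can also be wrapped in a loop. This is only a sketch: the function name start_stress is made up, it assumes that every pod in the namespace belongs to the Deployment, and it adds the --timeout option of stress so the load stops on its own:

```shell
# Run stress in the first N pods of a namespace, in parallel.
# Arguments: namespace, number of pods, duration (for example 120s).
start_stress() {
  for pod in $(kubectl get pods -n "$1" -o name | head -n "$2"); do
    kubectl exec -n "$1" "$pod" -- stress --cpu 1 --timeout "$3" &
  done
  wait  # block until all stress runs have finished
}
```

For example, start_stress team-api 4 120s loads four pods for two minutes, which leaves time to watch the HPA react in a second terminal.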

In the events of the HPA we can also see the rescale events:

$ k describe hpa api-gateway -n team-api
...
 Normal   SuccessfulRescale             104s   horizontal-pod-autoscaler  New size: 13; reason: cpu resource utilization (percentage of request) above target
 Normal   SuccessfulRescale             89s    horizontal-pod-autoscaler  New size: 20; reason: cpu resource utilization (percentage of request) above target
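
The sizes in these events can be understood with the formula from the HPA documentation: desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to the min and max of the HPA. A small sketch of that calculation follows; the utilization values 75 and 100 are hypothetical (the real averages were not recorded here), and the tolerance band and stabilization behaviour of the real controller are ignored:

```shell
# Arguments: current replicas, current avg utilization (%),
# target utilization (%), minReplicas, maxReplicas.
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Integer ceiling of current * util / target.
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 10 75 60 10 20    # assumed 75% average utilization
desired_replicas 13 100 60 10 20   # assumed 100% average utilization
```

With these assumed inputs the function prints 13 and 20. The second result would be 22 without clamping, which shows how maxReplicas caps the scale-up.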
