Kubernetes aims to provide both resilience and scalability. It achieves this by deploying multiple pods with different resource allocations, to provide redundancy for your applications. Although you can grow and shrink your own deployments manually based on your needs, Kubernetes provides first-class support for scaling on demand, using a feature called Horizontal Pod Autoscaling. It is a closed-loop system that automatically grows or shrinks resources (application Pods) based on your current needs. You create a HorizontalPodAutoscaler (or HPA) resource for each application deployment that needs autoscaling, and let it take care of the rest for you automatically.
At a high level, HPA does the following:

- Watches the resource metrics of your application workloads (Pods), by querying the metrics server.
- Compares the observed average utilization (for example, CPU or memory) against the target threshold you set in the HPA definition.
- Scales your deployment up when the threshold is reached, and back down when utilization falls below it.
Under the hood, a HorizontalPodAutoscaler is a built-in Kubernetes API resource that drives a control loop implemented via a dedicated controller within the Control Plane of your cluster. You create a HorizontalPodAutoscaler YAML manifest targeting your application Deployment, and then use kubectl to apply the HPA resource in your cluster.
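For example, if you saved such a manifest as my-app-hpa.yaml (a hypothetical filename used here for illustration), applying it would look like this:

- kubectl apply -f my-app-hpa.yaml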
In order to work, HPA needs a metrics server available in your cluster to scrape required metrics, such as CPU and memory utilization. One straightforward option is the Kubernetes Metrics Server. The Metrics Server works by collecting resource metrics from Kubelets and exposing them via the Kubernetes API Server to the Horizontal Pod Autoscaler. The Metrics API can also be accessed via kubectl top if needed.
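For example, once a metrics server is installed and serving data, you can check current node and Pod resource usage directly (output will vary by cluster):

- kubectl top nodes
-
- kubectl top pods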
In this tutorial, you will:

- Deploy Metrics Server to your Kubernetes cluster via Helm.
- Learn how to define a HorizontalPodAutoscaler resource for your application deployments.
- Test HPA in two different scenarios: one generating constant application load, and one simulating external load.
To follow this tutorial, you will need:
A Kubernetes cluster with role-based access control (RBAC) enabled. This setup will use a DigitalOcean Kubernetes cluster, but you could also create a cluster manually. Your Kubernetes version should be between 1.20 and 1.25.
The kubectl command-line tool installed in your local environment and configured to connect to your cluster. You can read more about installing kubectl in the official documentation. If you are using a DigitalOcean Kubernetes cluster, please refer to How to Connect to a DigitalOcean Kubernetes Cluster to learn how to connect to your cluster using kubectl.
The version control tool Git available in your development environment. If you are working in Ubuntu, you can refer to installing Git on Ubuntu 22.04.
The Kubernetes Helm package manager also available in your development environment. You can refer to how to install software with Helm to install Helm locally.
You’ll start by adding the metrics-server repository to your helm package lists, using helm repo add:
- helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server
Next, use helm repo update to refresh the available packages:
- helm repo update metrics-server
Output
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "metrics-server" chart repository
Update Complete. ⎈Happy Helming!⎈
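You can confirm that the newly added chart is available by searching your local repository list:

- helm search repo metrics-server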
Now that you’ve added the repository to helm, you’ll be able to add metrics-server to your Kubernetes deployments. You could write your own deployment configuration here, but this tutorial will follow DigitalOcean’s Kubernetes Starter Kit, which includes a configuration for metrics-server.
To do that, clone the Kubernetes Starter Kit Git repository:
- git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
The metrics-server configuration is located in Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/metrics-server-values-v3.8.2.yaml. You can view or edit it by using nano or your favorite text editor:
- nano Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/metrics-server-values-v3.8.2.yaml
It contains a few stock parameters. Note that replicas is a fixed value, 2.
## Starter Kit metrics-server configuration
## Ref: https://github.com/kubernetes-sigs/metrics-server/blob/metrics-server-helm-chart-3.8.2/charts/metrics-server
##

# Number of metrics-server replicas to run
replicas: 2

apiService:
  # Specifies if the v1beta1.metrics.k8s.io API service should be created.
  #
  # You typically want this enabled! If you disable API service creation you have to
  # manage it outside of this chart for e.g horizontal pod autoscaling to
  # work with this release.
  create: true

hostNetwork:
  # Specifies if metrics-server should be started in hostNetwork mode.
  #
  # You would require this enabled if you use alternate overlay networking for pods and
  # API server unable to communicate with metrics-server. As an example, this is required
  # if you use Weave network on EKS
  enabled: false
Refer to the Metrics Server chart page for an explanation of the available parameters for metrics-server.
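As an aside, if you only need to change one or two of these parameters, you could also override them at install time using Helm's --set flag rather than editing the values file – for example, to run a single replica (an illustrative alternative, not used in this tutorial):

- helm install metrics-server metrics-server/metrics-server --set replicas=1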
Note: You need to be fairly careful when matching Kubernetes deployments to your running version of Kubernetes, and the helm charts themselves are also versioned to enforce this. The current upstream helm chart for metrics-server is 3.8.2, which deploys version 0.6.1 of metrics-server itself. From the Metrics Server Compatibility Matrix, you can see that version 0.6.x supports Kubernetes 1.19+.
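To see which chart versions are available from the repository you added earlier, you can ask helm to list them:

- helm search repo metrics-server --versions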
After you’ve reviewed the file and made any changes, you can proceed with deploying metrics-server, by providing this file along with the helm install command:
- HELM_CHART_VERSION="3.8.2"
-
- helm install metrics-server metrics-server/metrics-server --version "$HELM_CHART_VERSION" \
- --namespace metrics-server \
- --create-namespace \
- -f "Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/metrics-server-values-v${HELM_CHART_VERSION}.yaml"
This will deploy metrics-server to your configured Kubernetes cluster:
Output
NAME: metrics-server
LAST DEPLOYED: Wed May 25 11:54:43 2022
NAMESPACE: metrics-server
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
***********************************************************************
* Metrics Server *
***********************************************************************
Chart version: 3.8.2
App version: 0.6.1
Image tag: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
***********************************************************************
After deploying, you can use helm ls to verify that metrics-server has been added to your deployment:
- helm ls -n metrics-server
Output
NAME            NAMESPACE       REVISION  UPDATED                               STATUS    CHART                 APP VERSION
metrics-server  metrics-server  1         2022-02-24 14:58:23.785875 +0200 EET  deployed  metrics-server-3.8.2  0.6.1
Next, you can check the status of all of the Kubernetes resources deployed to the metrics-server namespace:
- kubectl get all -n metrics-server
Based on the configuration you deployed with, both the deployment.apps and replicaset.apps values should show 2 available instances.
Output
NAME                                  READY   STATUS    RESTARTS   AGE
pod/metrics-server-694d47d564-9sp5h   1/1     Running   0          8m54s
pod/metrics-server-694d47d564-cc4m2   1/1     Running   0          8m54s

NAME                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/metrics-server   ClusterIP   10.245.92.63   <none>        443/TCP   8m54s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/metrics-server   2/2     2            2           8m55s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/metrics-server-694d47d564   2         2         2       8m55s
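You can also verify that the v1beta1.metrics.k8s.io API service described in the chart values was registered and is reporting as available:

- kubectl get apiservice v1beta1.metrics.k8s.io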
You have now deployed metrics-server into your Kubernetes cluster. In the next step, you’ll review some of the parameters of a HorizontalPodAutoscaler resource definition.
So far, your configurations have used a fixed value for the number of ReplicaSet instances to deploy. In this step, you will learn how to define a HorizontalPodAutoscaler manifest so that this value can dynamically grow or shrink.

A typical HorizontalPodAutoscaler manifest looks like this:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app-deployment
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
The parameters used in this configuration are as follows:

- spec.scaleTargetRef: A named reference to the resource being scaled.
- spec.minReplicas: The lower limit for the number of replicas to which the autoscaler can scale down.
- spec.maxReplicas: The upper limit.
- spec.metrics.type: The metric to use to calculate the desired replica count. This example uses the Resource type, which tells the HPA to scale the deployment based on average CPU (or memory) utilization. averageUtilization is set to a threshold value of 50.
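Behind the scenes, the autoscaler uses the standard algorithm from the Kubernetes documentation to compute the desired replica count, clamping the result between minReplicas and maxReplicas. As a worked example with hypothetical numbers, if one replica is averaging 100% CPU utilization against the 50% target above:

desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
                = ceil(1 * 100 / 50)
                = 2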
You have two options to create an HPA for your application deployment:

1. Use the kubectl autoscale command on an existing deployment.
2. Create an HPA YAML manifest, and then use kubectl to apply changes to your cluster.

You’ll try option #1 first, using another configuration from the DigitalOcean Kubernetes Starter Kit. It contains a deployment defined in myapp-test.yaml, which will demonstrate HPA in action by creating some arbitrary CPU load.
You can review that file by using nano or your favorite text editor:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-test
spec:
  selector:
    matchLabels:
      run: myapp-test
  replicas: 1
  template:
    metadata:
      labels:
        run: myapp-test
    spec:
      containers:
        - name: busybox
          image: busybox
          resources:
            limits:
              cpu: 50m
            requests:
              cpu: 20m
          command: ["sh", "-c"]
          args:
            - while [ 1 ]; do
                echo "Test";
                sleep 0.01;
              done
Note the last few lines of this file. They contain some shell syntax to repeatedly print “Test” a hundred times a second, to simulate load. Once you are done reviewing the file, you can deploy it into your cluster using kubectl:
- kubectl apply -f Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/myapp-test.yaml
Next, use kubectl autoscale to create a HorizontalPodAutoscaler targeting the myapp-test deployment:
- kubectl autoscale deployment myapp-test --cpu-percent=50 --min=1 --max=3
Note the arguments passed to this command – this means that your deployment will be scaled between 1 and 3 replicas whenever CPU utilization reaches 50 percent.
You can check if the HPA resource was created by running kubectl get hpa:
- kubectl get hpa
The TARGETS column of the output will eventually show a figure of current usage%/target usage%.
Output
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
myapp-test   Deployment/myapp-test   240%/50%   1         3         3          52s
Note: The TARGETS column value will display <unknown>/50% for a while (around 15 seconds). This is normal, because HPA needs to collect average values over time, and it won’t have enough data before the first 15-second interval. By default, HPA checks metrics every 15 seconds.
You can also observe the logged events that an HPA generates by using kubectl describe:
- kubectl describe hpa myapp-test
Output
Name:                                                  myapp-test
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 28 May 2022 10:10:50 -0800
Reference:                                             Deployment/myapp-test
Metrics:                                               ( current / target )
  resource cpu on pods (as a percentage of request):  240% (48m) / 50%
Min replicas:                                          1
Max replicas:                                          3
Deployment pods:                                       3 current / 3 desired
...
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  17s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  37s   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
This is the kubectl autoscale method. In a production scenario, you should usually use a dedicated YAML manifest to define each HPA instead. This way, you can track changes by having the manifest committed to a Git repository, and modify it as needed.
You will walk through an example of this in the last step of this tutorial. Before moving on, delete the myapp-test deployment and corresponding HPA resource:
- kubectl delete hpa myapp-test
- kubectl delete deployment myapp-test
In this last step, you’ll experiment with two different ways of generating server load and scaling via a YAML manifest:

- An application deployment that creates constant load, by performing some CPU-intensive computations.
- A shell script that simulates external load, by making rapid, successive HTTP calls to a running application.
In this scenario, you will create a sample application implemented using Python, which performs some CPU-intensive computations. Similar to the shell script from the last step, this Python code is included in one of the example manifests from the starter kit. You can open constant-load-deployment-test.yaml using nano or your favorite text editor:
- nano Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/constant-load-deployment-test.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: python-test-code-configmap
data:
  entrypoint.sh: |-
    #!/usr/bin/env python

    import math

    while True:
      x = 0.0001
      for i in range(1000000):
        x = x + math.sqrt(x)
      print(x)
      print("OK!")
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: constant-load-deployment-test
spec:
  selector:
    matchLabels:
      run: python-constant-load-test
  replicas: 1
  template:
    metadata:
      labels:
        run: python-constant-load-test
    spec:
      containers:
        - name: python-runtime
          image: python:alpine3.15
          resources:
            limits:
              cpu: 50m
            requests:
              cpu: 20m
          command:
            - /bin/entrypoint.sh
          volumeMounts:
            - name: python-test-code-volume
              mountPath: /bin/entrypoint.sh
              readOnly: true
              subPath: entrypoint.sh
      volumes:
        - name: python-test-code-volume
          configMap:
            defaultMode: 0700
            name: python-test-code-configmap
The Python code, which repeatedly generates arbitrary square roots, is highlighted above. The deployment will fetch a Docker image hosting the required Python runtime, and then attach a ConfigMap to the application Pod hosting the sample Python script shown earlier.

First, create a separate namespace for this deployment (for better observation), then deploy it via kubectl:
- kubectl create ns hpa-constant-load
-
- kubectl apply -f Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/constant-load-deployment-test.yaml -n hpa-constant-load
Output
configmap/python-test-code-configmap created
deployment.apps/constant-load-deployment-test created
Note: The sample deployment also configures resource requests and limits for the sample application Pods. This is important because HPA logic relies on your Pods having CPU resource requests set. In general, it is advisable to set resource requests and limits for all your application Pods, to avoid unpredictable bottlenecks.
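To make this concrete, utilization percentages are always calculated relative to a Pod's resource requests, not its limits. In the kubectl describe output from the previous step, the Pods requested 20m of CPU while actually consuming 48m, which the HPA reported as 48m / 20m = 240% utilization – well above the 50% target.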
Verify that the deployment was created successfully, and that it’s up and running:
- kubectl get deployments -n hpa-constant-load
Output
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
constant-load-deployment-test   1/1     1            1           8s
Next, you’ll need to deploy another HPA to this cluster. There is an example matched to this scenario in constant-load-hpa-test.yaml, which you can open with nano or your favorite text editor:
- nano Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/constant-load-hpa-test.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: constant-load-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: constant-load-deployment-test
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
Deploy it via kubectl:
- kubectl apply -f Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/constant-load-hpa-test.yaml -n hpa-constant-load
This will create an HPA resource targeting the sample Python deployment. You can check the constant-load-test HPA state via kubectl get hpa:
- kubectl get hpa constant-load-test -n hpa-constant-load
Note the REFERENCE column targeting constant-load-deployment-test, as well as the TARGETS column showing current CPU resource requests versus the threshold value, as in the last example.
Output
NAME                 REFERENCE                                   TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
constant-load-test   Deployment/constant-load-deployment-test   255%/50%   1         3         3          49s
You may also notice that the REPLICAS column value increased from 1 to 3 for the sample application deployment, as stated in the HPA spec. This happened very quickly because the application used in this example generates CPU load very quickly. As in the previous example, you can also inspect logged HPA events using kubectl describe hpa -n hpa-constant-load.
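For example:

- kubectl describe hpa constant-load-test -n hpa-constant-load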
A more interesting and realistic scenario is one in which external load is created. For this final example, you’re going to use a different namespace and set of manifests to avoid reusing any data from the previous test.
This example will use the quote of the moment sample server. Every time an HTTP request is made to this server, it sends a different quote as a response. You’ll create load on your cluster by sending HTTP requests every 1ms. This deployment is included in quote_deployment.yaml. Review this file using nano or your favorite text editor:
- nano Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/quote_deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: quote
spec:
  replicas: 1
  selector:
    matchLabels:
      app: quote
  template:
    metadata:
      labels:
        app: quote
    spec:
      containers:
        - name: quote
          image: docker.io/datawire/quote:0.4.1
          ports:
            - name: http
              containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
            limits:
              cpu: 200m
              memory: 100Mi
---
apiVersion: v1
kind: Service
metadata:
  name: quote
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8080
  selector:
    app: quote
Note that the actual HTTP query script is not contained within the manifest this time – this manifest only provisions the application that the queries will run against, for now. When you are done reviewing the file, create the quote namespace and deployment using kubectl:
- kubectl create ns hpa-external-load
-
- kubectl apply -f Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/quote_deployment.yaml -n hpa-external-load
Verify that the quote application deployment and services are up and running:
- kubectl get all -n hpa-external-load
Output
NAME                        READY   STATUS    RESTARTS   AGE
pod/quote-dffd65947-s56c9   1/1     Running   0          3m5s

NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/quote   ClusterIP   10.245.170.194   <none>        80/TCP    3m5s

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/quote   1/1     1            1           3m5s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/quote-6c8f564ff   1         1         1       3m5s
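Optionally, before generating any load, you can confirm that the quote service responds from inside the cluster using a throwaway Pod (a quick sanity check; the Pod is deleted when the command exits):

- kubectl run -it --rm quote-check --image=busybox -n hpa-external-load -- wget -q -O- http://quote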
Next, you’ll create the HPA for the quote deployment. This is configured in quote-deployment-hpa-test.yaml. Review the file in nano or your favorite text editor:
- nano Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/quote-deployment-hpa-test.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: external-load-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: quote
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20
Note that in this case, there’s a different threshold value set for the CPU utilization resource metric (20%). There is also a different scaling behavior: this configuration alters the scaleDown.stabilizationWindowSeconds behavior, setting it to a lower value of 60 seconds. This is not always needed in practice, but in this case you may want to speed things up to see more quickly how the autoscaler performs the scale-down action. By default, the HorizontalPodAutoscaler has a cooldown period of 5 minutes, which is sufficient in most cases and should avoid fluctuations when replicas are being scaled.
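For reference, the behavior field can shape scale-up in the same way. A minimal sketch (illustrative only, not part of the Starter Kit manifest) that limits the autoscaler to adding at most one Pod every 60 seconds would look like this:

behavior:
  scaleUp:
    policies:
      - type: Pods
        value: 1
        periodSeconds: 60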
When you’re ready, deploy it using kubectl:
- kubectl apply -f Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/manifests/hpa/metrics-server/quote-deployment-hpa-test.yaml -n hpa-external-load
Now, check if the HPA resource is in place and alive:
- kubectl get hpa external-load-test -n hpa-external-load
Output
NAME                 REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
external-load-test   Deployment/quote   1%/20%    1         3         1          108s
Finally, you will run the actual HTTP queries, using the shell script quote_service_load_test.sh. The reason that this shell script was not embedded into the manifest earlier is so that you can observe it running in your cluster while it logs directly to your terminal. Review the script using nano or your favorite text editor:
- nano Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/scripts/quote_service_load_test.sh
#!/usr/bin/env sh
echo
echo "[INFO] Starting load testing in 10s..."
sleep 10
echo "[INFO] Working (press Ctrl+C to stop)..."
kubectl run -i --tty load-generator \
--rm \
--image=busybox \
--restart=Never \
-n hpa-external-load \
-- /bin/sh -c "while sleep 0.001; do wget -q -O- http://quote; done" > /dev/null 2>&1
echo "[INFO] Load testing finished."
For this demonstration, open two separate terminal windows. In the first, run the quote_service_load_test.sh shell script:
- Kubernetes-Starter-Kit-Developers/09-scaling-application-workloads/assets/scripts/quote_service_load_test.sh
Next, in the second window, run a kubectl watch command using the -w flag on the HPA resource:
- kubectl get hpa -n hpa-external-load -w
You should see the load tick upwards and scale automatically:
Output
NAME                 REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
external-load-test   Deployment/quote   1%/20%    1         3         1          2m49s
external-load-test   Deployment/quote   29%/20%   1         3         1          3m1s
external-load-test   Deployment/quote   67%/20%   1         3         2          3m16s
You can observe how the autoscaler kicks in when load increases, and increments the quote server deployment’s replica set to a higher value. As soon as the load generator script is stopped, there’s a cooldown period, and after 1 minute or so the replica count is lowered to the initial value of 1. You can press Ctrl+C to terminate the running script after navigating back to the first terminal window.
In this tutorial, you deployed and observed the behavior of Horizontal Pod Autoscaling (HPA) using Kubernetes Metrics Server under several different scenarios. HPA is an essential component of Kubernetes that helps your infrastructure handle more traffic on an as-needed basis.
Metrics Server has a significant limitation in that it cannot provide any metrics beyond CPU or memory usage. You can further review the Metrics Server documentation to understand how to work within its use cases. If you need to scale using any other metric (such as disk usage or network load), you can use Prometheus via a special adapter named prometheus-adapter.
The Horizontal Pod Autoscaler is a built-in Kubernetes feature that allows you to horizontally scale applications based on one or more monitored metrics. Horizontal scaling means increasing and decreasing the number of replicas. Vertical scaling means increasing and decreasing the compute resources of a single replica.