Product Security Hardening Guides for Kubernetes

Table of Contents

  1. Ensure the DOKS Cluster Has at Least One Active Admission Controller Using a Policy Engine
  2. Ensure the DOKS Cluster Restricts the Admission of Privileged Pods
  3. Ensure the DOKS Cluster Does Not Allow Containers to Start with the allowPrivilegeEscalation Flag Set
  4. Ensure DOKS Clusters are Upgraded

Ensure the DOKS Cluster Has at Least One Active Admission Controller Using a Policy Engine

Kubernetes admission controllers are plugins that govern and enforce how the cluster is used. They can be thought of as a gatekeeper that intercepts (authenticated) API requests and may change the request object or deny the request altogether. For more information, visit: https://kubernetes.io/blog/2019/03/21/a-guide-to-kubernetes-admission-controllers/

Creating admission controllers may involve editing kube-apiserver or setting up a webhook service. You cannot directly access or modify the kube-apiserver configuration in DOKS because the control plane is fully managed by DigitalOcean.

Unlike setting up a webhook service, using a policy engine offers a Kubernetes-native, declarative approach to defining and enforcing admission policies. It eliminates the need to build, deploy, and maintain webhook servers.

Rationale

Admission controllers are important for several reasons:

  • Security: Admission controllers can increase security by mandating a reasonable security baseline across an entire namespace or cluster.
  • Governance: Admission controllers allow you to enforce adherence to practices such as using good labels, annotations, resource limits, or other settings.
  • Configuration management: Admission controllers allow you to validate the configuration of the objects running in the cluster and prevent obvious misconfigurations from reaching your cluster. They can also be useful in detecting and fixing images deployed without semantic tags.

Impact

While admission controllers are essential for maintaining policy enforcement, resource management, and security within Kubernetes clusters, they can introduce challenges such as performance overhead, unintended resource denials, complexity, and potential for outages. Proper design, testing, documentation, and error handling can mitigate many of these adverse effects.

Audit Procedure

  1. To list all the ClusterPolicy objects in your cluster, run:
       kubectl get clusterpolicy
  2. To see the YAML definition of a specific policy, run:
       kubectl get clusterpolicy <policy-name> -o yaml
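
The audit above can also be scripted. The following is a minimal sketch, not an official check: `has_cluster_policy` is a hypothetical helper that reads the output of `kubectl get clusterpolicy -o name` on standard input and succeeds only if at least one ClusterPolicy is listed.

```shell
# has_cluster_policy (hypothetical helper): reads `kubectl get clusterpolicy -o name`
# output on stdin and succeeds only if at least one ClusterPolicy is listed.
has_cluster_policy() {
  grep -q '^clusterpolicy'
}

# Usage against a live cluster:
#   kubectl get clusterpolicy -o name | has_cluster_policy \
#     && echo "at least one ClusterPolicy is defined" \
#     || echo "no ClusterPolicy found"
```

If the policy engine's custom resource definitions are not installed, `kubectl get clusterpolicy` itself fails because the resource type is unknown, and the helper reports no policy.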

Remediation Procedure

The following instructions use Kyverno, a policy engine designed specifically for Kubernetes.

  1. Install Kyverno: Kyverno Installation Guide
  2. Install Kustomize (for example, with Homebrew):
       brew install kustomize
  3. Apply all Pod Security Standards policies:
       kustomize build https://github.com/kyverno/policies/pod-security | kubectl apply -f -

These Kyverno policies are based on the Kubernetes Pod Security Standards definitions.



Ensure the DOKS Cluster Restricts the Admission of Privileged Pods

According to Kubernetes documentation, the Pod Security Standards define three different policies to broadly cover the security spectrum. These policies are cumulative and range from highly-permissive to highly-restrictive. The Privileged policy is purposely-open, and entirely unrestricted. It is typically aimed at system- and infrastructure-level workloads managed by privileged, trusted users.

The Privileged policy is defined by an absence of restrictions. If you define a Pod where the Privileged security policy applies, the Pod you define is able to bypass typical container isolation mechanisms. For example, you can define a Pod that has access to the node’s host network.

Rationale

Restricting the admission of privileged pods to DOKS clusters is important for several reasons:

  • Containers are meant to be isolated from the host system, but privileged containers can bypass this isolation, making it possible for attackers to escape the container sandbox and execute commands on the host.
  • Privileged pods can perform sensitive operations like loading kernel modules or changing networking settings. This opens up more potential vulnerabilities that attackers can exploit.
  • Allowing privileged pods increases the risk of developers or administrators accidentally creating privileged containers, leading to unintentional security vulnerabilities. Misconfigurations in production environments can lead to catastrophic breaches.

Impact

Restricting the admission of privileged pods in DOKS clusters can break workloads that legitimately need elevated permissions, such as some system- and infrastructure-level workloads.

Audit Procedure

Ensure the DOKS cluster is restricting privileged containers by running:

    kubectl get clusterpolicy

This lists all cluster policies. The output should include a row like the following:

NAME                             ADMISSION   BACKGROUND   VALIDATE ACTION   READY   AGE   MESSAGE
disallow-privileged-containers   true        true         Audit             True    60m   Ready
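
To check the row without reading the table by eye, the output can be filtered. This is a minimal sketch under one assumption: the column layout is the one shown above, where the action is the fourth whitespace-separated field. `policy_action` is a hypothetical helper name, and other Kyverno versions may print different columns.

```shell
# policy_action NAME (hypothetical helper): reads `kubectl get clusterpolicy`
# table output on stdin and prints the VALIDATE ACTION value for the named policy.
# Assumes the column layout shown in this guide (action is the 4th field).
# Exits non-zero if the policy is not listed.
policy_action() {
  awk -v name="$1" '$1 == name { print $4; found = 1 } END { exit !found }'
}

# Usage against a live cluster:
#   kubectl get clusterpolicy | policy_action disallow-privileged-containers
```

In Kyverno, a policy in Audit mode reports violations but still admits the resource, whereas a policy in Enforce mode rejects it. Knowing the action therefore tells you whether privileged pods are only flagged or blocked.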

Remediation Procedure

The following instructions use Kyverno, a policy engine designed specifically for Kubernetes.

  1. Install Kyverno: Kyverno Installation Guide
  2. Install Kustomize (for example, with Homebrew):
       brew install kustomize
  3. Apply all Pod Security Standards policies:
       kustomize build https://github.com/kyverno/policies/pod-security | kubectl apply -f -
  4. Follow the Audit Procedure to ensure disallow-privileged-containers is enabled.
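
One way to spot-check the policy after remediation is to submit a deliberately privileged Pod as a server-side dry run, which sends the request through admission without persisting anything. The sketch below is illustrative only: the helper name, pod name, and image are hypothetical placeholders, and it assumes the policy engine's webhooks evaluate dry-run requests.

```shell
# privileged_pod_manifest (hypothetical helper): prints a throwaway Pod manifest
# that requests privileged mode. The pod name and image are placeholders.
privileged_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: privileged-test
spec:
  containers:
    - name: test
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:
        privileged: true
EOF
}

# Usage against a live cluster (nothing is created by a dry run):
#   privileged_pod_manifest | kubectl apply --dry-run=server -f -
```

With the policy in Audit mode, the dry run is expected to be admitted and the violation reported. Only a policy in Enforce mode would reject the request.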



Ensure the DOKS Cluster Does Not Allow Containers to Start with the allowPrivilegeEscalation Flag Set

A container running with the `allowPrivilegeEscalation` flag set to `true` may have processes that can gain more privileges than their parent.

There should be at least one admission control policy defined that does not permit containers to allow privilege escalation. The `allowPrivilegeEscalation` option exists (and defaults to `true`) so that set-user-id binaries can run.

If you need to run containers which use set-user-id binaries or require privilege escalation, this should be defined in a separate policy and you should carefully check to ensure that only limited service accounts and users are given permission to use that policy.
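
For illustration, a container that opts out of privilege escalation sets the flag explicitly in its security context. The sketch below prints such a manifest. The helper name, pod name, and image are hypothetical, and the example shows only this one field: it is not claimed to satisfy every other Pod Security Standards policy.

```shell
# no_escalation_pod_manifest (hypothetical helper): prints a Pod manifest whose
# only container explicitly disables privilege escalation.
# The pod name and image are placeholders.
no_escalation_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: no-escalation-example
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:
        allowPrivilegeEscalation: false
EOF
}

# Usage against a live cluster:
#   no_escalation_pod_manifest | kubectl apply --dry-run=server -f -
```

Because the option defaults to `true`, omitting the field is not equivalent to disabling it. Each container needs the explicit `false`.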

Rationale

Restricting privilege escalation is important for many reasons, including:

  • Protecting the Host System: When containers can escalate privileges, they can potentially gain root access on the host system. This level of access could allow an attacker to modify system-level configurations or install malicious software on the host, among other things.
  • Minimizing the Attack Surface: Containers should follow the principle of least privilege. By limiting privilege escalation, you reduce the potential attack surface that a malicious actor or compromised container can exploit.

Impact

Disabling privilege escalation may limit the ability of applications to perform certain functions that require elevated privileges.

Audit Procedure

Ensure the DOKS cluster is disallowing privilege escalation by running:

    kubectl get clusterpolicy

This lists all cluster policies. The output should include a row like the following:

NAME                            ADMISSION   BACKGROUND   VALIDATE ACTION   READY   AGE   MESSAGE
disallow-privilege-escalation   true        true         Audit             True    60m   Ready

Remediation Procedure

The following instructions use Kyverno, a policy engine designed specifically for Kubernetes.

  1. Install Kyverno: Kyverno Installation Guide
  2. Install Kustomize (for example, with Homebrew):
       brew install kustomize
  3. Apply all Pod Security Standards policies:
       kustomize build https://github.com/kyverno/policies/pod-security | kubectl apply -f -

Follow the Audit Procedure to ensure disallow-privilege-escalation is enabled.



Ensure DOKS Clusters are Upgraded

During an upgrade, the control plane (Kubernetes main) is replaced with a new control plane running the new version of Kubernetes. This process takes a few minutes, during which API access to the cluster is unavailable but workloads are not impacted.

Once the control plane is replaced, the worker nodes are replaced in a rolling fashion, one worker pool at a time. DOKS uses the following replacement process for the worker nodes:

  1. Identify a number of nodes to drain.
  2. Perform the following steps for each node concurrently:
    • Generate the list of pods running on it. This does not include DaemonSets or mirrored pods.
    • Record the drain start time as an annotation on the node. The eviction timeout is 15 minutes and the drain (node deletion) timeout is 30 minutes.
    • Evict as many pods concurrently as the PodDisruptionBudget (PDB) policies allow. If the process hits the eviction timeout while draining a node, it switches to deleting the pods. If it hits the drain timeout while draining a node, it switches to deleting the node.
    • Wait a bit to allow for the pod disruption budget to recover.
    • Repeat the above steps until all pods are drained.
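
Because the drain evicts only as many pods at once as PodDisruptionBudget policies allow, a workload can limit how many of its replicas are disrupted at the same time during an upgrade. The following is a minimal sketch. The helper name, budget name, label, and replica count are hypothetical and should be adapted to the workload.

```shell
# pdb_manifest (hypothetical helper): prints a PodDisruptionBudget that keeps at
# least 2 pods labeled app=web available during voluntary evictions.
# The budget name, label, and count are placeholders.
pdb_manifest() {
  cat <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
EOF
}

# Usage against a live cluster:
#   pdb_manifest | kubectl apply -f -
```

A budget that can never be satisfied does not block the upgrade indefinitely. Per the process above, the drain switches to deleting pods once the 15-minute eviction timeout is reached.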

As nodes are upgraded, workloads may experience downtime if there is no additional capacity to host a node’s workload during its replacement. If you enable surge upgrades, up to 10 new nodes for a given node pool are created up front, before the existing nodes of that pool start draining. Because nodes drain concurrently, one node stalling the drain process doesn’t stop the others from proceeding. However, because pools are upgraded one at a time, DOKS doesn’t move to the next node pool until the current one finishes.

When you enable surge upgrades, Kubernetes reschedules each worker node’s workload, then replaces the node with a new node running the new version and reattaches any DigitalOcean Volumes Block Storage to the new nodes. The new worker nodes have new IP addresses.

Rationale

Upgrading a DOKS cluster is important for several reasons, including:

  • Security Fixes: Kubernetes and its associated components often release updates that address security vulnerabilities. Running an outdated cluster can leave your environment exposed to known security threats, such as privilege escalation vulnerabilities or container escapes.
  • API Lifecycle: Older API versions and objects are often deprecated in favor of newer, more efficient ones. Running an outdated cluster means you may be relying on deprecated APIs, which can lead to compatibility issues when these APIs are eventually removed in future versions.
  • Compliance: Upgrading the cluster helps maintain compliance with security standards and best practices, especially in environments that require adherence to regulations like GDPR, HIPAA, or PCI-DSS.

Impact

Upgrades may cause downtime. We recommend enabling surge upgrades on existing clusters. Any data stored on the local disks of the worker nodes is lost in the upgrade process. We recommend using persistent volumes for data storage, and not relying on local disk for anything other than temporary data.

Audit Procedure

Visit the Overview tab of the cluster in the control panel. You will see a View Available Upgrade button if there is a new version available for your cluster.

Remediation Procedure

Review the How to Upgrade DOKS Clusters to Newer Versions documentation for on-demand and automated upgrades.
