The author selected Girls Who Code to receive a donation as part of the Write for DOnations program.
Deploying a PostgreSQL database on a Kubernetes cluster has become a popular approach for managing scalable, resilient database environments. Kubernetes’ container orchestration capabilities offer a robust framework for deploying and managing applications, including databases like PostgreSQL, in a distributed environment. By relying on Kubernetes features such as automated deployment, scaling, and self-healing, you can keep PostgreSQL running reliably in a containerized environment while using cluster resources efficiently.
This guide walks through deploying PostgreSQL on a Kubernetes cluster step by step. Whether you are a developer, DevOps engineer, or system administrator, it provides the practical steps to set up and manage PostgreSQL databases within a Kubernetes cluster.
Before you begin this tutorial, you will need the following:
A development server or local machine from which you will deploy PostgreSQL.
The kubectl command-line tool installed on your development machine. To install it, follow this guide from the official Kubernetes documentation.
A Kubernetes cluster. You can provision a DigitalOcean Kubernetes cluster by following our Kubernetes Quickstart guide.
In Kubernetes, a ConfigMap is an API object that stores configuration data as key-value pairs that pods or containers in the cluster can consume. ConfigMaps help decouple configuration details from the application code, making it easier to manage and update configuration settings without changing the application’s code.
Let’s create a ConfigMap configuration file to store PostgreSQL connection details such as hostname, database name, username, and other settings.
Add the following configuration. Define the default database name, user, and password.
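The original listing is not reproduced here, so the following is a minimal sketch reconstructed from the field descriptions in this guide. The file name `postgres-configmap.yaml`, the `app: postgres` label, and the password value are assumptions for illustration; the ConfigMap name `postgres-secret`, the database `ps_db`, and the user `ps_user` are the names used elsewhere in this tutorial.

```shell
# Write the ConfigMap manifest to a file (file name is an assumption).
cat > postgres-configmap.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-secret
  labels:
    app: postgres
data:
  POSTGRES_DB: ps_db
  POSTGRES_USER: ps_user
  # Placeholder value; replace it with your own password.
  POSTGRES_PASSWORD: change-me
EOF
```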
Let’s break down the above configuration:
apiVersion: v1 specifies the Kubernetes API version used for this ConfigMap.
kind: ConfigMap defines the Kubernetes resource type.
Under metadata, the name field specifies the name of the ConfigMap, set as “postgres-secret.” Additionally, labels are applied to the ConfigMap to help identify and organize resources.
The data section contains the configuration data as key-value pairs.
POSTGRES_DB: Specify the default database name for PostgreSQL.
POSTGRES_USER: Specify the default username for PostgreSQL.
POSTGRES_PASSWORD: Specify the default password for the PostgreSQL user.
Storing sensitive data in a ConfigMap is not recommended due to security concerns. When handling sensitive data within Kubernetes, it’s essential to use Secrets and follow security best practices to ensure the protection and confidentiality of your data.
Save and close the file, then apply the ConfigMap configuration to Kubernetes.
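A sketch of the apply step, assuming the manifest was saved as `postgres-configmap.yaml` (a file name chosen for illustration; substitute the name you used):

```shell
# Create or update the ConfigMap from the manifest file.
kubectl apply -f postgres-configmap.yaml
```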
You can verify the ConfigMap deployment using the following command.
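One way to do this, assuming the ConfigMap lives in your current namespace:

```shell
# List ConfigMaps in the current namespace; postgres-secret should appear.
kubectl get configmap
```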
Output.
PersistentVolume (PV) and PersistentVolumeClaim (PVC) are Kubernetes resources that provide and claim persistent storage in a cluster. A PersistentVolume provides storage resources in the cluster, while a PersistentVolumeClaim allows pods to request specific storage resources.
First, create a YAML file for the PersistentVolume.
Add the following configuration.
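The original listing is not reproduced here; the sketch below is reconstructed from the component descriptions in this section. The file name `psql-pv.yaml`, the volume name `postgres-volume`, the labels, and the `10Gi` capacity are assumptions for illustration. The storage class `manual`, the `ReadWriteMany` access mode, and the host path `/data/postgresql` come from the text.

```shell
# Write the PersistentVolume manifest (file name and volume name are assumptions).
cat > psql-pv.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-volume
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    # Assumed size; adjust to your needs.
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /data/postgresql
EOF
```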
Here is the explanation of each component:
storageClassName: manual specifies the StorageClass for this PersistentVolume. The StorageClass named “manual” indicates that provisioning of the storage is done manually.
capacity specifies the storage capacity of the PersistentVolume.
accessModes defines the access modes that the PersistentVolume supports. In this case, it is set to ReadWriteMany, allowing multiple Pods to read and write to the volume simultaneously.
hostPath is the volume type created directly on the node’s filesystem. It is a directory on the host machine’s filesystem (path: “/data/postgresql”) that will be used as the storage location for the PersistentVolume. This path refers to a location on the host where the data for the PersistentVolume will be stored.
Save the file, then apply the above configuration to Kubernetes.
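For example, assuming the PersistentVolume manifest was saved as `psql-pv.yaml` (a hypothetical file name; use your own):

```shell
# Create the PersistentVolume from the manifest file.
kubectl apply -f psql-pv.yaml
```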
Next, create a YAML file for the PersistentVolumeClaim.
Add the following configurations.
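The original listing is not reproduced here; this sketch follows the component descriptions in this section. The file name `psql-claim.yaml`, the label, and the `10Gi` request are assumptions for illustration, and the access mode is assumed to match the PersistentVolume. The claim name `postgres-volume-claim` and the storage class `manual` come from the text.

```shell
# Write the PersistentVolumeClaim manifest (file name is an assumption).
cat > psql-claim.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-volume-claim
  labels:
    app: postgres
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      # Assumed size; it must not exceed the PersistentVolume's capacity.
      storage: 10Gi
EOF
```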
Let’s break down the components:
kind: PersistentVolumeClaim indicates that this YAML defines a PersistentVolumeClaim resource.
storageClassName: manual specifies the desired StorageClass for this PersistentVolumeClaim.
accessModes specifies the access mode required by the PersistentVolumeClaim.
resources defines the requested resources for the PersistentVolumeClaim:
The requests section specifies the amount of storage requested.
Save the file, then apply the configuration to Kubernetes.
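For example, assuming the claim manifest was saved as `psql-claim.yaml` (a hypothetical file name; use your own):

```shell
# Create the PersistentVolumeClaim from the manifest file.
kubectl apply -f psql-claim.yaml
```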
Now, use the following command to list all the PersistentVolumes created in your Kubernetes cluster:
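The command is most likely the standard listing call:

```shell
# PersistentVolumes are cluster-scoped, so no namespace flag is needed.
kubectl get pv
```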
This command will display details about each PersistentVolume, including its name, capacity, access modes, status, reclaim policy, and storage class.
To list all the PersistentVolumeClaims in the cluster, use the following command:
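A sketch of that command; by default it lists claims in the current namespace, and adding `--all-namespaces` widens it to the whole cluster:

```shell
# List PersistentVolumeClaims in the current namespace.
kubectl get pvc
```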
This command will show information about the PersistentVolumeClaims, including their names, statuses, requested storage, and the PersistentVolume each claim is bound to.
Creating a PostgreSQL deployment in Kubernetes involves defining a Deployment manifest to orchestrate the PostgreSQL pods.
Create a YAML file named ps-deployment.yaml to define the PostgreSQL Deployment.
Add the following content.
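The original listing is not reproduced here; the sketch below is reconstructed from the parameter descriptions in this section. The Deployment name `postgres` is inferred from the Pod name shown later in this guide, and the mount path `/var/lib/postgresql/data` is assumed because it is the default data directory of the official `postgres` image. Every other value comes from the text.

```shell
# Write the Deployment manifest to ps-deployment.yaml.
cat > ps-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-secret
          volumeMounts:
            # Assumed mount path: the image's default data directory.
            - mountPath: /var/lib/postgresql/data
              name: postgresdata
      volumes:
        - name: postgresdata
          persistentVolumeClaim:
            claimName: postgres-volume-claim
EOF
```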
Here is a brief explanation of each parameter:
replicas: 3 specifies the desired number of replicas.
selector specifies how the Deployment identifies which Pods it manages.
template defines the Pod template used for creating new Pods controlled by this Deployment. Under metadata, the labels field assigns labels to the Pods created from this template, with app: postgres.
containers specifies the containers within the Pod.
name: postgres is the name assigned to the container.
image: postgres:14 specifies the Docker image for the PostgreSQL database.
imagePullPolicy: “IfNotPresent” tells Kubernetes to pull the container image only if it is not already present on the node.
ports specifies the ports that the container exposes.
envFrom allows the container to load environment variables from a ConfigMap.
volumeMounts allows mounting volumes into the container.
volumes defines the volumes that can be mounted into the Pod.
name: postgresdata specifies the name of the volume.
persistentVolumeClaim refers to a PersistentVolumeClaim named “postgres-volume-claim”. This claim provides persistent storage to the PostgreSQL container so that data is retained across Pod restarts or rescheduling.
Save and close the file, then apply the deployment.
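A sketch of the apply step, using the `ps-deployment.yaml` file created earlier in this section:

```shell
# Create the PostgreSQL Deployment from the manifest file.
kubectl apply -f ps-deployment.yaml
```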
This command creates the PostgreSQL Deployment based on the specifications provided in the YAML file.
To check the status of the created deployment:
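For example:

```shell
# Show Deployments in the current namespace with their ready/available counts.
kubectl get deployments
```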
The following output confirms that the PostgreSQL Deployment has been successfully created.
To check the running pods, run the following command.
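For example:

```shell
# List Pods in the current namespace; the PostgreSQL Pods should reach Running.
kubectl get pods
```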
You will see the running pods in the following output.
In Kubernetes, a Service defines a logical set of Pods and a stable way to reach them, so other Pods within the cluster can communicate with that set without needing to know the individual Pod IP addresses.
Let’s create a service manifest file to expose PostgreSQL internally within the Kubernetes cluster:
Add the following configuration.
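The original listing is not reproduced here; the following is a minimal sketch consistent with this section. The file name `ps-service.yaml` and the label are assumptions for illustration. The Service name `postgres`, port `5432`, and the `app: postgres` selector follow the text. No `type` is set, so Kubernetes uses its default, ClusterIP, which exposes the Service only inside the cluster.

```shell
# Write the Service manifest (file name is an assumption).
cat > ps-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  ports:
    - port: 5432
  selector:
    app: postgres
EOF
```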
Save the file, then apply this YAML configuration to Kubernetes.
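For example, assuming the Service manifest was saved as `ps-service.yaml` (a hypothetical file name; use your own):

```shell
# Create the Service from the manifest file.
kubectl apply -f ps-service.yaml
```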
Once the service is created, other applications or services within the Kubernetes cluster can communicate with the PostgreSQL database using the Service name postgres and port 5432 as the entry point.
You can verify the service deployment using the following command.
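For example:

```shell
# List Services in the current namespace, including their cluster IPs and ports.
kubectl get svc
```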
Output.
First, list the available Pods in your namespace to find the PostgreSQL Pod:
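For example:

```shell
# List Pods in the current namespace and note the name of a PostgreSQL Pod.
kubectl get pods
```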
You will see the running pods in the following output.
Locate the name of the PostgreSQL Pod from the output.
Once you have identified the PostgreSQL Pod, use the kubectl exec command to connect to the PostgreSQL Pod.
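A sketch of the command, built from the parameters explained in this step. The `-h localhost` and `-p 5432` options are assumptions that make the target host and port explicit; replace the Pod name with the one from your own cluster.

```shell
# Open an interactive psql session inside the PostgreSQL Pod.
kubectl exec -it postgres-665b7554dc-cddgq -- psql -h localhost -U ps_user --password -p 5432 ps_db
```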
postgres-665b7554dc-cddgq: This is the pod’s name where the PostgreSQL container is running.
ps_user: Specifies the username that will be used to connect to the PostgreSQL database.
--password: Prompts for the password interactively.
ps_db: Specifies the database name to connect to once authenticated with the provided user.
You will be asked to provide the password for the PostgreSQL user. After successful authentication, you will enter the PostgreSQL shell.
Next, verify the PostgreSQL connection using the following command.
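Inside the psql session, the `\conninfo` meta-command prints the current database, user, host, and port. As a sketch, the same check can also be run non-interactively from your workstation; the Pod name here is the example used in this guide and will differ in your cluster.

```shell
# Run the \conninfo meta-command in a one-off psql call inside the Pod.
kubectl exec -it postgres-665b7554dc-cddgq -- psql -h localhost -U ps_user --password -p 5432 -c '\conninfo' ps_db
```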
You will see the following output.
You can exit from the PostgreSQL shell using the following command.
Scaling a PostgreSQL deployment in Kubernetes involves adjusting the number of replicas in the Deployment or StatefulSet that manages the PostgreSQL Pods.
First, check the current state of your PostgreSQL deployment:
Output.
To scale the PostgreSQL deployment to 5 replicas, use the kubectl scale command:
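A sketch of the scaling step, assuming the Deployment is named `postgres` (inferred from the Pod name prefix shown earlier in this guide):

```shell
# Check the current replica count, then scale the Deployment to 5 replicas.
kubectl get deployment postgres
kubectl scale --replicas=5 deployment/postgres
```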
Replace 5 with the number of replicas you want for your PostgreSQL deployment.
Next, recheck the status of your deployment to ensure that the scaling operation was successful:
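For example, listing the Pods shows how many replicas are running:

```shell
# List Pods in the current namespace after scaling.
kubectl get pods
```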
You will see that the number of pods increased to 5:
You can back up a PostgreSQL database running in a Kubernetes Pod using the kubectl exec command in conjunction with the pg_dump tool directly within the Pod.
First, list all Pods to find the name of your PostgreSQL Pod:
Next, use the kubectl exec command to run the pg_dump command inside the PostgreSQL Pod:
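A sketch of both steps, assuming the example Pod name, user `ps_user`, and database `ps_db` used earlier in this guide; substitute your own values.

```shell
# Find the Pod name, then dump the database to a local file.
kubectl get pods
# The redirect runs on your workstation, so db_backup.sql is written locally.
kubectl exec -i postgres-665b7554dc-cddgq -- pg_dump -U ps_user -d ps_db > db_backup.sql
```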
This command dumps the database and redirects the output to a file named db_backup.sql in the local directory.
To restore the database in the Kubernetes Pod, you need the SQL dump file and the psql command to run the restore.
First, use the kubectl cp command to copy the SQL dump file from your local machine into the PostgreSQL Pod:
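For example, assuming the example Pod name from this guide and `/tmp` as the destination directory inside the container (the destination path is an assumption):

```shell
# Copy the local dump file into the Pod's /tmp directory.
kubectl cp db_backup.sql postgres-665b7554dc-cddgq:/tmp/db_backup.sql
```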
Next, connect to the PostgreSQL pod using the following command.
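A sketch of that step, again using the example Pod name from this guide:

```shell
# Open an interactive shell inside the PostgreSQL container.
kubectl exec -it postgres-665b7554dc-cddgq -- /bin/bash
```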
Next, run the psql command to restore the backup from the dump file.
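For example, from the shell inside the Pod, assuming the dump file is at `/tmp/db_backup.sql` (an assumed location) and the same user and database names as earlier:

```shell
# Run inside the Pod: replay the SQL dump into the ps_db database.
psql -U ps_user -d ps_db -f /tmp/db_backup.sql
```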
This guide outlined the fundamental steps required to set up PostgreSQL successfully within a Kubernetes environment. By leveraging Kubernetes’ orchestration capabilities, organizations can efficiently manage PostgreSQL instances, dynamically scale resources, ensure high availability, and streamline maintenance operations.
If you want to learn more about Kubernetes and Helm, please check out our community page’s Kubernetes section.
I think something is missing here. The replicas will not have the same state of the db so you actually need to create an actual replication.
It looks like all the replicas will write to the same volume without sync, right? It doesn’t look like the right way to do it.
This is my first time working with Kubernetes. After following this guide step by step I get this error, and I already went through it twice to double-check that everything is fine:
I have a few questions:
**persistentVolumeClaim** refers to a PersistentVolumeClaim named “postgres-volume-claim”. This claim is likely used to provide persistent storage to...
Are you not sure?? I’m guessing this article was made with ChatGPT haha. Why would you use a Deployment here? All pods will end up using the same storage and interfere with each other. Do yourself a favour and don’t follow this “guide”.
AFAIK this is NOT the correct procedure to scale up PG databases. This will likely corrupt your data.