The author selected Girst Who Code to receive a donation as part of the Write for DOnations program.
Deploying a PostgreSQL database on a Kubernetes cluster has become a popular approach for managing scalable, resilient, and dynamic database environments. Kubernetes has container orchestration capabilities that offer a robust framework for deploying and managing applications, including databases like PostgreSQL, in a distributed environment. This integration provides significant scalability, resilience, and efficient resource utilization advantages. By leveraging Kubernetes features such as scalability, automated deployment, and self-healing capabilities, users can ensure the seamless operation of their PostgreSQL databases in a containerized environment.
This guide will explore the step-by-step process of deploying PostgreSQL on a Kubernetes cluster. Whether you are a developer, DevOps engineer, or system administrator looking to deploy PostgreSQL in a Kubernetes environment effectively, this guide aims to provide comprehensive insights and practical steps to successfully set up and manage PostgreSQL databases within a Kubernetes cluster.
Before you begin this tutorial, you will need the following:
A development server or local machine from which you will deploy the PostgreSQL.
The kubectl command-line tool is installed on your development machine. To install this, follow this guide from the official Kubernetes documentation.
A Kubernetes cluster. You can provision a DigitalOcean Kubernetes cluster by following our Kubernetes Quickstart guide .
In Kubernetes, a ConfigMap is an API object that stores configuration data in key-value pairs, which pods or containers can use in a cluster. ConfigMaps helps decouple configuration details from the application code, making it easier to manage and update configuration settings without changing the application’s code.
Let’s create a ConfigMap configuration file to store PostgreSQL connection details such as hostname, database name, username, and other settings.
nano postgres-configmap.yaml
Add the following configuration. Define the default database name, user, and password.
apiVersion: v1
kind: ConfigMap
metadata:
name: postgres-secret
labels:
app: postgres
data:
POSTGRES_DB: ps_db
POSTGRES_USER: ps_user
POSTGRES_PASSWORD: SecurePassword
Let’s break down the above configuration:
apiVersion: v1 specifies the Kubernetes API version used for this ConfigMap.
kind: ConfigMap defines the Kubernetes resource type.
Under metadata, the name field specifies the name of the ConfigMap, set as “postgres-secret.” Additionally, labels are applied to the ConfigMap to help identify and organize resources.
The data section contains the configuration data as key-value pairs.
POSTGRES_DB: Specify the default database name for PostgreSQL.
POSTGRES_USER: Specify the default username for PostgreSQL.
POSTGRES_PASSWORD: Specify the default password for the PostgreSQL user.
Storing sensitive data in a ConfigMap is not recommended due to security concerns. When handling sensitive data within Kubernetes, it’s essential to use Secrets and follow security best practices to ensure the protection and confidentiality of your data.
Save and close the file, then apply the ConfigMap configuration to the Kubernetes.
kubectl apply -f postgres-configmap.yaml
You can verify the ConfigMap deployment using the following command.
kubectl get configmap
Output.
NAME DATA AGE
kube-root-ca.crt 1 116s
postgres-secret 3 12s
PersistentVolume (PV) and PersistentVolumeClaim (PVC) are Kubernetes resources that provide and claim persistent storage in a cluster. A PersistentVolume provides storage resources in the cluster, while a PersistentVolumeClaim allows pods to request specific storage resources.
First, create a YAML file for PersistentVolume.
nano psql-pv.yaml
Add the following configuration.
apiVersion: v1
kind: PersistentVolume
metadata:
name: postgres-volume
labels:
type: local
app: postgres
spec:
storageClassName: manual
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
hostPath:
path: /data/postgresql
Here is the explanation of each component:
storageClassName: manual specifies the StorageClass for this PersistentVolume. The StorageClass named “manual” indicates that provisioning of the storage is done manually.
Capacity specifies the desired capacity of the PersistentVolume.
accessModes defines the access modes that the PersistentVolume supports. In this case, it is set to ReadWriteMany, allowing multiple Pods to read and write to the volume simultaneously.
hostPath is the volume type created directly on the node’s filesystem. It is a directory on the host machine’s filesystem (path: “/data/postgresql”) that will be used as the storage location for the PersistentVolume. This path refers to a location on the host where the data for the PersistentVolume will be stored.
Save the file, then apply the above configuration to the Kubernetes.
kubectl apply -f psql-pv.yaml
Next, create a YAML for PersistentVolumeClaim.
nano psql-claim.yaml
Add the following configurations.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-volume-claim
labels:
app: postgres
spec:
storageClassName: manual
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
Let’s break down the components:
kind: PersistentVolumeClaim indicates that this YAML defines a PersistentVolumeClaim resource.
storageClassName: manual specifies the desired StorageClass for this PersistentVolumeClaim.
accessModes specifies the access mode required by the PersistentVolumeClaim.
Resources define the requested resources for the PersistentVolumeClaim:
The requests section specifies the amount of storage requested.
Save the file, then apply the configuration to the Kubernetes.
kubectl apply -f psql-claim.yaml
Now, use the following command to list all the PersistentVolumes created in your Kubernetes cluster:
kubectl get pv
This command will display details about each PersistentVolume, including its name, capacity, access modes, status, reclaim policy, and storage class.
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
postgres-volume 10Gi RWX Retain Bound default/postgres-volume-claim manual 34s
To list all the PersistentVolumeClaims in the cluster, use the following command:
kubectl get pvc
This command will show information about the PersistentVolumeClaims, including their names, statuses, requested storage, bound volumes, and their corresponding PersistentVolume if they are bound.
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
postgres-volume-claim Bound postgres-volume 10Gi RWX manual 22s
Creating a PostgreSQL deployment in Kubernetes involves defining a Deployment manifest to orchestrate the PostgreSQL pods.
Create a YAML file ps-deployment.yaml
to define the PostgreSQL Deployment.
nano ps-deployment.yaml
Add the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres
spec:
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: 'postgres:14'
imagePullPolicy: IfNotPresent
ports:
- containerPort: 5432
envFrom:
- configMapRef:
name: postgres-secret
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgresdata
volumes:
- name: postgresdata
persistentVolumeClaim:
claimName: postgres-volume-claim
Here is a brief explanation of each parameter:
replicas: 3 specifies the desired number of replicas.
selector specifies how the Deployment identifies which Pods it manages.
template defines the Pod template used for creating new Pods controlled by this Deployment. Under metadata, the labels field assigns labels to the Pods created from this template, with app: postgres.
containers specify the containers within the Pod.
name: postgres is the name assigned to the container.
image: postgres:14 specifies the Docker image for the PostgreSQL database.
imagePullPolicy: “IfNotPresent” specifies the policy for pulling the container image.
ports specify the ports that the container exposes.
envFrom allows the container to load environment variables from a ConfigMap.
volumeMounts allows mounting volumes into the container.
volumes define the volumes that can be mounted into the Pod.
name: postgresdata specifies the name of the volume.
persistentVolumeClaim refers to a PersistentVolumeClaim named “postgres-volume-claim”. This claim is likely used to provide persistent storage to the PostgreSQL container so that data is retained across Pod restarts or rescheduling.
Save and close the file, then apply the deployment.
kubectl apply -f ps-deployment.yaml
This command creates the PostgreSQL Deployment based on the specifications provided in the YAML file.
To check the status of the created deployment:
kubectl get deployments
The following output confirms that the PostgreSQL Deployment has been successfully created.
NAME READY UP-TO-DATE AVAILABLE AGE
postgres 3/3 3 3 17s
To check the running pods, run the following command.
kubectl get pods
You will see the running pods in the following output.
NAME READY STATUS RESTARTS AGE
postgres-665b7554dc-cddgq 1/1 Running 0 28s
postgres-665b7554dc-kh4tr 1/1 Running 0 28s
postgres-665b7554dc-mgprp 1/1 Running 1 (11s ago) 28s
In Kubernetes, a Service is used to define a logical set of Pods that enable other Pods within the cluster to communicate with a set of Pods without needing to know the specific IP addresses of those Pods.
Let’s create a service manifest file to expose PostgreSQL internally within the Kubernetes cluster:
nano ps-service.yaml
Add the following configuration.
apiVersion: v1
kind: Service
metadata:
name: postgres
labels:
app: postgres
spec:
type: NodePort
ports:
- port: 5432
selector:
app: postgres
Save the file, then apply this YAML configuration to Kubernetes.
kubectl apply -f ps-service.yaml
Once the service is created, other applications or services within the Kubernetes cluster can communicate with the PostgreSQL database using the Postgres name and port 5432 as the entry point.
You can verify the service deployment using the following command.
kubectl get svc
Output.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6m6s
postgres NodePort 10.98.119.102 <none> 5432:30344/TCP 6s
First, list the available Pods in your namespace to find the PostgreSQL Pod:
kubectl get pods
You will see the running pods in the following output.
NAME READY STATUS RESTARTS AGE
postgres-665b7554dc-cddgq 1/1 Running 0 28s
postgres-665b7554dc-kh4tr 1/1 Running 0 28s
postgres-665b7554dc-mgprp 1/1 Running 1 (11s ago) 28s
Locate the name of the PostgreSQL Pod from the output.
Once you have identified the PostgreSQL Pod, use the kubectl exec command to connect the PostgreSQL pod.
kubectl exec -it postgres-665b7554dc-cddgq -- psql -h localhost -U ps_user --password -p 5432 ps_db
postgres-665b7554dc-cddgq: This is the pod’s name where the PostgreSQL container is running.
ps_user: Specifies the username that will be used to connect to the PostgreSQL database.
–password: Prompts for the password interactively.
ps_db: Specifies the database name to connect to once authenticated with the provided user.
You will be asked to provide a password for Postgres users. After the successful authentication, you will get into the Postgres shell.
Password:
psql (14.10 (Debian 14.10-1.pgdg120+1))
Type "help" for help.
ps_db=#
Next, verify the PostgreSQL connection using the following command.
ps_db=# \conninfo
You will see the following output.
You are connected to database "ps_db" as user "ps_user" on host "localhost" (address "::1") at port "5432".
You can exit from the PostgreSQL shell using the following command.
exit
Scaling a PostgreSQL deployment in Kubernetes involves adjusting the number of replicas in the Deployment or StatefulSet that manages the PostgreSQL Pods.
First, check the current state of your PostgreSQL deployment:
kubectl get pods -l app=postgres
Output.
postgres-665b7554dc-cddgq 1/1 Running 0 2m12s
postgres-665b7554dc-kh4tr 1/1 Running 0 2m12s
postgres-665b7554dc-mgprp 1/1 Running 1 (115s ago) 2m12s
To scale the PostgreSQL deployment to 5 replicas, use the kubectl scale command:
kubectl scale deployment --replicas=5 postgres
Replace 5 with the number of replicas you want for your PostgreSQL deployment.
Next, recheck the status of your deployment to ensure that the scaling operation was successful:
kubectl get pods -l app=postgres
You will see that the number of pods increased to 5:
NAME READY STATUS RESTARTS AGE
postgres-665b7554dc-cddgq 1/1 Running 0 3m56s
postgres-665b7554dc-ftxbl 1/1 Running 0 10s
postgres-665b7554dc-g2nh6 1/1 Running 0 10s
postgres-665b7554dc-kh4tr 1/1 Running 0 3m56s
postgres-665b7554dc-mgprp 1/1 Running 1 (3m39s ago) 3m56s
You can back up a PostgreSQL database running in a Kubernetes Pod using the kubectl exec command in conjunction with the pg_dump tool directly within the Pod.
First, List all Pods to find the name of your PostgreSQL Pod:
kubectl get pods
Next, use the kubectl exec command to run the pg_dump command inside the PostgreSQL Pod:
kubectl exec -it postgres-665b7554dc-cddgq -- pg_dump -U ps_user -d ps_db > db_backup.sql
This command dumps the database and redirects the output to a file named db_backup.sql
in the local directory.
To restore the database back to the Kubernetes pod, you will need the SQL dump file and the use of the psql command to execute the restore process.
First, use the kubectl cp command to copy the SQL dump file from your local machine into the PostgreSQL Pod:
kubectl cp db_backup.sql postgres-665b7554dc-cddgq:/tmp/db_backup.sql
Next, connect to the PostgreSQL pod using the following command.
kubectl exec -it postgres-665b7554dc-cddgq -- /bin/bash
Next, run the psql command to restore the backup from the dump file.
psql -U ps_user -d ps_db -f /tmp/db_backup.sql
This guide outlined the fundamental steps required to set up PostgreSQL successfully within a Kubernetes environment. By leveraging Kubernetes’ orchestration capabilities, organizations can efficiently manage PostgreSQL instances, dynamically scale resources, ensure high availability, and streamline maintenance operations.
If you want to learn more about Kubernetes and Helm, please check out our community page’s Kubernetes section.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!
AFAIK this is NOT the correct procedure to scale up PG databases. This will likely corrupt your data.
Why would you use a Deployment here? All pods will end up using the same storage and interfere with each other. Do yourself a favour and don’t follow this “guide”.
**persistentVolumeClaim** refers to a PersistentVolumeClaim named “postgres-volume-claim”. This claim is likely used to provide persistent storage to...
Are you not sure?? I’m guessing this article was made with ChatGPT haha.I have a few questions:
This is my firs time working with kubernetes, after following this guide step by step y get this error y alredy made this double to double chekc everything its fine:
It looks like all the replicas will write to the same volume without sync, right? It doesn’t look like the right way to do it.
I think something is missing here. The replicas will not have the same state of the db so you actually need to create an actual replication.