PostgreSQL on Kubernetes (with Istio)
- 4 minutes read - 807 words

Introduction
Although there is debate about whether to run stateful workloads such as PostgreSQL databases on the container orchestrator Kubernetes, it's definitely possible to do so and for it to work well, especially on cloud providers that offer block storage that can be reattached to different nodes, such as AWS EBS or Google Persistent Disk. When running Kubernetes on-premises, storage becomes a new concern that we have to deal with ourselves. A solution such as OpenEBS can work well in place of cloud storage. Finally, we can use Istio to provide routing, additional security, and observability for our cluster services.
OpenEBS
OpenEBS can be easily installed using its Helm chart. The OpenEBS documentation recommends the newer cStor storage engine, which offers high-availability options and snapshotting.
Once OpenEBS is installed, you can list available block devices using
kubectl get blockdevice. You can combine these block devices into an
OpenEBS
StoragePoolClaim.
For high availability, you should have a block device on each node.
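As a sketch, a StoragePoolClaim combining one block device per node might look like the following. The block device names are placeholders for the names reported by kubectl get blockdevice, and the exact spec fields vary between OpenEBS releases:

```yaml
apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: openebs-cstor-storage-pool-claim
spec:
  name: openebs-cstor-storage-pool-claim
  type: disk
  poolSpec:
    poolType: striped
  blockDevices:
    blockDeviceList:
      # placeholder names; use the ones from `kubectl get blockdevice`
      - blockdevice-node1-example
      - blockdevice-node2-example
      - blockdevice-node3-example
```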
You can then reference the StoragePoolClaim in a new StorageClass,
making it a high-availability storage class by setting a ReplicaCount of at least 3.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    cas.openebs.io/config: |
      - name: StoragePoolClaim
        value: openebs-cstor-storage-pool-claim
      - name: ReplicaCount
        value: '3'
    openebs.io/cas-type: cstor
  name: openebs-cstor-ha
provisioner: openebs.io/provisioner-iscsi
reclaimPolicy: Retain
volumeBindingMode: Immediate
Deploying PostgreSQL
Although you might consider deploying PostgreSQL using a
StatefulSet,
we’ll instead use a
Deployment
because we want the container to be able to move between nodes. We must
create a PersistentVolumeClaim referencing our StorageClass, reference
that claim from the Pod definition as a volume, and mount the volume at
the PostgreSQL data directory of the container, /var/lib/postgresql/data.
Since we used the high-availability StorageClass, the data will be
replicated across at least 3 nodes, and the container can run on any of them.
We’ll also want a Service which will allow other containers
to communicate with the PostgreSQL container.
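Put together, the pieces above might look like the following sketch. The names (my-postgres, my-postgres-data), the storage size, and the Secret supplying POSTGRES_PASSWORD are assumptions for illustration:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-postgres-data
spec:
  storageClassName: openebs-cstor-ha   # the StorageClass defined earlier
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-postgres
spec:
  replicas: 1
  strategy:
    type: Recreate   # never run two Postgres Pods against the same volume
  selector:
    matchLabels:
      app: my-postgres
  template:
    metadata:
      labels:
        app: my-postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11.2
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: my-postgres-secret   # hypothetical Secret
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-postgres-data
---
apiVersion: v1
kind: Service
metadata:
  name: my-postgres
spec:
  selector:
    app: my-postgres
  ports:
    - port: 5432
      targetPort: 5432
```

The Recreate strategy matters here: the default rolling update would briefly run two Pods against the same data volume during an update.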
Istio Routing
Assuming we have Istio installed and use it for external routing,
we must create some Istio objects to configure the routing, and make
sure that the PostgreSQL Pod has an istio-proxy sidecar container
(injected automatically, with traffic capture handled by something
like istio-cni).
We can install Istio using its Helm chart. To route
PostgreSQL traffic, we create a VirtualService and Gateway.
Since we are routing TCP traffic, we use a wildcard * for the
hosts matchers. We'll also need to add another port to the
service/istio-ingressgateway in the istio-system namespace.
This port will be used exclusively for this instance of PostgreSQL,
and it'll need to be opened in the firewall.
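A sketch of the Gateway and VirtualService, assuming the networking.istio.io/v1alpha3 API and using 5432 as the dedicated ingress port (both names and the port number are placeholders; the same port must also be added to the istio-ingressgateway Service):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-postgres-gateway
spec:
  selector:
    istio: ingressgateway   # bind to the default ingress gateway
  servers:
    - port:
        number: 5432        # must match the port added to the gateway Service
        name: tcp-postgres
        protocol: TCP
      hosts:
        - "*"               # TCP routing cannot match on hostname
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-postgres
spec:
  hosts:
    - "*"
  gateways:
    - my-postgres-gateway
  tcp:
    - match:
        - port: 5432
      route:
        - destination:
            host: my-postgres   # the Service created earlier
            port:
              number: 5432
```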
Complications
Although you'll be able to query Postgres now, there's a problem.
If you connect as the postgres user using a command like:
psql --user postgres -h my-postgres.default.svc.cluster.local
# or
psql --user postgres -h external-address -p <port>
you'll be able to connect without a password. That's because, by default,
the postgres container image uses trust authentication for local connections,
and the istio-proxy container shares the Pod's network namespace with the
postgres container, so connections to postgres through the proxy appear
to come from localhost.
To fix this, we need to modify the pg_hba.conf file in the container.
Although a common way to do this is to mount a file from a ConfigMap,
that's not the best approach with the postgres container if we want it
to automatically initialize from an empty data directory. Instead, we'll
rely on the Initialization Scripts feature of the container, which runs
any *.sh scripts present in /docker-entrypoint-initdb.d. We can either:
- Build a new container image that incorporates a script to change the authentication settings
- Mount the script into the container by mounting a ConfigMap as a volume
Rather than permanently altering the Pod configuration for a change
that only takes effect when the database is initialized, we'll modify
the container image.
First, we need a script that changes the authentication settings. The default
pg_hba.conf looks like:
# TYPE  DATABASE        USER            ADDRESS                 METHOD
# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
# ...
We can just change all instances of trust to md5, which means that
a password will be required. The disable_trust.sh script can
use a simple sed command (but keep in mind this also erroneously changes comments):
sed -i 's/trust/md5/' /var/lib/postgresql/data/pg_hba.conf
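If we want to avoid touching comments, a slightly stricter variant can restrict the substitution to non-comment lines. A sketch, demonstrated here on a temporary copy rather than the real /var/lib/postgresql/data/pg_hba.conf:

```shell
# Only rewrite "trust" on lines that are not comments, so comment text
# mentioning trust authentication is left untouched.
tmp=$(mktemp)
printf '%s\n' \
  '# "local" is for Unix domain socket connections only' \
  'local all all trust' \
  'host all all 127.0.0.1/32 trust' > "$tmp"
sed -i '/^[[:space:]]*#/! s/trust/md5/' "$tmp"
cat "$tmp"
```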
Then we rebuild the container with this new script added. Here’s the
Dockerfile:
FROM postgres:11.2
COPY disable_trust.sh /docker-entrypoint-initdb.d/disable_trust.sh
RUN chmod u+x /docker-entrypoint-initdb.d/disable_trust.sh
Once we have built the new image, we push it to some image repository,
and use the new image in our Deployment.
Conclusion
The process of deploying PostgreSQL on Kubernetes is made more
difficult by its persistence requirements. OpenEBS can meet those
needs for on-premises clusters. If you're using Istio for routing,
the istio-proxy sidecar introduces a security issue, because local
connections are trusted by default. This can be addressed by adding
an initialization script to the PostgreSQL container.