PostgreSQL on Kubernetes (with Istio)
Introduction
Although there is debate as to whether to run stateful workloads such as PostgreSQL databases on the container orchestrator Kubernetes, it’s definitely possible to do so and for it to work well, especially on cloud providers that offer block storage that can be reattached to different nodes, such as AWS EBS or Google Persistent Disk. When running Kubernetes on-premises, storage becomes a new concern that we have to deal with ourselves. A solution such as OpenEBS can work well in place of cloud block storage. Finally, we can use Istio to provide routing, additional security, and observability for our cluster services.
OpenEBS
OpenEBS can be easily installed using its Helm package. They recommend using the newest cStor storage engine, which offers high-availability options and snapshotting.
Once OpenEBS is installed, you can list available block devices using kubectl get blockdevice. You can combine these block devices into an OpenEBS StoragePoolClaim. You should have a block device on each node for high availability. You can then reference the StoragePoolClaim in a new StorageClass, making it a high-availability storage class by setting a ReplicaCount of at least 3.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    cas.openebs.io/config: |
      - name: StoragePoolClaim
        value: openebs-cstor-storage-pool-claim
      - name: ReplicaCount
        value: '3'
    openebs.io/cas-type: cstor
  name: openebs-cstor-ha
provisioner: openebs.io/provisioner-iscsi
reclaimPolicy: Retain
volumeBindingMode: Immediate
Deploying PostgreSQL
Although you might consider deploying PostgreSQL using a StatefulSet, we’ll instead use a Deployment because we want the container to be able to move between nodes. We must create a PersistentVolumeClaim referencing our StorageClass, have the Pod definition reference it using a volume, and mount that volume at the PostgreSQL data directory of the container, /var/lib/postgresql/data. Since we have used the high-availability StorageClass, the data will be available on at least 3 of the nodes, and the container can run on any of them. We’ll also want a Service which will allow other containers to communicate with the PostgreSQL container.
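A minimal sketch of these objects might look like the following. The names (my-postgres, my-postgres-data) and the 10Gi size are illustrative assumptions, not values from a real deployment:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-postgres-data            # hypothetical name
spec:
  storageClassName: openebs-cstor-ha  # the StorageClass defined above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-postgres
  template:
    metadata:
      labels:
        app: my-postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11.2
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-postgres-data
---
apiVersion: v1
kind: Service
metadata:
  name: my-postgres
spec:
  selector:
    app: my-postgres
  ports:
    - port: 5432
```

Because the claim uses ReadWriteOnce, only one Pod mounts the volume at a time, but the cStor replicas let that Pod be rescheduled onto another node.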
Istio Routing
Assuming we have Istio installed and use it for external routing using something like istio-cni, we must create some Istio objects to configure the routing, and make sure that the PostgreSQL Pod has an istio-proxy sidecar container. We can install Istio using its Helm chart. To route PostgreSQL traffic, we create a VirtualService and a Gateway. Since we are routing TCP traffic, we use a wildcard * for the hosts matchers. We’ll need to add another port to the service/istio-ingressgateway in the istio-system namespace. This port will be used exclusively for this instance of PostgreSQL, and it’ll need to be opened in the firewall.
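Sketched out, the two objects might look like this. The port 31400 and the my-postgres names are assumptions for illustration; the Service name must match the one created for the Deployment:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-postgres-gateway        # hypothetical name
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 31400              # assumed TCP port added to the ingress gateway
        name: tcp-postgres
        protocol: TCP
      hosts:
        - "*"                      # wildcard, since TCP routing cannot match on host
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-postgres
spec:
  hosts:
    - "*"
  gateways:
    - my-postgres-gateway
  tcp:
    - match:
        - port: 31400
      route:
        - destination:
            host: my-postgres      # the PostgreSQL Service
            port:
              number: 5432
```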
Complications
Although you’ll be able to query Postgres now, there’s a problem. If you connect to Postgres as the postgres user using a command like:
psql --user postgres -h my-postgres.svc.cluster.local
# or
psql --user postgres -h external-address -p <port>
you’ll be able to connect without a password. This is because, by default, the postgres container image uses trust authentication for local connections. The istio-proxy container shares the same network namespace as the postgres container, so connections to postgres through the proxy appear to come from localhost.
To fix this, we need to modify the pg_hba.conf file in the container. Although a common way to do this is to mount a file from a ConfigMap, that’s not the best approach with the postgres container if we want it to automatically initialize from an empty data directory. Instead, we’ll rely on the Initialization Scripts feature of the container, which will run any *.sh script present in /docker-entrypoint-initdb.d. We can either:
- Create a new container image that incorporates a script to change the authentication settings
- Mount the script into the container later by mounting a ConfigMap as a volume.
Instead of altering the Pod configuration for all time for a change that only takes effect when the database is initialized, we’ll modify the container image.
First, we need a script that changes the authentication settings. The default pg_hba.conf looks like:
# TYPE  DATABASE  USER  ADDRESS       METHOD
# "local" is for Unix domain socket connections only
local   all       all                 trust
# IPv4 local connections:
host    all       all   127.0.0.1/32  trust
# IPv6 local connections:
host    all       all   ::1/128       trust
# ...
We can just change all instances of trust to md5, which means that a password will be required. The disable_trust.sh script can use a simple sed command (but keep in mind this also erroneously changes comments):
sed -i 's/trust/md5/' /var/lib/postgresql/data/pg_hba.conf
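If rewriting the word trust inside comments bothers you, a slightly more targeted variant restricts the substitution to non-comment lines. Here it is demonstrated against a small sample file (the /tmp path and its contents are illustrative, not taken from the real image):

```shell
# Build a small sample pg_hba.conf to demonstrate the substitution.
cat > /tmp/pg_hba_sample.conf <<'EOF'
# trust is the default METHOD for local connections
local   all   all                 trust
host    all   all   127.0.0.1/32  trust
EOF

# Substitute only on lines that do not start with '#', leaving comments intact.
sed -i '/^[^#]/ s/trust/md5/' /tmp/pg_hba_sample.conf

cat /tmp/pg_hba_sample.conf
```

The address /^[^#]/ limits the s command to lines whose first character is not #, so the comment line keeps its original wording.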
Then we rebuild the container with this new script added. Here’s the Dockerfile:
FROM postgres:11.2
ADD disable_trust.sh /docker-entrypoint-initdb.d/disable_trust.sh
RUN chmod u+x /docker-entrypoint-initdb.d/disable_trust.sh
Once we have built the new image, we push it to an image repository and use the new image in our Deployment.
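For example, assuming the image was pushed as registry.example.com/my-postgres:11.2-no-trust (a hypothetical registry and tag), the container spec in the Deployment would reference it:

```yaml
# Deployment container spec fragment; the registry and tag are hypothetical.
containers:
  - name: postgres
    image: registry.example.com/my-postgres:11.2-no-trust
```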
Conclusion
The process of deploying PostgreSQL on Kubernetes is made more difficult by its persistence requirements. OpenEBS can meet these needs for on-premises clusters. If you’re using Istio for routing, the istio-proxy sidecar introduces a security issue: connections through it appear local and are trusted by default. This can be addressed by adding an initialization script to the PostgreSQL container.