Rootless podman with MariaDB Galera Cluster

Posted by Karl Levik on 2023-03-17, last modified on 2022-04-16

Bitnami is a library of packages and installers now owned by VMware. This library includes a convenient Docker image for MariaDB Galera Cluster. Let's see if we can make it work with Podman.

Sea lions, meet the seals!

Environment and rootless podman

I'm running this on Fedora 37 with podman version 4.2.2. Your mileage may vary on other platforms and older versions.

As noted in my previous articles about rootless podman 1 2, a few one-off configuration changes are typically needed, specifically to set suitable values for your max_user_namespaces kernel parameter and your subuid and subgid ranges. Please refer to those articles if you need to make these changes.
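As a quick sanity check before going further, here is a minimal sketch for inspecting those settings. The helper name subid_ranges is my own invention for illustration; it only assumes the usual "name:start:count" layout of /etc/subuid and /etc/subgid.

```shell
#!/bin/sh
# Hypothetical helper (not part of podman): print the "start:count" ranges
# recorded for a user in a subuid/subgid-style file ("name:start:count").
# Returns non-zero if the user has no entry in the file.
subid_ranges() {
  user="$1"
  file="$2"
  awk -F: -v u="$user" '
    $1 == u { print $2 ":" $3; found = 1 }
    END     { exit !found }
  ' "$file"
}

# Typical use on the host:
#   sysctl user.max_user_namespaces
#   subid_ranges "$(id -un)" /etc/subuid
#   subid_ranges "$(id -un)" /etc/subgid
```

If either subid_ranges call prints nothing and fails, the ranges still need to be set up as described in the earlier articles.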

Create a network

You can just use the default podman network, but for fun let's create a dedicated network:

podman network create mdbnet

(This defaults to using the bridge network driver.)

Create and start the bootstrap node

At the time of writing, MariaDB 10.11 is the most recent GA release, and also the most recent version for which there is a Bitnami mariadb-galera image. So, let's use that.

This creates and starts a container mdb1, using the mdbnet network we just created:

podman create --name mdb1 \
  --hostname mdb1 \
  --network mdbnet \
  -p 127.0.0.1:3306:3306 \
  -e MARIADB_GALERA_CLUSTER_NAME=mdb_cluster \
  -e MARIADB_GALERA_MARIABACKUP_USER=backup \
  -e MARIADB_GALERA_MARIABACKUP_PASSWORD=mypass \
  -e MARIADB_ROOT_PASSWORD=mypass \
  -e MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes \
  -e MARIADB_USER=dbdemon \
  -e MARIADB_PASSWORD=mypass \
  -e MARIADB_DATABASE=netherworld \
  -e MARIADB_REPLICATION_USER=replication \
  -e MARIADB_REPLICATION_PASSWORD=mypass \
  docker.io/bitnami/mariadb-galera:10.11
podman start mdb1
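A joining node needs an existing cluster member to connect to, so when scripting these steps it can help to wait until mdb1 accepts connections before starting the others. The following is only a sketch: wait_until is a hypothetical helper of my own, and I have not checked whether the Bitnami image already retries on its own.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example, using the root password chosen above:
#   wait_until 30 2 podman exec mdb1 mariadb -uroot -pmypass -e 'SELECT 1'
```

The attempt count and delay in the example are arbitrary; adjust them to how long the bootstrap node takes to initialise on your machine.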

Create the other nodes

Create and start container mdb2 on the mdbnet network, referring to mdb1 in the MARIADB_GALERA_CLUSTER_ADDRESS variable (which sets wsrep_cluster_address):

podman create --name mdb2 \
  --hostname mdb2 \
  --network mdbnet \
  -p 127.0.0.1:3307:3306 \
  -e MARIADB_GALERA_CLUSTER_NAME=mdb_cluster \
  -e MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://mdb1:4567,0.0.0.0:4567 \
  -e MARIADB_GALERA_MARIABACKUP_USER=backup \
  -e MARIADB_GALERA_MARIABACKUP_PASSWORD=mypass \
  -e MARIADB_ROOT_PASSWORD=mypass \
  -e MARIADB_REPLICATION_USER=replication \
  -e MARIADB_REPLICATION_PASSWORD=mypass \
  docker.io/bitnami/mariadb-galera:10.11
podman start mdb2

Create and start container mdb3 on the mdbnet network, referring to both mdb1 and mdb2 in the MARIADB_GALERA_CLUSTER_ADDRESS variable (which sets wsrep_cluster_address):

podman create --name mdb3 \
  --hostname mdb3 \
  --network mdbnet \
  -p 127.0.0.1:3308:3306 \
  -e MARIADB_GALERA_CLUSTER_NAME=mdb_cluster \
  -e MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://mdb1:4567,mdb2:4567,0.0.0.0:4567 \
  -e MARIADB_GALERA_MARIABACKUP_USER=backup \
  -e MARIADB_GALERA_MARIABACKUP_PASSWORD=mypass \
  -e MARIADB_ROOT_PASSWORD=mypass \
  -e MARIADB_REPLICATION_USER=replication \
  -e MARIADB_REPLICATION_PASSWORD=mypass \
  docker.io/bitnami/mariadb-galera:10.11
podman start mdb3
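The cluster address follows a simple pattern: the nodes already running, each on port 4567, followed by the 0.0.0.0:4567 entry. If you want more than three nodes, a small function can build the value. This is a sketch that merely reproduces the pattern used above; galera_cluster_address is a name I made up.

```shell
#!/bin/sh
# Hypothetical helper: build a gcomm:// address from the names of the nodes
# that are already running, ending with the 0.0.0.0 entry used above.
galera_cluster_address() {
  port=4567
  addr="gcomm://"
  for peer in "$@"; do
    addr="${addr}${peer}:${port},"
  done
  printf '%s0.0.0.0:%s\n' "$addr" "$port"
}

# Example: the value to pass for a hypothetical fourth node, mdb4:
#   -e MARIADB_GALERA_CLUSTER_ADDRESS="$(galera_cluster_address mdb1 mdb2 mdb3)"
```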

Verify success

podman ps
CONTAINER ID  IMAGE                                   COMMAND               CREATED             STATUS             PORTS                     NAMES
a671313e6a9b  docker.io/bitnami/mariadb-galera:10.11  /opt/bitnami/scri...  About a minute ago  Up About a minute  127.0.0.1:3306->3306/tcp  mdb1
fd11d0857a3b  docker.io/bitnami/mariadb-galera:10.11  /opt/bitnami/scri...  27 seconds ago      Up 21 seconds      127.0.0.1:3307->3306/tcp  mdb2
6a28e869134a  docker.io/bitnami/mariadb-galera:10.11  /opt/bitnami/scri...  9 seconds ago       Up 3 seconds       127.0.0.1:3308->3306/tcp  mdb3

Log in to MariaDB using the mariadb client on mdb3:

podman exec -it mdb3 mariadb -uroot -pmypass

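... or alternatively, if you have a local mariadb client, connect over TCP to the port published for mdb3 (giving the host explicitly stops older clients from falling back to the local socket and ignoring the port):

mariadb -h127.0.0.1 -P3308 -uroot -pmypass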

Let's check the status of the cluster:

SELECT @@version;
+---------------------+
| @@version           |
+---------------------+
| 10.11.2-MariaDB-log |
+---------------------+
1 row in set (0.000 sec)

SHOW GLOBAL STATUS WHERE variable_name IN ('wsrep_cluster_status', 'wsrep_ready', 
  'wsrep_connected', 'wsrep_cluster_size', 'wsrep_local_state_comment');
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_local_state_comment | Synced  |
| wsrep_cluster_size        | 3       |
| wsrep_cluster_status      | Primary |
| wsrep_connected           | ON      |
| wsrep_ready               | ON      |
+---------------------------+---------+
5 rows in set (0.002 sec)
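For scripting, the same check can be done non-interactively. A minimal sketch: the client's -N and -B options give headerless, tab-separated output, and the hypothetical wsrep_cluster_size function below picks the value out of it.

```shell
#!/bin/sh
# Hypothetical helper: read tab-separated "name<TAB>value" status rows on
# stdin and print the value of wsrep_cluster_size. Fails if it is absent.
wsrep_cluster_size() {
  awk -F'\t' '
    $1 == "wsrep_cluster_size" { print $2; found = 1 }
    END                        { exit !found }
  '
}

# Example (no -it here, since the output is piped rather than shown on a tty):
#   podman exec mdb3 mariadb -uroot -pmypass -N -B \
#     -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'" | wsrep_cluster_size
```

With all three nodes joined, this should print the same cluster size as the table above.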

Later versions of MariaDB have the WSREP_INFO plugin, which exposes many of these status variables in new information_schema tables:

INSTALL SONAME 'wsrep_info';
Query OK, 0 rows affected (0.014 sec)

SELECT * FROM information_schema.WSREP_STATUS\G
*************************** 1. row ***************************
         NODE_INDEX: 1
        NODE_STATUS: synced
     CLUSTER_STATUS: primary
       CLUSTER_SIZE: 3
 CLUSTER_STATE_UUID: db0611fa-c12b-11ed-ad48-2a180912c29b
CLUSTER_STATE_SEQNO: 17
    CLUSTER_CONF_ID: 3
   PROTOCOL_VERSION: 4
1 row in set (0.001 sec)

SELECT * FROM information_schema.WSREP_MEMBERSHIP\G
*************************** 1. row ***************************
  INDEX: 0
   UUID: 0cc4024b-c12c-11ed-9205-d7a09c86dbf4
   NAME: fd11d0857a3b
ADDRESS: 10.89.0.10:3306
*************************** 2. row ***************************
  INDEX: 1
   UUID: 178707ba-c12c-11ed-9ca7-96e053f95790
   NAME: 6a28e869134a
ADDRESS: 10.89.0.11:3306
*************************** 3. row ***************************
  INDEX: 2
   UUID: dcf09f0b-c12b-11ed-bea8-b6d6ef34f6be
   NAME: a671313e6a9b
ADDRESS: 10.89.0.8:3306
3 rows in set (0.001 sec)

A note about shutting down and restarting

The Bitnami instructions caution us to always shut down the bootstrap node last, so as not to lose any writes that may have occurred while the nodes were being stopped. However, they don't tell us how to get the cluster running again if we fail to heed this advice for whatever reason.
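One way to make the recommended order hard to get wrong is to script it. In this sketch, stop_cluster is a hypothetical function that takes the nodes with the bootstrap node first and stops them in reverse; the PODMAN variable is my own addition so the command can be swapped out for a dry run.

```shell
#!/bin/sh
# Command used to talk to the containers; override for a dry run,
# e.g. PODMAN=echo. (This variable is an assumption of this sketch.)
PODMAN="${PODMAN:-podman}"

# Hypothetical helper: stop the given nodes in reverse order, so that the
# first argument (the bootstrap node) is stopped last.
stop_cluster() {
  reversed=""
  for node in "$@"; do
    reversed="$node $reversed"
  done
  # Node names contain no whitespace, so word splitting is safe here.
  for node in $reversed; do
    "$PODMAN" stop "$node" || return 1
  done
}

# Example: stops mdb3, then mdb2, then mdb1
#   stop_cluster mdb1 mdb2 mdb3
```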

If you try to just do podman start mdb1 to start the bootstrap node again, it will fail and podman logs mdb1 will show this error:

It is not safe to bootstrap form this node ('safe_to_bootstrap=0' is set in 'grastate.dat'). If you want to force bootstrap, set the environment variable MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes

We already created the container without MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes, and at this point there is no way that I know of to set it for an existing container. However, if we don't mind potentially losing data written to the cluster after mdb1 was shut down, we can simply modify the grastate.dat file and change safe_to_bootstrap: 0 to safe_to_bootstrap: 1:

podman cp mdb1:/bitnami/mariadb/data/grastate.dat /tmp/.
sed -i -e 's/safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /tmp/grastate.dat
podman cp /tmp/grastate.dat mdb1:/bitnami/mariadb/data/grastate.dat

Now we can start mdb1 and then the other nodes one by one, and everything should be fine.
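To judge how much data might be at stake, you can compare the seqno recorded in each node's grastate.dat before forcing the bootstrap: the node with the highest seqno has the most advanced state, and -1 means the node did not shut down cleanly. A minimal sketch, where grastate_field is a hypothetical helper that assumes the usual "key: value" layout of the file:

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field ("seqno",
# "safe_to_bootstrap", "uuid", ...) from a grastate.dat file.
# Usage: grastate_field FILE FIELD
grastate_field() {
  awk -v key="$2:" '
    $1 == key { print $2; found = 1 }
    END       { exit !found }
  ' "$1"
}

# Example, after copying the file out of each stopped container:
#   podman cp mdb2:/bitnami/mariadb/data/grastate.dat /tmp/grastate-mdb2.dat
#   grastate_field /tmp/grastate-mdb2.dat seqno
```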

Connecting to the nodes via their IP addresses

If for some reason we don't want to bind the nodes to ports on the host, which forces us to use non-standard port numbers for at least two of the nodes (3307 and 3308 in our case), there is a way to connect via their IP addresses instead.

We can get the IP addresses of the nodes e.g. with commands such as:

podman exec -u root -it mdb1 bash -c 'ip addr'
podman exec -u root -it mdb2 bash -c 'ip addr'
podman exec -u root -it mdb3 bash -c 'ip addr'
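The output of ip addr is rather verbose. If you only want the address itself, a small filter helps. A sketch: container_ipv4 is a hypothetical function that prints every non-loopback IPv4 address it finds on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read `ip addr` output on stdin and print each
# non-loopback IPv4 address, without the /prefix-length suffix.
container_ipv4() {
  awk '
    $1 == "inet" && $2 !~ /^127\./ {
      split($2, parts, "/")
      print parts[1]
    }
  '
}

# Example (no -it here, since the output is piped rather than shown on a tty):
#   podman exec -u root mdb1 ip addr | container_ipv4
```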

Then we join the rootless network namespace:

podman unshare --rootless-netns bash

From here we can then connect to the nodes, e.g.:

mariadb -h10.89.0.2 -uroot -pmypass

Prompt

If you do something like export MYSQL_PS1='\U [\d]> ' or export MYSQL_PS1='\u@\H [\d]> ' (or put the equivalent in your ~/.my.cnf client file) before connecting, your mariadb prompt will also display the host IP address, which can be helpful in some cases.

End-of-file

While I'm unsure if I would use these Bitnami images in production, they provide an easy way for me to play with MariaDB Galera Cluster on my laptop. And that is super useful in itself.

With some tweaks it should be possible to use MariaDB's own container images and achieve the same result, but that would entail quite a bit of manual work, which the Bitnami images let us avoid. (Hopefully, one day similar automation will be available in the official images from the MariaDB Foundation.)

An alternative tool for running MySQL-like database systems in sandboxes is dbdeployer, and while this does support MySQL and Percona MySQL clusters it unfortunately doesn't support MariaDB Galera Cluster yet.