How to build a Ceph backed Kubernetes cluster

1. Overview

In this tutorial, you will learn how to deploy a 3-node Charmed Kubernetes cluster that uses Ceph storage. We will use Juju and MAAS to deploy our cluster.

What is Kubernetes?

Kubernetes clusters host containerised applications in a reliable and scalable way. Designed with DevOps in mind, Kubernetes makes maintenance tasks such as upgrades and security patching simple.

What is Ceph?

Ceph is a software-defined storage solution that addresses the object, block, and file storage needs of modern data centres, from high-growth block storage to object stores and data lakes. It provides scalable, enterprise-grade storage while keeping CAPEX and OPEX in line with underlying commodity disk prices.

What you’ll learn

  • How to deploy Charmed Kubernetes and Ceph with Juju
  • How to create Ceph pools for Kubernetes with Juju
  • How to create PersistentVolumeClaims that use Ceph StorageClasses

What you’ll need

  • 3 nodes with at least 2 disks and 1 network interface
  • Access to a MAAS environment set up with the 3 nodes in the ‘Ready’ state
  • A Juju controller set up to use the above MAAS cloud
  • The kubectl client installed
  • The bundle.yaml saved to a file
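
Before starting, it can help to confirm that the client tools are on your PATH. The helper below is a minimal sketch; the `preflight` function name is ours, not a Juju or MAAS command:

```shell
# Quick pre-flight check that the required clients are installed.
# preflight is a hypothetical helper, not part of any of the tools above.
preflight() {
  for cmd in juju kubectl; do
    command -v "$cmd" >/dev/null || { echo "missing: $cmd"; return 1; }
  done
  echo "all tools found"
}
```

Running `preflight` prints `all tools found` when both clients are available.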

2. Edit bundle.yaml to contain the correct OSD devices

Before deploying our bundle.yaml, we must ensure that our Ceph charm is configured to use the correct OSD devices.

    ceph-osd:
      charm: cs:ceph-osd
      num_units: 3
      options:
        osd-devices: /dev/sdb /dev/sdc
        source: distro
      bindings:
        "": oam-space
      to:
      - 1001
      - 1002
      - 1003

Notice that the osd-devices configuration above matches the Available disks and partitions section of the node in the image below:
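
If the device list needs to change after deployment, the `osd-devices` option can also be updated at run time with `juju config`. A minimal sketch (the wrapper function is ours; the device paths are the ones used in this tutorial):

```shell
# Update the OSD device list on a deployed ceph-osd application.
# set_osd_devices is a hypothetical wrapper; `juju config` is the real command.
set_osd_devices() {
  juju config ceph-osd osd-devices="$1"
}
```

Usage: `set_osd_devices '/dev/sdb /dev/sdc'`. Consult the ceph-osd charm documentation before altering this on a live cluster, since it changes which devices the charm will try to use.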

3. Deploy the bundle.yaml

Deploy the bundle with:
$ juju deploy ./bundle.yaml

A successful deployment should look similar to the following juju status output:

$ juju status
Model  Controller            Cloud/Region          Version  SLA          Timestamp
k8s    orangebox100-default  OrangeBox100/default  2.8.6    unsupported  18:22:29-08:00

App                    Version  Status  Scale  Charm                  Store       Rev  OS      Notes
ceph-mon               15.2.7   active      3  ceph-mon               jujucharms   51  ubuntu  
ceph-osd               15.2.7   active      3  ceph-osd               jujucharms  306  ubuntu  
containerd             1.3.3    active      5  containerd             jujucharms   97  ubuntu  
easyrsa                3.0.1    active      1  easyrsa                jujucharms  339  ubuntu  
etcd                   3.4.5    active      3  etcd                   jujucharms  544  ubuntu  
flannel                0.11.0   active      5  flannel                jujucharms  513  ubuntu  
kubeapi-load-balancer  1.18.0   active      1  kubeapi-load-balancer  jujucharms  753  ubuntu  exposed
kubernetes-control-plane    1.19.6   active      2  kubernetes-control-plane      jujucharms  912  ubuntu  
kubernetes-worker      1.19.6   active      3  kubernetes-worker      jujucharms  713  ubuntu  exposed

Unit                      Workload  Agent  Machine  Public address  Ports           Message
ceph-mon/0                active    idle   0/lxd/0                  Unit is ready and clustered
ceph-mon/1*               active    idle   1/lxd/0                  Unit is ready and clustered
ceph-mon/2                active    idle   2/lxd/0                  Unit is ready and clustered
ceph-osd/0*               active    idle   0                  Unit is ready (2 OSD)
ceph-osd/1                active    idle   1                  Unit is ready (2 OSD)
ceph-osd/2                active    idle   2                  Unit is ready (2 OSD)
easyrsa/0*                active    idle   0/lxd/1                  Certificate Authority connected.
etcd/0                    active    idle   0/lxd/2  2379/tcp        Healthy with 3 known peers
etcd/1*                   active    idle   1/lxd/1  2379/tcp        Healthy with 3 known peers
etcd/2                    active    idle   2/lxd/1  2379/tcp        Healthy with 3 known peers
kubeapi-load-balancer/0*  active    idle   0/lxd/3  443/tcp         Loadbalancer ready.
kubernetes-control-plane/0*      active    idle   1/lxd/2  6443/tcp        Kubernetes control-plane running.
  containerd/4            active    idle                    Container runtime available
  flannel/4               active    idle                    Flannel subnet
kubernetes-control-plane/1       active    idle   2/lxd/2  6443/tcp        Kubernetes control-plane running.
  containerd/3            active    idle                    Container runtime available
  flannel/3               active    idle                    Flannel subnet
kubernetes-worker/0*      active    idle   0  80/tcp,443/tcp  Kubernetes worker running.
  containerd/1            active    idle                    Container runtime available
  flannel/1               active    idle                    Flannel subnet
kubernetes-worker/1       active    idle   1  80/tcp,443/tcp  Kubernetes worker running.
  containerd/0*           active    idle                    Container runtime available
  flannel/0*              active    idle                    Flannel subnet
kubernetes-worker/2       active    idle   2  80/tcp,443/tcp  Kubernetes worker running.
  containerd/2            active    idle                    Container runtime available
  flannel/2               active    idle                    Flannel subnet

Machine  State    DNS             Inst id              Series  AZ       Message
0        started  node05ob100          focal   default  Deployed
0/lxd/0  started  juju-1be73e-0-lxd-0  focal   default  Container started
0/lxd/1  started  juju-1be73e-0-lxd-1  focal   default  Container started
0/lxd/2  started  juju-1be73e-0-lxd-2  focal   default  Container started
0/lxd/3  started  juju-1be73e-0-lxd-3  focal   default  Container started
1        started  node07ob100          focal   default  Deployed
1/lxd/0  started  juju-1be73e-1-lxd-0  focal   default  Container started
1/lxd/1  started  juju-1be73e-1-lxd-1  focal   default  Container started
1/lxd/2  started  juju-1be73e-1-lxd-2  focal   default  Container started
2        started  node06ob100          focal   default  Deployed
2/lxd/0  started  juju-1be73e-2-lxd-0  focal   default  Container started
2/lxd/1  started  juju-1be73e-2-lxd-1  focal   default  Container started
2/lxd/2  started  juju-1be73e-2-lxd-2  focal   default  Container started

The deployment should reach the above state in about 10 minutes, depending on hardware.
Congratulations, we have a Kubernetes cluster up and running at this point!
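
Rather than eyeballing the status table, the same check can be scripted against the short status format. A sketch (`all_units_active` is a hypothetical helper; it assumes the `workload:active` marker that `juju status --format=short` prints for each unit):

```shell
# Fail if any unit in the model reports a workload state other than "active".
# all_units_active is a hypothetical helper built on `juju status`.
all_units_active() {
  if juju status --format=short | grep -qv 'workload:active'; then
    return 1
  fi
  echo "all units active"
}
```

Run `all_units_active` periodically until it prints `all units active`.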

4. Verify that Ceph StorageClasses were created

Copy the kubeconfig file from a kubernetes-control-plane node

$ mkdir -p .kube
$ juju scp kubernetes-control-plane/0:~/config .kube/

Read Kubernetes StorageClasses

$ kubectl get sc
NAME                 RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ceph-ext4            Delete          Immediate           true                   5d12h
ceph-xfs (default)   Delete          Immediate           true                   5d12h

Great, our StorageClasses were set up as expected! Next we need to create Ceph pools to match our StorageClasses so that we can use them with our Kubernetes workloads.

5. Create Ceph pools

List Ceph pools

$ juju run-action --wait ceph-mon/leader list-pools
unit-ceph-mon-1:
  UnitId: ceph-mon/1
  id: "2"
  results:
    message: |
      1 device_health_metrics
  status: completed
  timing:
    completed: 2020-12-29 19:08:31 +0000 UTC
    enqueued: 2020-12-29 19:08:30 +0000 UTC
    started: 2020-12-29 19:08:30 +0000 UTC

Create the Ceph xfs-pool

$ juju run-action --wait ceph-mon/leader create-pool name=xfs-pool
unit-ceph-mon-1:
  UnitId: ceph-mon/1
  id: "5"
  results:
    Stderr: |
      pool 'xfs-pool' created
      set pool 2 size to 3
      set pool 2 target_size_ratio to 0.1
      enabled application 'unknown' on pool 'xfs-pool'
  status: completed
  timing:
    completed: 2020-12-29 19:42:26 +0000 UTC
    enqueued: 2020-12-29 19:42:19 +0000 UTC
    started: 2020-12-29 19:42:19 +0000 UTC

List the Ceph pools again to verify that the new pool is created:

$ juju run-action --wait ceph-mon/leader list-pools
unit-ceph-mon-1:
  UnitId: ceph-mon/1
  id: "9"
  results:
    message: |
      1 device_health_metrics
      2 xfs-pool
  status: completed
  timing:
    completed: 2020-12-29 19:50:14 +0000 UTC
    enqueued: 2020-12-29 19:50:13 +0000 UTC
    started: 2020-12-29 19:50:13 +0000 UTC

Congratulations, we have created our new Ceph pool and are now ready to use it with Kubernetes!
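
When scripting this step, the list-pools output can be grepped first to make pool creation idempotent. A sketch (`pool_exists` is a hypothetical helper name built on the real list-pools action):

```shell
# Return success if the named pool appears in the ceph-mon list-pools output.
# pool_exists is a hypothetical helper around the list-pools action shown above.
pool_exists() {
  juju run-action --wait ceph-mon/leader list-pools | grep -q "$1"
}
```

Usage: `pool_exists xfs-pool || juju run-action --wait ceph-mon/leader create-pool name=xfs-pool`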

6. Verify Ceph backed PersistentVolumeClaim functionality

Create a PersistentVolumeClaim

Use the following claim.json file to create a PersistentVolumeClaim:

$ cat claim.json
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "myvol"
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "4Gi"
      }
    },
    "storageClassName": "ceph-xfs"
  }
}
$ kubectl apply -f claim.json

Check the status of the PersistentVolumeClaim

$ kubectl get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
myvol   Bound    pvc-8987ad38-3888-4dd8-94c1-39868792c37e   4Gi        RWO            ceph-xfs       35m
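
Binding can take a few seconds after the claim is applied, so a small poll loop avoids racing ahead. A sketch (the function name and the 5-second interval are our choices; the jsonpath query is standard kubectl):

```shell
# Poll a PersistentVolumeClaim until its phase reports Bound.
# wait_for_pvc_bound is a hypothetical helper around `kubectl get pvc`.
wait_for_pvc_bound() {
  while [ "$(kubectl get pvc "$1" -o jsonpath='{.status.phase}')" != "Bound" ]; do
    echo "waiting for $1 to bind..."
    sleep 5
  done
  echo "$1 is Bound"
}
```

Usage: `wait_for_pvc_bound myvol`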

Create a ReplicationController that uses the Ceph backed PVC

Use the following pod.yaml file to create a ReplicationController:

$ cat pod.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: server
spec:
  replicas: 1
  selector:
    role: server
  template:
    metadata:
      labels:
        role: server
    spec:
      containers:
      - name: server
        image: nginx
        volumeMounts:
        - mountPath: /var/lib/www/html
          name: myvol
      volumes:
      - name: myvol
        persistentVolumeClaim:
          claimName: myvol
$ kubectl apply -f pod.yaml

Check the status of the ReplicationController pod

$ kubectl get pods
NAME                                         READY   STATUS    RESTARTS   AGE
csi-rbdplugin-hsnjl                          3/3     Running   0          6d9h
csi-rbdplugin-md8zd                          3/3     Running   0          6d9h
csi-rbdplugin-nhc6t                          3/3     Running   0          6d9h
csi-rbdplugin-provisioner-549c6b54c6-2ts2x   6/6     Running   0          6d9h
csi-rbdplugin-provisioner-549c6b54c6-8f7v9   6/6     Running   0          6d9h
csi-rbdplugin-provisioner-549c6b54c6-l59nr   6/6     Running   1          6d9h
server-48g2s                                 1/1     Running   0          39m
$ kubectl describe pod server-48g2s 
Name:         server-48g2s
Namespace:    default
Priority:     0
Node:         node06ob100/
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  myvol
    ReadOnly:   false
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bptwd
    Optional:    false
  Type    Reason                  Age   From                     Message
  ----    ------                  ----  ----                     -------
  Normal  Scheduled               41m   default-scheduler        Successfully assigned default/server-48g2s to node06ob100
  Normal  SuccessfulAttachVolume  41m   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-8987ad38-3888-4dd8-94c1-39868792c37e"
  Normal  Pulling                 41m   kubelet                  Pulling image "nginx"
  Normal  Pulled                  41m   kubelet                  Successfully pulled image "nginx" in 7.936656052s
  Normal  Created                 41m   kubelet                  Created container server
  Normal  Started                 41m   kubelet                  Started container server

Log in to the container and check that the volume is mounted

$ kubectl exec -it server-48g2s -- bash
root@server-48g2s:/# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd0       4.0G   33M  4.0G   1% /var/lib/www/html
root@server-48g2s:/# exit

Now our pod has an RBD mount!
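
To go one step further than `df -h`, you can write a file to the mount and read it back. A sketch (the helper name is ours; the pod name and mount path come from the steps above):

```shell
# Write a probe file to the mounted volume and read it back.
# verify_mount is a hypothetical helper; `kubectl exec` is the real command.
verify_mount() {
  kubectl exec "$1" -- sh -c "echo ok > $2/probe && cat $2/probe"
}
```

Usage: `verify_mount server-48g2s /var/lib/www/html` should print `ok` if the RBD volume is writable.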

7. Clean up the ReplicationController

$ kubectl delete replicationcontrollers/server
replicationcontroller "server" deleted
$ kubectl get replicationcontrollers
No resources found.

:warning: To delete the myvol PVC, the server ReplicationController must be deleted first!
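
With the ReplicationController gone, the claim itself can then be removed, which releases the Ceph-backed volume. A sketch (`delete_claim` is a hypothetical wrapper):

```shell
# Delete a PersistentVolumeClaim once nothing is using it.
# delete_claim is a hypothetical wrapper around kubectl.
delete_claim() {
  kubectl delete pvc "$1"
}
```

Usage: `delete_claim myvol`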

8. Wrap-up

Congratulations, you now have a highly available, multi-node Kubernetes cluster with Ceph-backed storage to orchestrate your containers.

ⓘ To test your understanding of this tutorial, complete the following steps by yourself:

  1. Create the ext4-pool
  2. Create a PersistentVolumeClaim that is backed by the ext4-pool
  3. Create a ReplicationController that uses the ext4-pool backed PVC