So, your shiny new Kubernetes cluster is up and running. After getting a few simple tests running, you realise that writing YAML files describing the same objects over and over again is a boring, thankless task. It particularly grates when it’s for common components like databases or monitoring tools. Welcome to your first day two problem – it’s now time for you to investigate the wealth of tooling available for installing and managing Kubernetes apps.
Don’t let the “apps” analogy mislead you though; there are large differences between installing software on a cluster and on your mobile phone. Unlike a phone app, Kubernetes apps typically need to be tailored to the system and use case in question, and they also need to evolve and be updated over time. Perhaps more importantly, the stakes are fundamentally different. If your phone app breaks, it’s usually a small annoyance; if your Kubernetes app breaks, it’s usually downtime and loss of revenue.
So, we still need a way to install common components on Kubernetes without repeating complex work all the time. Helm has become the de facto standard here. Helm bills itself as the package manager for Kubernetes, and it makes installing new applications as simple as helm install stable/mysql. The Helm project has a curated set of stable “charts” (Helm’s name for packages), plus more than 2,000 community charts available on Artifact Hub. It is currently the place to go for Kubernetes packages.
When Helm is no longer enough
If we dig a little deeper into Helm usage, we start seeing some problems. Helm is based on templating — effectively, it takes a bunch of values and plugs them into YAML files to produce a unified output that can be applied to a Kubernetes cluster. Running helm install will take care of both creating the unified output and applying it to the cluster.
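To make the templating idea concrete, here is a minimal sketch of what a chart looks like on disk. The chart name, values and manifest are purely illustrative, not taken from any real chart:

```shell
# Minimal sketch of Helm-style templating (chart name and values are illustrative)
mkdir -p mychart/templates

# Every chart needs some basic metadata
cat > mychart/Chart.yaml <<'EOF'
apiVersion: v2
name: mychart
version: 0.1.0
EOF

# Values the user supplies (or overrides with --set / -f)
cat > mychart/values.yaml <<'EOF'
replicaCount: 2
image: nginx:1.25
EOF

# A template: the placeholders are filled in from the values above
cat > mychart/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: app
          image: {{ .Values.image }}
EOF

# helm template demo ./mychart    # renders the YAML without applying it
# helm install demo ./mychart     # renders and applies in one step
```

Running helm install with --set replicaCount=5 would produce the same manifest with a different replica count — the chart author decides which knobs are exposed.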
Templating is a neat and simple solution to a complex problem and, in some cases, where there are minimal Day-2 operations, it’s an optimal solution. However, as soon as you need complex Day-2 operations that are reliable in production environments, Helm falls short. Because Helm doesn’t really understand Kubernetes resources, it will sometimes try to do impossible tasks, like updating immutable resources (see the Role Binding and ClusterIP examples in this blog post). Another complaint is that charts don’t always provide ways to change values required for certain use cases. But the real problem is that there are so many potential variables — think RBAC, PodSecurityPolicies, image versions and annotations, on top of application-specific config — that authors can’t expose and maintain them all while still providing a usable solution.
The most common complaints about Helm that we heard in the community are difficulties around tracking changes: a small edit to the Helm config can result in dramatically different YAML being applied to the cluster, and it can be difficult to trace a cluster change back to the edit that caused it. The uncertainty of the output has led many users to separate the steps of generating the YAML and applying it to the cluster. By running helm template we can get the generated YAML, which can then be tweaked (addressing the earlier complaint about charts not exposing certain variables) and version-controlled before being applied to the cluster. But now we’ve lost a lot of the magic of Helm — it may still be helping us by generating YAML, but we now have to understand and manage its output.
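That split workflow looks something like the following sketch. The release name reuses the earlier stable/mysql example; the commit message and editor step are illustrative, and you need helm, git and kubectl on your path:

```shell
# Generate the YAML locally instead of letting Helm apply it directly
helm template myrelease stable/mysql > mysql.yaml

# Tweak values the chart doesn't expose, then track the result in version control
"${EDITOR:-vi}" mysql.yaml
git add mysql.yaml && git commit -m "Add rendered MySQL manifests"

# Apply the reviewed, version-controlled output ourselves
kubectl apply -f mysql.yaml
```

Every cluster change is now an explicit diff in git — at the cost of owning the rendered output from here on.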
The Kustomize Alternative
Kustomize offers an alternative way to install and share Kubernetes applications. Rather than using templates, Kustomize effectively “layers” YAML resources. A Kustomize installation starts with a base set of YAML that can be overridden to provide specialised installations. The base files are never changed by the end user, who instead uses separate files to describe patches and additions. If the creator of the base files updates them, those updates can be pulled in without losing user changes — which will still be layered over the top. The outcome is a declarative solution that can be fully modified by end users, and whose primary output is plain Kubernetes manifests.
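As a sketch of what that layering looks like in practice (the deployment, directory names and patch here are illustrative):

```shell
# Base manifests: never edited by the end user
mkdir -p base overlays/production

cat > base/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
EOF

cat > base/kustomization.yaml <<'EOF'
resources:
  - deployment.yaml
EOF

# The overlay references the base and patches it; upstream updates to the
# base can be pulled in later without touching this file
cat > overlays/production/kustomization.yaml <<'EOF'
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
      name: app
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 3
EOF

# kubectl apply -k overlays/production   # renders base + patch and applies it
```

The user's intent (run three replicas in production) lives in a small, reviewable patch rather than in a full copy of the manifests.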
As Kustomize is built into kubectl — the Kubernetes client binary — it’s as close as we have to an “official” solution. The advantages and disadvantages of Kustomize contrast starkly with Helm. Kustomize is designed to allow arbitrary levels of specialisation, and its changes can be easily tracked and change-controlled.
But unlike Helm, it requires users to have an in-depth understanding of Kubernetes primitives before it can be used effectively. The two approaches can be combined — the output of the helm template command can be used as the input to Kustomize to create specialised templates. In some ways, though, this is the worst of both worlds — we now need a deep understanding of Helm, Kustomize and Kubernetes, and of how they combine, to effectively manage and maintain our system.
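The combination is mechanically simple — a sketch, again reusing the illustrative stable/mysql chart and assuming helm is installed:

```shell
# Render the chart once, then treat its output as a Kustomize base
mkdir -p base
helm template myrelease stable/mysql > base/all.yaml

cat > base/kustomization.yaml <<'EOF'
resources:
  - all.yaml
EOF

# Overlays can now patch the rendered chart output like any other base:
# kubectl apply -k overlays/production
```

Simple to wire up, but every chart upgrade now means re-rendering the base and re-checking that all the overlay patches still apply cleanly.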
Furthermore, neither solution was designed for ongoing management of applications in production environments. Both Kustomize and Helm support updating running applications, but they have limited support for maintenance operations — such as backing up databases, handling disaster recovery, or swapping out TLS certificates.
To handle this, engineers started looking into Operators, which are designed to automate such tasks — Operators effectively embody the specialist knowledge used by system administrators. But it turns out that Operators don’t necessarily play well with Helm. The Banzai Cloud blog on Helm 3 explains that upgrading or deleting CRDs isn’t possible with Helm, making management difficult and often requiring deleting and reinstalling the chart.
Even if that is fixed, an underlying tension remains between the two solutions: exactly where should the line between the responsibilities of Helm and Operators be drawn? Currently it’s common to see Operators deploy resources and Helm hooks start database migrations. The situation is blurred enough that Googling “Helm vs. Operators” returns pages of discussion, indicating a lot of confused users.
Charmed Operator Lifecycle Manager – A Different Approach?
So what could we do differently? One solution that brings some new (old) ideas to the table is the open source Charmed Operator Lifecycle Manager (OLM), which includes a Charmed Operator SDK. Rather than seeing installation and maintenance as tasks for separate tools, it handles both. So we match the simplicity of helm install with juju deploy, for example:
$ juju deploy -n3 cs:~postgresql-charmers/postgresql-k8s
This command will deploy the PostgreSQL charm (the OLM’s term for a packaged application) to the cluster. The -n3 flag deploys three units in a highly available (HA) configuration. And we can also use the Charmed OLM for operations tasks normally associated with an operator. For example, adding an extra unit to this cluster:
$ juju add-unit postgresql
The cluster will also support automatic failover — another task that would typically fall to an operator in Kubernetes. The Charmed operator really does cover both maintenance and installation, and hence cuts down on the ongoing administrative burden associated with Kubernetes.
We’ve also avoided the need to manage the hundreds of YAML files that Helm and Kustomize typically generate, which can result in virtually unmaintainable systems. Instead we’ve given control to the Charmed OLM, which requires much less configuration and verbosity, while still enabling users to make the required changes when needed.
The Charmed OLM also holds another major advantage over Helm. Helm charts are typically stand-alone; they will install the package and all its required dependencies. This makes sense until you find you have installed 10 charts and eight of them are running their own installation of Prometheus, rather than sharing a single, central installation. In some cases this can be corrected through configuration options in the Helm chart, but this requires bespoke editing for each chart.
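To illustrate why that bespoke editing doesn’t scale: the switch for disabling a bundled Prometheus tends to differ from chart to chart, when it exists at all. These chart names and flags are hypothetical, but the pattern is typical:

```shell
# Each chart exposes (or doesn't expose) its own switch for the bundled Prometheus
helm install app-a example/app-a --set prometheus.enabled=false
helm install app-b example/app-b --set metrics.server.builtin=false
helm install app-c example/app-c   # no switch at all: Prometheus comes along anyway
```

Pointing all three at a shared, central Prometheus means reading each chart’s values documentation separately — and hoping the option exists.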
In contrast, the Charmed OLM solves this problem by using relations — a way of joining separate charms together and taking care of cross-service dependencies. So using an existing Prometheus installation would be as simple as:
$ juju relate prometheus mycharm
Assuming the “mycharm” charm has been written to support the Prometheus charm, it will configure itself appropriately. Even better, it will be notified of changes to the Prometheus installation, so if the connection details or similar change, “mycharm” will be able to automatically reconfigure and restart itself. This also captures the “intent” at the appropriate level; combining components should be done at the top level rather than hidden in configuration details as it is in Helm.
The Charmed OLM clearly demonstrates that there are advantages to an integrated, managed approach to application installation and maintenance. The level of abstraction is different; we’re no longer thinking about generating YAML to create and update resources, instead we’re thinking on a higher level about how our applications scale and are composed together. The Charmed OLM takes care of the lower level details.
There is a lot to think about here. When installing an application outside of development, we have to think about both ease of installation and ongoing maintenance. Having both handled by the same tooling seems sensible. This implies that the tooling needs to fundamentally understand Kubernetes—a simple generation of YAML isn’t going to be enough when trying to handle sensitive operations like backing up databases or scaling an etcd cluster.
At first glance, it might look like Helm has solved package installation and management. The reality is that there are still some hard problems to solve, especially around dealing with changes and maintenance. The Charmed Operator Lifecycle Manager might not be a solution for everyone, but its different approach serves as inspiration and a guide for what the future might look like.