Using bcache for performance gains on the Launchpad Database Servers

Tom Haddon

on 10 December 2015

landscapeScreenshot_600px

Launchpad is the code hosting, bug tracking and build system for the Ubuntu distribution itself, and is used by many other software projects, including OpenStack, Inkscape & MySQL.

There are two main data stores in Launchpad. The first is the Librarian, which is a 22TB object store using OpenStack’s Swift as its backend. The second is an 800GB+ PostgreSQL database, and the performance of this is key to the overall responsiveness of Launchpad itself.

Maintaining performance over time as the data set grows for this database has typically been a case of periodically refreshing the hardware to take advantage of the upgrade in RAM and CPU brought by a few years of Moore’s Law.

However when we came to refresh hardware last time around we realised we needed to take a slightly different approach. After looking at a number of options, we decided to use bcache. Bcache is a Linux kernel block layer cache, allowing one or more SSDs to act as a cache for slower hard disk drives.

Our new PostgreSQL database server is composed of 6 disks plus 1 SSD, in comparison to the previous generation of server which had 4 disks in a RAID 10 array for the root partition, plus 25 more disks in a RAID 10 array for /srv where all the PostgreSQL data was stored.

Overview of specific implementation

Our particular workload here is PostgreSQL which is mostly composed of 1GB files on disk, and requires mostly random access reads. As a result, we’ve tuned the server in question by setting /sys/block/bcache0/bcache/sequential_cutoff, congested_read_threshold_us and congested_write_threshold_us for the bcache filesystem to 0.

After some unpromising early experiences with Linux 3.13, the version shipped with Ubuntu 14.04 LTS, research revealed that Linux 3.16 was the minimum viable version for this use case, and so we switched to the Canonical Kernel Team’s HWE kernel.

This has been deployed using a combination of MAAS, Juju, OpenStack and Mojo. The Juju state server was provisioned into our production OpenStack instance to match our other production environments and to give us the flexibility to add other services to this environment using Juju. Given the tight performance constraints of the database servers, we wanted to run PostgreSQL on bare metal, so we used MAAS to install the machines, and then used manual provisioning to add them to the Juju environment.

The deployment itself was done using Mojo, which gives us a predictable and repeatable deployment methodology and also allows us to run daily tests in our CI environment to confirm that if for any reason we needed to redeploy the service from scratch we’d be able to do so.

Some performance numbers

In terms of real-world performance, this has led to a drop in server errors (primarily timeouts) on the launchpad application servers of just over three times, which is a significant improvement, even taking into account faster CPUs and more RAM, especially when considering the significantly smaller number of disks used.

This isn’t the only place we’ve used bache at Canonical. On archive.ubuntu.com we’ve been able to sustain 9Gb/s+ and 20k requests per second from a single machine with 10Gb/s NIC, and for our OpenStack deployments, adding bcache to our nova-compute hosts has greatly increased the number of instances they can run while maintaining performance.

Future Deployments

As of version 1.9 MAAS will have the ability to configure any storage layout during node deployment, including bcache, RAID and LVM. This means future deployments of bcache will as simple as setting some configuration options to MAAS.

Want to know more about deploying bare metal infrastructures using MAAS? Head over to Maas.io to get started with the latest version of MAAS.

Ubuntu cloud

Ubuntu offers all the training, software infrastructure, tools, services and support you need for your public and private clouds.

Newsletter signup

Select topics you’re
interested in

In submitting this form, I confirm that I have read and agree to Canonical’s Privacy Notice and Privacy Policy.

Related posts

Announcing the new IBM LinuxONE III with Ubuntu

This is a guest blog by Kara Todd, Director, Linux, IBM Z and LinuxONE Enterprises today need the most secure, and flexible system to support their...

Hardware discovery and kernel auto-configuration in MAAS

In this blog, we are going to explore how to leverage MAAS for hardware discovery and kernel auto-configuration using tags. In many cases, certain pieces of...

Multi-tenancy in MAAS

In this blog post, we are going to introduce the concept of multi-tenancy in MAAS. This allows operators to have different groups of users own a group of...