
PostgreSQL high availability made charmingly easy

In a previous blog, we talked about patterns to run a database in a highly available manner. In this blog, we present our open source recipe for PostgreSQL high availability.

Photo from https://unsplash.com/photos/WfZ4WCuNtlg

Benefits of a highly available PostgreSQL 

A good recipe should always start by presenting the benefits of the dish. Therefore, we will start this section with a brief introduction to the benefits of using PostgreSQL and why high availability is often a mandate.

Benefits of consuming the elephant’s services

PostgreSQL is one of the most widely used databases, as the latest StackOverflow developer survey shows. It has been successfully used for more than 35 years across a variety of workloads, from Online Transaction Processing to Online Analytical Processing, and it provides a wide range of features that make it suitable for a large number of industries, including finance and healthcare, as TimeScale's 2022 State of PostgreSQL report explains.

Now that we have an overview of PostgreSQL, we will discuss why running only one instance of PostgreSQL is generally a bad idea.

Why one is not enough  

Reports like ManageForce's or Phenom's estimate the cost of an outage at around $500,000 per hour, on average. A database outage often cascades to the applications that consume it and degrades dependent services. This makes a database outage one of the worst scenarios for any company.

Unfortunately, there is more to outages than the impact on revenue. According to IDC's 2021 Worldwide State of Data Protection and DR Survey, an outage might also lead to loss of productivity, data loss and reputation damage.
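To put those figures in perspective, here is a quick back-of-the-envelope calculation. It assumes the $500,000/hour average cited above and simply converts availability targets into expected yearly downtime and cost at risk; it is an illustration, not a forecast:

```python
HOURLY_COST = 500_000  # average outage cost per hour, per the reports cited above

def yearly_downtime_hours(availability: float) -> float:
    """Expected downtime per year, in hours, for a given availability level."""
    return (1 - availability) * 365 * 24

for availability in (0.99, 0.999, 0.9999):
    hours = yearly_downtime_hours(availability)
    print(f"{availability:.2%} availability -> "
          f"{hours:.2f} h/year of downtime, ~${hours * HOURLY_COST:,.0f} at risk")
```

Even at "three nines" (99.9%), that is close to nine hours of downtime per year, which is why the extra nines that a highly available deployment buys matter so much.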

This is why running a highly available PostgreSQL deployment is beneficial for your company. So how can you achieve it? In the next section, we will discuss the components we chose at Canonical to automate the deployment of highly available PostgreSQL clusters.

The open-source ingredients

The following table provides an overview of the components we use to provide a highly available PostgreSQL deployment:

Component         Role/Functions                                              Version (major)
PostgreSQL        Database server                                             14
PgBouncer         Connection pooling and failover                             1
Patroni           Availability and replication manager                        3
pgBackRest        Backup and restore                                          2
Charmed operator  Operator for PostgreSQL; automates a number of              latest/stable
                  management tasks like deployment and scaling
Juju              Operator framework; allows the user to run an operator      3.1
                  on top of a variety of clouds, including K8s, OpenStack,
                  AWS, Azure and GCP

Let’s cover how to install these components.

Our recipe for PostgreSQL high availability

Preparing the environment

The first step is to install Juju, our operator framework:

sudo snap install juju --channel=3.1/stable

Next, we will install LXD. LXD is a system container manager that will allow us to emulate a cloud environment on our local machine.

sudo snap install lxd --channel=5.0/stable

The next step is to bootstrap a Juju controller by running the following command and answering its prompts:

juju bootstrap
Clouds
aws
aws-china
aws-gov
azure
azure-china
equinix
google
localhost
oracle

Select a cloud [localhost]: localhost

Enter a name for the Controller [localhost-localhost]: demo-postgres

Deploying our first PostgreSQL instance

Finally, we can start deploying a single PostgreSQL instance: 

juju add-model mymodel
juju deploy postgresql --channel edge

Please note that the first time you run the above commands, it might take Juju several minutes to download the required charm (i.e. a Juju-based operator or application) and its dependencies. Subsequent runs should be faster.

To check the status of the deployment, you can use the following command:

juju status 
# You can run 'juju status --watch 1s' to continuously monitor the output

For more detail on what is happening in the background, you can run:

juju debug-log --tail --replay

After a few minutes (your mileage may vary), juju status should produce an output similar to the following:

Model    Controller     Cloud/Region         Version  SLA          Timestamp
mymodel  demo-postgres  localhost/localhost  3.1.0    unsupported  15:00:31+02:00

App         Version  Status  Scale  Charm       Channel  Rev  Exposed  Message
postgresql           active      1  postgresql  edge     281  no       

Unit           Workload  Agent  Machine  Public address  Ports  Message
postgresql/0*  active    idle   0        …           

Machine  State    Address      Inst id        Base          AZ  Message
0        started  …          juju-3d56ca-0  ubuntu@22.04      Running

And our first elephant is ready!

Scaling the number of instances

As we said earlier, running a single instance is not a good idea. So we will now explore another neat feature of our charms, on-demand scaling. Adding replicas to PostgreSQL is as simple as running the following command:

juju add-unit -n 2 postgresql

After a few minutes, running juju status should yield an output similar to the following:

Model    Controller     Cloud/Region         Version  SLA          Timestamp
mymodel  demo-postgres  localhost/localhost  3.1.0    unsupported  15:14:14+02:00

App         Version  Status  Scale  Charm       Channel  Rev  Exposed  Message
postgresql           active      3  postgresql  edge     281  no       

Unit           Workload  Agent      Machine  Public address  Ports  Message
postgresql/0*  active    executing  0        …                    
postgresql/1   active    executing  1        …                    
postgresql/2   active    executing  2        …                    

Machine  State    Address       Inst id        Base          AZ  Message
0        started  …           juju-3d56ca-0  ubuntu@22.04      Running
1        started  …           juju-3d56ca-1  ubuntu@22.04      Running
2        started  …           juju-3d56ca-2  ubuntu@22.04      Running

Deploying our database proxy

Let’s deploy the PgBouncer component next, using the following command:

juju deploy pgbouncer --channel edge

After a few minutes, running juju status should show a new application:

Model    Controller     Cloud/Region         Version  SLA          Timestamp
mymodel  demo-postgres  localhost/localhost  3.1.0    unsupported  15:16:36+02:00

App         Version  Status   Scale  Charm       Channel  Rev  Exposed  Message
pgbouncer            unknown      0  pgbouncer   edge      25  no       
postgresql           active       3  postgresql  edge     281  no       

Unit           Workload  Agent  Machine  Public address  Ports  Message
postgresql/0*  active    idle   0        …                     
postgresql/1   active    idle   1        …                     
postgresql/2   active    idle   2        …                     

Machine  State    Address       Inst id        Base          AZ  Message
0        started  …           juju-3d56ca-0  ubuntu@22.04      Running
1        started  …           juju-3d56ca-1  ubuntu@22.04      Running
2        started  …           juju-3d56ca-2  ubuntu@22.04      Running

As you might have noticed, PgBouncer's status shows as unknown. Do not worry: this is expected. PgBouncer is a subordinate charm, which means it is deployed into the same system container as the application it serves. Its units will only appear once it is related to a principal charm.

Deploying a test application

In order to emulate an application that uses PostgreSQL, we will use the Data Integrator charm:

juju deploy data-integrator --channel edge --config database-name=test-database

After a few minutes (or hours if you go snacking like me 🙂), running juju status should yield an output similar to the following:

Model    Controller     Cloud/Region         Version  SLA          Timestamp
mymodel  demo-postgres  localhost/localhost  3.1.0    unsupported  17:24:20+02:00

App              Version  Status   Scale  Charm            Channel  Rev  Exposed  Message
data-integrator           blocked      1  data-integrator  edge      10  no       Please relate the data-integrator with the desired product
pgbouncer                 unknown      0  pgbouncer        edge      25  no       
postgresql                active       3  postgresql       edge     281  no       

Unit                Workload  Agent  Machine  Public address  Ports  Message
data-integrator/0*  blocked   idle   5        …                   Please relate the data-integrator with the desired product
postgresql/0*       active    idle   0        …                    Primary
postgresql/1        active    idle   1        …                    
postgresql/2        active    idle   2        …           

Machine  State    Address       Inst id        Base          AZ  Message
0        started  …             juju-3d56ca-0  ubuntu@22.04      Running
1        started  …             juju-3d56ca-1  ubuntu@22.04      Running
2        started  …             juju-3d56ca-2  ubuntu@22.04      Running
5        started  …             juju-3d56ca-5  ubuntu@22.04      Running

Wiring everything together

Juju provides a powerful integration abstraction (also known as a relation) that allows you to establish a communication link between two workloads (e.g. PgBouncer and the PostgreSQL server).

Therefore, relating PgBouncer to PostgreSQL is as simple as running the following command:

juju relate postgresql pgbouncer

And relating the Data Integrator to PgBouncer is as simple as the previous operation:

juju relate data-integrator pgbouncer

After a few minutes, running juju status should yield an output similar to the following:

Model    Controller     Cloud/Region         Version  SLA          Timestamp
mymodel  demo-postgres  localhost/localhost  3.1.0    unsupported  17:31:43+02:00

App              Version  Status  Scale  Charm            Channel  Rev  Exposed  Message
data-integrator           active      1  data-integrator  edge      10  no       
pgbouncer                 active      1  pgbouncer        edge      25  no       
postgresql                active      3  postgresql       edge     281  no       

Unit                Workload  Agent  Machine  Public address  Ports  Message
data-integrator/0*  active    idle   5        …           
  pgbouncer/0*      active    idle            …           
postgresql/0*       active    idle   0        …                      Primary
postgresql/1        active    idle   1        …                      
postgresql/2        active    idle   2        …                      
… The rest was omitted to save some bytes (and the elephant's environment).

And voilà! The communication links are now established between:

  • PgBouncer and PostgreSQL
  • Data Integrator and PgBouncer

Now that our meal is ready, let’s taste it.

Time for tasting

Talking to the elephant

We will continue our culinary experience by communicating with PostgreSQL using the Data Integrator charm. The following command will create a user and display its credentials:

juju run data-integrator/leader get-credentials

This command will display, one time only, a username and password in the following format:

ok: "True"
postgresql:
  database: test-database
  endpoints: localhost:<port>
  password: <password>
  username: <username>
  version: "14.7"
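The fields returned by get-credentials map directly onto a standard libpq connection URI, which most PostgreSQL drivers accept. A minimal sketch of assembling one (the values below are placeholders, not real credentials):

```python
from urllib.parse import quote

def build_postgres_uri(username: str, password: str,
                       endpoints: str, database: str) -> str:
    """Assemble a libpq-style connection URI from the fields returned
    by the data-integrator's get-credentials action."""
    # `endpoints` already comes as "host:port"; percent-encode the
    # credentials in case they contain URI-reserved characters
    return (f"postgresql://{quote(username, safe='')}:"
            f"{quote(password, safe='')}@{endpoints}/{database}")

print(build_postgres_uri("myuser", "mypassword",
                         "localhost:5432", "test-database"))
# -> postgresql://myuser:mypassword@localhost:5432/test-database
```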

We can now connect to the deployed PostgreSQL using the following commands:

juju ssh postgresql/0
psql --host=<ip address of postgresql/0 shown in the juju status output> \
     --username=<previously provided username> \
     --password test-database

After typing the previously provided password, we get a prompt where we can run queries against our PostgreSQL deployment. For example, we can issue the following query:

test-database=> SELECT VERSION();

You can exit the prompt by typing \q and you can exit the system container by typing exit.

Messing, kindly, with the elephant

High availability is not only about deploying several replicas of PostgreSQL. It is also about providing automatic failover when a problem affects one of the replicas. Let's check together what our charmed operator will do for us when we simulate a failure in one of the PostgreSQL units.

From now on, we recommend executing the upcoming instructions in a new terminal or tab so that you can follow the behaviour of our charmed operator.

We will simulate a first issue in the PostgreSQL primary by doing the following:

# Connect to the system container where the primary elephant is hosted
juju ssh postgresql/0
# Display all running processes
ps -edf
# Terminate all PostgreSQL-related processes
sudo pkill postgres
ps -edf

Patroni should restart all the PostgreSQL processes shortly after they are killed. As a consumer of PostgreSQL, you will not notice any issue.
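On the client side, a short retry loop is usually enough to ride out such a restart transparently. A minimal, driver-agnostic sketch (the `connect` callable and the `errors` tuple stand in for whatever connect function and exception types your PostgreSQL driver provides):

```python
import time

def connect_with_retry(connect, errors=(OSError,), attempts=5, base_delay=0.5):
    """Call `connect` until it succeeds, doubling the wait between tries.

    `connect` is any zero-argument callable returning a connection;
    `errors` is the tuple of exception types your driver raises on
    connection failure. The last error is re-raised if all attempts fail.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except errors as exc:
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise last_error
```

With a loop like this, the few seconds Patroni needs to bring the processes back simply show up as one or two retried connection attempts.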

We will now try to abruptly stop the system container where the primary PostgreSQL is running. This emulates a server crash or a sudden network isolation of the primary.

exit
juju status
# Note the IP address of the primary PostgreSQL
# Then note the Inst id corresponding to the primary (in the Machine section of the juju status output)

lxc list
# The above displays all the system containers managed by LXD
# Identify the primary's system container by matching lxc's NAME column to the previously identified Inst id

lxc stop --force --timeout 0 <Inst id>
# The above abruptly stops the primary's host

lxc list

By checking juju status, you can see that an automatic failover happened and that our cluster self-healed after only a few seconds!

Running juju status should present an output similar to the following:

Model    Controller     Cloud/Region         Version  SLA          Timestamp
mymodel  demo-postgres  localhost/localhost  3.1.0    unsupported  19:07:11+02:00

App              Version  Status  Scale  Charm            Channel  Rev  Exposed  Message
data-integrator           active      1  data-integrator  edge      10  no       
pgbouncer                 active      1  pgbouncer        edge      25  no       
postgresql                active    2/3  postgresql       edge     281  no       

Unit                Workload  Agent  Machine  Public address  Ports  Message
data-integrator/0*  active    idle   5        …                     
  pgbouncer/0*      active    idle            …                      
postgresql/0        unknown   lost   0        …                      agent lost, see 'juju show-status-log postgresql/0'
postgresql/1        active    idle   1        …                     Primary
postgresql/2*       active    idle   2        …                      

Machine  State    Address       Inst id        Base          AZ  Message
0        down     …             juju-3d56ca-0  ubuntu@22.04      Running
1        started  …             juju-3d56ca-1  ubuntu@22.04      Running
2        started  …             juju-3d56ca-2  ubuntu@22.04      Running
5        started  …             juju-3d56ca-5  ubuntu@22.04      Running

Time for feedback

At Canonical, we are committed to open-source software. All of our charms are open source and their code is publicly available.

So if you like PostgreSQL as much as we do, please do not hesitate to submit feedback, propose a contribution, or contact us on Mattermost to discuss your ideas and requests.
