
Cloud storage pricing – how to optimise TCO


The flexibility of public cloud infrastructure means little to no upfront expense, which is ideal when starting a venture or testing an idea. But once a dataset grows and its usage becomes predictable, it can become a significant base cost, compounded further by additional charges that depend on how you consume that data.

Public clouds were initially popularised under the premise that workloads are dynamic, and that you could easily match available compute resources to the peaks and troughs in your consumption, rather than having to maintain mostly idle buffer capacity to meet peak user demand – essentially shifting sunk capital expenditure into variable operational expense.

However, it has become apparent that this isn't necessarily true of public cloud storage. What is typically observed in production environments is continual growth across all datasets. Data actively used for decision making or transactional processing in databases tends to age out, but must be retained for audit and accountability purposes. Training data for AI/ML workloads grows, allowing models to become more refined and accurate over time. Content and media repositories grow daily, and all the faster as higher-quality recording equipment is adopted.

How is public cloud storage priced?

Typically there are three areas where costs are incurred.

  • Capacity ($/GB): the amount of space you use to store your data, or the amount of space you allocate/provision for a block volume.
  • Transactional charges incurred when you interact with the dataset. In an object storage context, these are PUT/GET/DELETE operations; in a block storage context, provisioned IOPS or throughput (MB/s).
  • Bandwidth (egress) charges, for object storage, when you access your data from outside the cloud provider’s infrastructure or from a VM or container in a different compute region. These charges can apply even when you have deployed your own private network links to the provider!
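To make the interplay of these three cost areas concrete, here is a minimal sketch of a monthly object storage bill. The rates used are assumptions chosen for illustration, not any provider's actual price list:

```python
def monthly_storage_cost(capacity_gb, puts, gets, egress_gb,
                         gb_rate=0.023, put_rate=0.005,
                         get_rate=0.0004, egress_rate=0.09):
    """Illustrative object storage bill combining the three cost areas.
    All rates are assumptions for this sketch, not a real price list."""
    capacity = capacity_gb * gb_rate                         # $/GB-month
    transactions = (puts / 1000) * put_rate + (gets / 1000) * get_rate
    egress = egress_gb * egress_rate                         # data leaving the cloud
    return capacity + transactions + egress

# Example: 100 TB stored, 1M PUTs and 10M GETs, 5 TB served externally
bill = monthly_storage_cost(100_000, 1_000_000, 10_000_000, 5_000)
print(f"${bill:,.2f}/month")
```

Even in this small example, capacity is the dominant line item, with transactions almost negligible by comparison – a pattern that holds at much larger scales.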

If in the future you decide to move your data to another public cloud provider, you would incur these costs during migration too!

Calculating cloud storage TCO

Imagine you have a dataset that’s 5PB and you want to understand its total cost of ownership (TCO) over 5 years.  First we need to make some assumptions about the dataset and how frequently it will be accessed.

Over the lifetime of the dataset we will assume that it will be written twice over, so 10 PB of written data. We will also assume that it will be read 10 times in full, and that each object averages 10 MB in size.

In a popular public cloud, object storage capacity starts at $0.023/GB, and as usage increases the price decreases to $0.021/GB. You are also charged for the transactions to store and retrieve the data. These rates sound low, but once you scale up and consider the multi-year cost, they quickly add up to significant numbers.

For the 5PB example, the TCO over 5 years is over $7,000,000, and that’s before you even consider any charges for compute to interact with the data, or egress charges to access the dataset from outside of the cloud provider’s infrastructure.
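As a rough sanity check on a figure of this size, the calculation can be sketched in a few lines. The tier boundaries and request rates below are assumptions modelled on typical public cloud price lists, so the result is indicative rather than exact:

```python
TB = 1_000          # GB per TB (decimal units, as providers bill)
PB = 1_000 * TB     # GB per PB

def monthly_capacity_cost(gb):
    # assumed tiers: first 50 TB at $0.023/GB, next 450 TB at $0.022/GB,
    # everything above that at $0.021/GB
    tiers = [(50 * TB, 0.023), (450 * TB, 0.022), (float("inf"), 0.021)]
    cost, remaining = 0.0, gb
    for size, rate in tiers:
        used = min(remaining, size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost

months = 5 * 12
object_mb = 10
capacity = monthly_capacity_cost(5 * PB) * months

# 10 PB written and 50 PB read as 10 MB objects
# (assumed rates: $0.005 per 1,000 PUTs, $0.0004 per 1,000 GETs)
puts = 10 * PB * 1_000 / object_mb
gets = 50 * PB * 1_000 / object_mb
requests = (puts / 1_000) * 0.005 + (gets / 1_000) * 0.0004

total = capacity + requests
print(f"5-year TCO: ${total:,.0f}")   # capacity dominates the bill
```

Capacity alone accounts for well over 99% of this sketch, and real bills are typically higher still once data growth over the period, redundancy options, and egress are factored in.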

Balancing costs with flexibility

Is there another way to tackle these mounting storage costs, yet also retain the flexibility of deploying workloads in the cloud?

IT infrastructure is increasingly flexible, so with some planning it is possible to operate an open-source storage infrastructure based on Charmed Ceph, fully managed by experts, hosted adjacent to a public cloud region and connected to the public cloud via private links for high availability and reliability. Using the same usage assumptions as before, such a private storage solution can reduce your storage costs by 2-3x or more over a 3-5 year period.

Having your data stored using open-source Charmed Ceph in a neutral location, yet near to multiple public cloud providers, unlocks a new level of multi-cloud flexibility. For example, should one provider start offering a specific compute service that is not available elsewhere, you can make your data accessible to that provider without incurring the significant access and migration costs you would face when accessing one provider’s storage from another provider’s compute offering.

Additionally, you can securely expose your storage system to your users via your own internet connectivity, without incurring public cloud bandwidth fees.

Later this quarter we will publish a detailed whitepaper with a breakdown of all the costs of both of these solutions, alongside a blueprint of the hardware and software used. Make sure to sign up for our newsletter using the form on the right-hand side of this page (cloud and server category) to be notified when it is released.

Learn more


What is Ceph?

Ceph is a software-defined storage (SDS) solution designed to address the object, block, and file storage needs of both small and large data centres.

It's an optimised and easy-to-integrate solution for companies adopting open source as the new norm for high-growth block storage, object stores and data lakes.

Learn more about Ceph ›


How to optimise your cloud storage costs

Cloud storage is amazing: it's on demand, a few clicks and you're ready to go. But is it the most cost-effective approach for large, predictable datasets?

In our whitepaper, learn how to understand the true costs of storing data in a public cloud, and how open-source Ceph can provide a cost-effective alternative!

Access the whitepaper ›



A guide to software-defined storage for enterprises

Ceph is a software-defined storage (SDS) solution designed to address the object, block, and file storage needs of both small and large data centres.

In our whitepaper, explore how Ceph can replace proprietary storage systems in the enterprise.

Access the whitepaper ›


