Cloud storage pricing – how to optimise TCO

Philip Williams

on 19 January 2023

This article is more than 2 years old.

The flexibility of public cloud infrastructure allows for little to no upfront expense, and is great when starting a venture or testing an idea. But once a dataset grows and becomes predictable, storing it can become a significant base cost, compounded by additional charges that depend on how you consume that data.

Public clouds were initially popularised under the premise that workloads are dynamic, and that you could easily match available compute resources to the peaks and troughs in your consumption, rather than having to maintain mostly idle buffer capacity to meet peak user demands. In essence, this shifts sunk capital into variable operational expense.

However, it has become more apparent that this isn’t necessarily true of public cloud storage. What is typically observed in a production environment is continual growth of all datasets. Those that are actively used for decision making or transactional processing in databases tend to age out, but need to be retained for audit and accountability purposes. Training data for AI/ML workloads grows, allowing models to become more refined and accurate over time. Content and media repositories grow daily, and grow faster still with the use of higher quality recording equipment.

How is public cloud storage priced?

Typically there are three areas where costs are incurred.

Capacity ($/GB): this is the amount of space you use for storing your data or the amount of space you allocate/provision for a block volume.

Transactions: these are the charges incurred when you interact with the dataset. In an object storage context, this can be PUT/GET/DELETE operations. In a block storage context, this can be allocated IOPS or throughput (MB/s).

Bandwidth (egress): object storage can also incur additional bandwidth charges when you access your data from outside of a cloud provider’s infrastructure, or from a VM or container in a different compute region. These charges can even apply when you have deployed your own private network links to a cloud provider!

If in the future you decide to move your data to another public cloud provider, you would incur these costs during migration too!
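The three cost areas above can be combined into a simple monthly estimate. The following is a minimal sketch rather than any provider's actual billing logic: the `ObjectStoragePrices` class, the `monthly_cost` function, and every price in `example_prices` are hypothetical placeholders, so substitute your provider's published rates.

```python
from dataclasses import dataclass


@dataclass
class ObjectStoragePrices:
    """Hypothetical unit prices; replace with your provider's published rates."""
    capacity_per_gb_month: float  # $ per GB stored, per month
    per_1000_writes: float        # $ per 1,000 PUT-style requests
    per_1000_reads: float         # $ per 1,000 GET-style requests
    egress_per_gb: float          # $ per GB transferred out of the provider


def monthly_cost(prices, stored_gb, writes, reads, egress_gb):
    """Return the three cost components for one month, plus their total."""
    capacity = stored_gb * prices.capacity_per_gb_month
    transactions = (writes / 1000) * prices.per_1000_writes \
        + (reads / 1000) * prices.per_1000_reads
    egress = egress_gb * prices.egress_per_gb
    return {
        "capacity": capacity,
        "transactions": transactions,
        "egress": egress,
        "total": capacity + transactions + egress,
    }


# Placeholder prices, chosen only to make the example run.
example_prices = ObjectStoragePrices(
    capacity_per_gb_month=0.02,
    per_1000_writes=0.005,
    per_1000_reads=0.0004,
    egress_per_gb=0.09,
)
bill = monthly_cost(example_prices, stored_gb=1_000, writes=10_000,
                    reads=100_000, egress_gb=50)
print(bill)
```

Keeping the components separate makes it easier to see which one dominates as the dataset or the access pattern changes.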

Calculating cloud storage TCO

Imagine you have a dataset that’s 5PB and you want to understand its total cost of ownership (TCO) over 5 years. First we need to make some assumptions about the dataset and how frequently it will be accessed.

Over the lifetime of the dataset we will assume that it will be written to twice, so 10PB of written data. We will also assume that it will be read 10 times, and that each object averages 10MB.
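These assumptions translate directly into a number of billable requests. As a rough sketch, assuming decimal units (1PB = 10^9 MB) and one request per object, neither of which is stated above, the counts can be worked out as follows. The `request_count` helper is illustrative only.

```python
MB_PER_PB = 1_000_000_000  # assumption: decimal units, 1 PB = 10^9 MB

dataset_pb = 5
write_passes = 2    # the dataset is written twice over its lifetime
read_passes = 10    # the dataset is read ten times over its lifetime
avg_object_mb = 10


def request_count(dataset_pb, passes, avg_object_mb):
    """Requests needed to move the whole dataset `passes` times,
    assuming exactly one request per object."""
    total_mb = dataset_pb * passes * MB_PER_PB
    return total_mb // avg_object_mb


write_requests = request_count(dataset_pb, write_passes, avg_object_mb)
read_requests = request_count(dataset_pb, read_passes, avg_object_mb)
print(f"{write_requests:,} writes, {read_requests:,} reads")
# prints "1,000,000,000 writes, 5,000,000,000 reads"
```

Larger objects uploaded in multiple parts, or listing and metadata calls, would add further requests on top of these counts.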

In a popular public cloud, object storage capacity starts at $0.023/GB per month, and as usage increases the price decreases to $0.021/GB per month. You are also charged for the transactions to store and retrieve the data. These costs sound low, but as you start to scale up and consider the multi-year cost, they can quickly rise to significant numbers.

For the 5PB example, the TCO over 5 years is over $7,000,000, and that’s before you even consider any charges for compute to interact with the data, or egress charges to access the dataset from outside of the cloud provider’s infrastructure.
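To get a feel for the capacity portion alone, here is a back-of-the-envelope sketch. It assumes decimal units (1PB = 10^6 GB), that the quoted prices are per GB per month, and that the full 5PB is stored for all 60 months. It uses the two quoted prices as lower and upper bounds rather than modelling the price tiers. Transaction charges, and any difference in unit conventions, are not included, so it is not expected to match the figure above exactly.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 10^6 GB
MONTHS = 5 * 12        # five-year horizon


def capacity_cost(dataset_pb, price_per_gb_month, months=MONTHS):
    """Capacity-only cost of holding the full dataset for `months` months."""
    return dataset_pb * GB_PER_PB * price_per_gb_month * months


low = capacity_cost(5, 0.021)   # everything billed at the lowest quoted price
high = capacity_cost(5, 0.023)  # everything billed at the starting price
print(f"capacity only: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the capacity charge alone lands at roughly $6.3 million to $6.9 million, which shows that it is the stored capacity, not the per-request fees, that drives the multi-year total.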

Balancing costs with flexibility

Is there another way to tackle these mounting storage costs, yet also retain the flexibility of deploying workloads in the cloud?

IT infrastructure is increasingly flexible, so with some planning it is possible to operate an open-source storage infrastructure based on Charmed Ceph that is fully managed by experts. It is located adjacent to a public cloud region and connected to the public cloud via private links to ensure the highest availability and reliability. Using the same assumptions around usage as before, a private storage solution can reduce your storage costs by 2-3x or more over a 3-5 year period.

Having your data stored using open-source Charmed Ceph in a neutral location, yet near to multiple public cloud providers, unlocks a new level of multi-cloud flexibility. For example, should one provider start offering a specific compute service that is not available elsewhere, you can make your data accessible to that provider without incurring the significant access or migration costs you would face when accessing one provider’s storage from another provider’s compute offering.

Additionally, you can securely expose your storage system to your users via your own internet connectivity, without incurring public cloud bandwidth fees.

Later this quarter we will publish a detailed whitepaper with a breakdown of all the costs of both of these solutions, alongside a blueprint of the hardware and software used. Make sure to sign up for our newsletter using the form on the right-hand side of this page (cloud and server category) to be notified when it is released.

Learn more

What is Ceph?

Ceph is a software-defined storage (SDS) solution designed to address the object, block, and file storage needs of both small and large data centres.

It’s an optimised and easy-to-integrate solution for companies adopting open source as the new norm for high-growth block storage, object stores and data lakes.

Learn more about Ceph ›

How to optimise your cloud storage costs

Cloud storage is amazing: it’s on demand and ready to go in a couple of clicks. But is it the most cost-effective approach for large, predictable datasets?

In our whitepaper, learn how to understand the true costs of storing data in a public cloud, and how open-source Ceph can provide a cost-effective alternative!

Access the whitepaper ›

Interested in running Ubuntu in your organisation? Talk to us today

A guide to software-defined storage for enterprises

In our whitepaper, explore how Ceph can replace proprietary storage systems in the enterprise.

Access the whitepaper ›

Performant, reliable and cost-effective cloud scaling with Ceph

Canonical Ceph simplifies the entire management lifecycle of deployment, configuration, and operation of a Ceph cluster, no matter its size or complexity. Install, monitor, and scale cloud storage with extensive interoperability.

Find out how Ceph scales clouds so cost-effectively ›
