What is object storage?

Philip Williams

on 10 November 2022


Object storage is a type of storage in which data is managed as distinct units called objects. It has accompanied the cloud computing revolution: S3 (Simple Storage Service) was among the first AWS services, and its API later became the de facto industry standard for the majority of object stores.

Object stores have a very simple interface and do not require you to manage complicated SCSI and HBA drivers, multipathing tools, or volume managers embedded in your operating system. Access to storage becomes an application integration: you point your application at an HTTP endpoint and use a simple set of verbs to describe what you want to do with a piece of data. Users and applications can be given access to buckets. A bucket is somewhat analogous to a folder, but there is no hierarchical behaviour: objects in a bucket live in a flat namespace.
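To make the "HTTP endpoint plus verbs" idea concrete, here is a minimal sketch that builds (but never sends) the kind of HTTP requests an S3-style object store expects. The endpoint, bucket and key names are hypothetical, and the sketch leaves out authentication entirely; a real S3-compatible service requires signed requests, which is normally handled by an SDK.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint, used for illustration only.
ENDPOINT = "https://objects.example.com"


def put_request(bucket: str, key: str, data: bytes) -> Request:
    # PUT /<bucket>/<key> with the object bytes as the request body.
    return Request(f"{ENDPOINT}/{bucket}/{quote(key)}", data=data, method="PUT")


def get_request(bucket: str, key: str) -> Request:
    # GET /<bucket>/<key> asks for the object stored under that key.
    return Request(f"{ENDPOINT}/{bucket}/{quote(key)}", method="GET")


def list_request(bucket: str) -> Request:
    # In the S3 API, listing is a GET on the bucket itself (ListObjectsV2).
    return Request(f"{ENDPOINT}/{bucket}?list-type=2", method="GET")


# Build the requests without sending them, then inspect verb and URL.
for req in (
    put_request("photos", "2022/cat.jpg", b"image bytes"),
    get_request("photos", "2022/cat.jpg"),
    list_request("photos"),
):
    print(req.get_method(), req.full_url)
```

Note that the key `2022/cat.jpg` contains a slash, yet it is still a single key in a flat namespace; the slash is just part of the name.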

As an example of how these verbs work, do you want to PUT an object somewhere for safekeeping? Do you want to GET an object so that you can do some work with that piece of data? Or do you want to LIST the contents of your bucket? Perhaps these three verbs are an oversimplification of what is possible with object storage, but this is loosely where cloud object storage began. It was an initiative to make storage more economical by removing proprietary technologies and creating a simple scalable storage solution, without the complexities of legacy technologies.
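The three verbs can be modelled with a toy in-memory bucket. This is only a sketch of the semantics, not a real client: the `Bucket` class and its method names are invented for illustration, and a real object store adds durability, access control, metadata and much more.

```python
class Bucket:
    """Toy in-memory stand-in for a bucket: a flat map of key -> bytes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: dict[str, bytes] = {}

    def put_object(self, key: str, data: bytes) -> None:
        # PUT stores (or overwrites) a whole object under its key.
        self._objects[key] = data

    def get_object(self, key: str) -> bytes:
        # GET returns the whole object; a missing key raises KeyError.
        return self._objects[key]

    def list_objects(self, prefix: str = "") -> list[str]:
        # LIST returns keys in sorted order, optionally filtered by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


photos = Bucket("photos")
photos.put_object("2022/11/cat.jpg", b"cat")
photos.put_object("2022/12/dog.jpg", b"dog")
photos.put_object("readme.txt", b"hello")

print(photos.get_object("readme.txt"))           # b'hello'
print(photos.list_objects(prefix="2022/"))       # ['2022/11/cat.jpg', '2022/12/dog.jpg']
```

Filtering by prefix gives a folder-like view, but nothing hierarchical is stored: every object is addressed by its full key.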

Now that we have a basic understanding of object stores, let’s explore some use cases.

Uses of Object Storage

When building a new application, you will need to build it with object storage in mind. Instead of relying on cluster-aware filesystems and quorum devices, the application will need to handle failover and data consistency itself to remain available during infrastructure failures.

Many off-the-shelf applications now have native deployment models for working with cloud-native infrastructure, and most importantly with object storage. When your application has finished processing or creating a piece of data, the result can be written to an object store for safekeeping and easily retrieved as and when needed.

We can even use object storage buckets to trigger events. Imagine a mobile app that uploads photos or video, which then need some processing before publication. Once a photo or video is uploaded to an object store, an event is triggered to let your backend application know that there is a new object to be processed. Once that object has been processed, the output can be written to a bucket that triggers another job to push it to your Content Distribution Network (CDN).
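The upload, process and publish flow described above can be sketched with a toy bucket that calls registered handlers whenever an object is written. Everything here is hypothetical: `NotifyingBucket`, the handler names and the `cdn_queue` list are illustrative stand-ins, and the handlers run synchronously, whereas real bucket notifications are typically delivered asynchronously through a queue or messaging service.

```python
from typing import Callable

# A handler receives (bucket name, object key, object data).
Handler = Callable[[str, str, bytes], None]


class NotifyingBucket:
    """Toy bucket that notifies subscribers whenever an object is written."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: dict[str, bytes] = {}
        self._handlers: list[Handler] = []

    def on_object_created(self, handler: Handler) -> None:
        self._handlers.append(handler)

    def put_object(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        for handler in self._handlers:
            handler(self.name, key, data)


uploads = NotifyingBucket("uploads")
processed = NotifyingBucket("processed")
cdn_queue: list[str] = []  # stand-in for a job that pushes to a CDN


def process_upload(bucket: str, key: str, data: bytes) -> None:
    # Placeholder "processing": upper-case the payload, then store the result.
    processed.put_object(key, data.upper())


def publish_to_cdn(bucket: str, key: str, data: bytes) -> None:
    # Record what would be pushed to the CDN.
    cdn_queue.append(f"{bucket}/{key}")


uploads.on_object_created(process_upload)
processed.on_object_created(publish_to_cdn)

# One upload from the mobile app drives the whole chain.
uploads.put_object("holiday.jpg", b"raw bytes")
print(processed.objects["holiday.jpg"], cdn_queue)
```

The mobile app only ever writes to the `uploads` bucket; the rest of the pipeline is driven by the events, so each stage can be changed or scaled independently.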

Where can I get Object Storage?

There are lots of options available, and all public clouds have object storage offerings. Some of the most well-known are Azure Blob Storage, Google Cloud Storage, and Amazon S3. Each of these offerings has its own API, but the most commonly used is the S3 API.

The S3 API has been implemented in other storage solutions, such as Ceph and to a certain extent OpenStack Swift. However, Swift’s implementation is not as feature-complete as Ceph’s and is lacking some features around object lifecycle management and notifications.

Major storage vendors, such as Dell EMC and NetApp, also have solutions, which have largely standardised on the S3 API. Yet, when compared with open source solutions, these remain cumbersome and expensive.

Public or private cloud object storage?

The public cloud might not be the right choice for every workload, or for storing all of your data. The public cloud is instantly accessible, which makes it a great way to get started, but over time, as your data set grows, it can become cost-inefficient. Public clouds were created around the notion that you can scale up and down on demand, but storage tends to only scale up. Cloud provider costs include not only charges for storing data but also for retrieving it, and some providers additionally charge for the number of API operations you request and for network transfer.
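The four cost components mentioned above (storage, retrieval, API operations and network transfer) can be combined into a simple monthly estimate. This is a rough sketch: the `Rates` fields and all of the numbers below are placeholders chosen for illustration, not any provider's actual pricing, and real price lists are usually tiered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-unit prices. All values used below are made-up placeholders."""
    storage_per_gb_month: float
    retrieval_per_gb: float
    per_1000_requests: float
    egress_per_gb: float


def monthly_cost(rates: Rates, stored_gb: float, retrieved_gb: float,
                 requests: int, egress_gb: float) -> float:
    # Sum the four components: storage, retrieval, API operations, transfer.
    return (
        stored_gb * rates.storage_per_gb_month
        + retrieved_gb * rates.retrieval_per_gb
        + requests / 1000 * rates.per_1000_requests
        + egress_gb * rates.egress_per_gb
    )


# Placeholder rates and usage, for illustration only.
example_rates = Rates(
    storage_per_gb_month=0.02,
    retrieval_per_gb=0.01,
    per_1000_requests=0.005,
    egress_per_gb=0.09,
)
total = monthly_cost(example_rates, stored_gb=100_000, retrieved_gb=10_000,
                     requests=5_000_000, egress_gb=10_000)
print(f"Estimated monthly cost: {total:.2f}")
```

Because stored capacity tends only to grow, the storage term rises month after month, while the other three terms depend on how often the data is read.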

A privately hosted Ceph solution can provide significant savings when you have predictable capacity requirements. You can also manage your own transit costs more effectively, either into a public cloud via products like AWS Direct Connect or Azure ExpressRoute, or at no cost within your own data centre or colocation facility.

Is S3 on Ceph a solution for you?

A Ceph cluster that is compatible with both the AWS S3 API and the OpenStack Swift API can be a cost-effective way to provide object storage to your applications, by combining open-source software with commodity hardware to meet performance, availability and capacity needs.

Learn more about open source Ceph:

Canonical Charmed Ceph

Blog: Cloud Adjacent Storage

Webinar: Reduce your cloud storage costs with cloud adjacent Ceph

What is Ceph?

Ceph is a software-defined storage (SDS) solution designed to address the object, block, and file storage needs of both small and large data centres.

It’s an optimised and easy-to-integrate solution for companies adopting open source as the new norm for high-growth block storage, object stores and data lakes.


How to optimise your cloud storage costs

Cloud storage is on demand and ready to go in a couple of clicks, but is it the most cost-effective approach for large, predictable data sets?

In our whitepaper, learn how to understand the true costs of storing data in a public cloud, and how open source Ceph can provide a cost-effective alternative.


A guide to software-defined storage for enterprises


In our whitepaper, explore how Ceph can replace proprietary storage systems in the enterprise.


Performant, reliable and cost-effective cloud scaling with Ceph

Canonical Ceph simplifies the entire management lifecycle of deployment, configuration, and operation of a Ceph cluster, no matter its size or complexity. Install, monitor, and scale cloud storage with extensive interoperability.
