Canonical
on 17 October 2023

Canonical announces supported solution for Apache Spark® on Kubernetes


17 October 2023: Today, Canonical announced the release of Charmed Spark – an advanced solution for Apache Spark® that provides everything users need to run Apache Spark on Kubernetes. Apache Spark is suitable for use in diverse data processing applications including predictive analytics, data warehousing, machine learning data preparation and extract-transform-load (ETL). Canonical Charmed Spark accelerates data engineering across public clouds and private data centres alike and comes with a comprehensive support and security maintenance offering, so teams can work with complete peace of mind.

“Enterprise data engineers want Apache Spark with the ease and long term security commitment of Ubuntu”, said Mark Shuttleworth, Chief Executive Officer at Canonical. “Charmed Spark is the first of many Canonical open source data solutions designed for reliability and multi-cloud operation. Every production deployment is warranted for ten years compliance and security maintenance”.

Cloud native with 10 years of support

Today’s Kubernetes infrastructure depends extensively on container images, but assuring the provenance of open source software images can be challenging. The Charmed Spark OCI image, available on GitHub, is included in the solution. The entire solution is backed by the Ubuntu Pro enterprise support and security maintenance subscription – with up to 10 years of support available for the release.

Priced per node not per application

Charmed Spark is the first release of the forthcoming Canonical Data Fabric suite of data processing solutions for all sizes of data. Customers purchase 24/7 or weekday enterprise support on a per-node basis through the Ubuntu Pro + Support plan, which covers all applications within the suite as well as additional solutions for AI offered by Canonical, including Charmed Kubeflow and Charmed MLflow.

Made for data engineers wherever they work

Charmed Spark is built to run Spark on Kubernetes, which brings cloud-native portability across clouds and on-premises data centres. Charmed Spark delivers support for Apache Spark 3 with its improved Python integration and an even richer Spark SQL feature set.

The included spark8t Python SDK and command line tooling simplify working with Spark on Kubernetes, and can be conveniently installed via the provided spark-client snap or as a Python package. The Spark container image, built on Ubuntu 22.04 LTS, delivers an out-of-the-box runtime for creating Spark applications that run on Kubernetes clusters.
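
As a rough illustration of what submitting a Spark application to a Kubernetes cluster involves, the sketch below assembles a standard `spark-submit` command line in Python. It uses only stock Apache Spark options (`--master k8s://...`, `--deploy-mode cluster` and the `spark.kubernetes.*` configuration keys) and does not show the spark8t or spark-client interface, which this announcement does not describe. The function name `build_spark_submit`, the API server address, the image reference, the namespace, the service account and the application path are all hypothetical placeholders.

```python
import shlex
from typing import Dict, List, Optional


def build_spark_submit(
    api_server: str,
    image: str,
    app_path: str,
    namespace: str = "default",
    service_account: str = "spark",
    extra_conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Assemble a spark-submit argument list for a Kubernetes cluster.

    Hypothetical helper for illustration only; it builds the command
    but does not run it.
    """
    # Standard Spark-on-Kubernetes configuration keys.
    conf = {
        "spark.kubernetes.container.image": image,
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.authenticate.driver.serviceAccountName": service_account,
    }
    # Caller-supplied settings override the defaults above.
    conf.update(extra_conf or {})

    cmd = [
        "spark-submit",
        "--master", f"k8s://{api_server}",  # Kubernetes API server URL
        "--deploy-mode", "cluster",         # driver runs in a pod on the cluster
    ]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_path)                    # application file goes last
    return cmd


# Example with placeholder values (assumptions, not real endpoints or images).
example_cmd = build_spark_submit(
    api_server="https://192.0.2.10:6443",
    image="registry.example.com/spark:placeholder-tag",
    app_path="local:///opt/spark/examples/src/main/python/pi.py",
    namespace="spark-jobs",
    extra_conf={"spark.executor.instances": "2"},
)
print(shlex.join(example_cmd))
```

The `local://` scheme tells Spark that the application file already exists inside the container image, so the path above assumes the image ships the upstream example scripts at that location. The resulting list can be passed to `subprocess.run` on a machine where `spark-submit` and cluster credentials are available.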

The spark-history-server-k8s operator enables administrators to quickly deploy, configure and operate the Spark History Server on a Kubernetes cluster using Juju – Canonical’s open source orchestration engine – either directly or with Terraform.

Getting started with Charmed Spark

Users can get started with Charmed Spark on Canonical Kubernetes and Amazon EKS by following the documentation at ubuntu.com/data/docs. Learn more at canonical.com/data/spark.

About Canonical

Canonical, the publisher of Ubuntu, provides open source security, support and services. Our portfolio covers critical systems, from the smallest devices to the largest clouds, from the kernel to containers, from databases to AI. With customers that include top tech brands, emerging startups, governments and home users, Canonical delivers trusted open source for everyone.
