What is Kubeflow?

Kubeflow is an open source artificial intelligence / machine learning (AI/ML) tool that helps improve the deployment, portability and management of AI/ML models. Kubeflow allows users to quickly create, train and tune neural networks on Kubernetes, taking advantage of its dynamic resource provisioning.

Kubeflow works well with TensorFlow and other modern AI/ML frameworks such as PyTorch, MXNet and Chainer, allowing users to build on their existing code and setup.

Community and governance

Governed by the Kubeflow community and originated by Google, the project has over 2,000 community members, more than 180 direct code contributors and over 28,000 contributors. Canonical is an active member of the Kubeflow community and strongly believes in its role in democratising AI/ML by making model training and deployment as frictionless as possible.

Effort scale of the different areas required to train, deploy and manage AI/ML projects. Boxes in blue are improved by Kubeflow.

Software-defined Machine Learning Operations (MLOps)

For a business, the ability to define machine learning environments through code and APIs enables fast-paced automation and minimises MLOps costs. Kubeflow includes the ability to manage and reuse machine learning pipelines, hyper-parameter optimisations and Kubernetes serving configurations. Kubeflow’s serving systems have been designed to be multi-framework compatible, allowing interoperability with TensorFlow, XGBoost, scikit-learn, NVIDIA TensorRT Inference Server, ONNX and PyTorch. Kubeflow also allows execution graph manipulation, analytics and auto-scaling.

Companies involved in Kubeflow

Kubeflow was originally released in March 2018 by Google as an open source initiative to develop machine learning applications using TensorFlow on top of Kubernetes to minimise MLOps effort.

Today, dozens of companies contribute to Kubeflow, with many more playing a part in the broader community. Canonical is focused on making it as frictionless as possible to get set up with Kubeflow and run daily operations with it.

What makes up Kubeflow?

At its most basic, Kubeflow is made up of Jupyter notebooks, hyper-parameter tuning, pipelines, serving, model training and more.

Jupyter notebooks


Kubeflow comes with support for managing Jupyter notebooks. Jupyter is an open source application that allows users to blend code, equation-style notation, free text and dynamic visualisations, giving data scientists a single point of access to their experiment setup and notes.

Katib - hyper-parameter tuning


Hyperparameters are set before the training process takes place. These parameters (e.g. the topology or number of layers in a neural network) can be tuned with Katib. Katib supports various ML tools such as TensorFlow, PyTorch and MXNet, making it easy to reuse the results of previous experiments with Katib and Kubeflow.
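
To make the idea concrete, here is a minimal sketch of a hyperparameter search in plain Python. It does not use Katib or its API. The search space, the train_and_score function and its scoring formula are hypothetical stand-ins for a real training job, chosen only so the example is deterministic. The point it illustrates is that hyperparameters are fixed before each training run, and that a tuner tries several combinations and keeps the best one.

```python
import itertools

# Hypothetical search space: values that are fixed before training starts.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "learning_rate": [0.1, 0.01, 0.001],
}


def train_and_score(num_layers, learning_rate):
    """Stand-in for a real training run; returns a made-up validation score.

    The formula is an assumption for illustration only. It peaks at
    num_layers=2 and learning_rate=0.01.
    """
    return 1.0 - abs(num_layers - 2) * 0.1 - abs(learning_rate - 0.01)


def grid_search(space):
    """Try every combination in the space and return (best_score, best_params)."""
    names = list(space)
    best_score, best_params = float("-inf"), None
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


best_score, best_params = grid_search(SEARCH_SPACE)
print(best_params)
```

An exhaustive grid is used here because it is the simplest strategy to read; a real tuner would typically run the trials in parallel and may use smarter search strategies.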

Pipelines


Kubeflow pipelines facilitate end-to-end orchestration of ML workflows, management of multiple experiments and approaches, and easier reuse of previously successful solutions in new workflows. This helps developers and data scientists save time and effort.
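
The reuse idea can be sketched in a few lines of plain Python. This is not the Kubeflow Pipelines SDK: the step functions, the dictionary used to pass state between steps and the run_pipeline helper are all hypothetical, and the "model" is just the mean of a toy dataset. The sketch only shows how a workflow built from small, independent steps lets two experiments share the same building blocks.

```python
from typing import Callable, Dict, List

# A step takes the state produced so far and returns an updated copy.
Step = Callable[[Dict], Dict]


def load_data(state: Dict) -> Dict:
    """Hypothetical step: produce a tiny in-memory dataset."""
    return {**state, "data": [1.0, 2.0, 3.0, 4.0]}


def normalise(state: Dict) -> Dict:
    """Hypothetical step: divide every value by the largest value."""
    top = max(state["data"])
    return {**state, "data": [x / top for x in state["data"]]}


def train_mean_model(state: Dict) -> Dict:
    """Hypothetical step: the 'model' is simply the mean of the data."""
    data = state["data"]
    return {**state, "model": sum(data) / len(data)}


def run_pipeline(steps: List[Step]) -> Dict:
    """Run the steps in order, passing each step's output to the next."""
    state: Dict = {}
    for step in steps:
        state = step(state)
    return state


# Two experiments reuse the same steps in different combinations.
baseline = run_pipeline([load_data, train_mean_model])
scaled = run_pipeline([load_data, normalise, train_mean_model])
print(baseline["model"], scaled["model"])
```

Because each step depends only on the state it receives, adding the normalise step to the second experiment requires no change to the steps it shares with the first.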

Serving


Kubeflow makes two serving systems available: KFServing and Seldon Core. Both allow multi-framework model serving, and the choice between them should be based on the needs of each project.

Training


Model training is possible with various frameworks, in particular TensorFlow, Chainer, MPI, MXNet and PyTorch. These cover the most popular frameworks in the data science and AI/ML space, ensuring that developers can use Kubeflow’s MLOps features with their favourite tools.

… and more


Kubeflow is continuously expanding. Notable additions include tracking and managing the metadata of machine learning workflows, as well as Nuclio functions, a high-performance serverless solution for data processing, analytics and ML workloads.

Who uses Kubeflow?

There are thousands of individual users of Kubeflow across a broad range of industries, with roles varying from data scientists to DevOps engineers. Kubeflow is particularly favoured for its ease of use, software-defined MLOps and native execution on Kubernetes.

In addition to CERN and numerous universities around the world, Kubeflow has become a popular choice for transport and logistics companies such as Uber, Lyft and GoJek.

Use cases are also growing in the media industry with Spotify having already adopted Kubeflow.

Financial and e-commerce services have been keen adopters, with PayPal and Shopify just some of those that are developing on Kubeflow.

Further adoption has been seen in healthcare with Babylon Health and travel with Amadeus IT Group.

How to install Kubeflow

Kubeflow can easily be set up by installing MicroK8s, a zero-ops Kubernetes distribution provided by Canonical. Kubeflow comes ready to be enabled as part of MicroK8s, which makes trying it out even easier than before.
