SAI Notes #04: CI/CD for Machine Learning.
Let's look into how CI/CD for ML projects differs from CI/CD for regular software.
👋 I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in Data Engineering, MLOps, Machine Learning and overall Data space.
This week in the weekly SAI Notes:
CI/CD for Machine Learning.
Organisational Structure for Effective MLOps Implementation.
ML Systems in the context of end-to-end Product Delivery.
CI/CD for Machine Learning.
MLOps tries to bring to Machine Learning what DevOps did for regular software. The important difference that the Machine Learning aspect of a project brings to the CI/CD process is the treatment of the Machine Learning Training pipeline as a first-class citizen of the software world.
A large misalignment that I see in the industry is ML Training Pipelines not being treated as a separate entity from CI/CD Pipelines, with their steps being executed as part of the CI/CD Pipeline itself. So let’s establish this:
The CI/CD pipeline is a separate entity from the Machine Learning Training pipeline. There are frameworks and tools that provide capabilities specific to Machine Learning pipelining needs (e.g. Kubeflow Pipelines, SageMaker Pipelines etc.).
The ML Training pipeline is an artifact produced by a Machine Learning project and should be treated in the CI/CD pipelines as such (a short sketch of this follows below).
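To make this concrete, here is a minimal sketch of what such an artifact could look like, assuming Kubeflow Pipelines v2 as the pipelining framework; the step functions are placeholders rather than a real Training pipeline. The compiled YAML definition is the artifact that the CI/CD pipeline tests and delivers, not something the CI/CD pipeline executes step by step itself.

```python
from kfp import dsl, compiler


@dsl.component
def extract_features() -> str:
    # Placeholder step: pull training data from the Feature Store.
    return "features-uri"


@dsl.component
def train_model(features_uri: str) -> str:
    # Placeholder step: train the model and hand it over to the Model Registry.
    return "model-uri"


@dsl.pipeline(name="training-pipeline")
def training_pipeline():
    features = extract_features()
    train_model(features_uri=features.output)


if __name__ == "__main__":
    # The compiled definition is the deliverable artifact of the ML project.
    compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
```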
What does this mean? Let’s take a closer look:
Regular CI/CD pipelines will usually be composed of at least three main steps (a minimal test sketch follows the list). These are:
Step 1: Unit Tests - you test your code so that its functions and methods produce the desired results for a set of predefined inputs.
Step 2: Integration Tests - you test specific pieces of the code for their ability to integrate with systems outside the boundaries of your code (e.g. databases) and with other pieces of the code itself.
Step 3: Delivery - you deliver the produced artifact to a pre-prod or prod environment depending on which stage of GitFlow you are in.
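As a quick reference, here is a minimal pytest sketch of the difference between Steps 1 and 2; the `parse_order` function and the in-memory store are hypothetical and only illustrate the split between testing logic in isolation and testing across a system boundary.

```python
import pytest


def parse_order(raw: dict) -> dict:
    # Hypothetical application code under test.
    return {"id": int(raw["id"]), "amount": float(raw["amount"])}


# Step 1: Unit Test - predefined inputs, expected outputs, no external systems.
def test_parse_order_casts_types():
    assert parse_order({"id": "42", "amount": "10.5"}) == {"id": 42, "amount": 10.5}


# Step 2: Integration Test - crosses the boundary of the code into another system.
@pytest.fixture
def orders_store():
    # In a real integration test this would be a connection to a test database;
    # a dict stands in so the sketch stays self-contained.
    return {}


@pytest.mark.integration
def test_order_can_be_stored_and_read_back(orders_store):
    order = parse_order({"id": "42", "amount": "10.5"})
    orders_store[order["id"]] = order
    assert orders_store[42] == order
```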
What does it look like when ML Training pipelines are involved?
Step 1: Unit Tests - in a mature MLOps setup, the steps in the ML Training pipeline should be contained in their own environments and Unit Testable separately, as these are just pieces of code composed of methods and functions (a test sketch follows this list).
Step 2: Integration Tests - you test whether the ML Training pipeline can successfully integrate with outside systems: connecting to a Feature Store and extracting data from it, handing over the ML Model artifact to the Model Registry, logging metadata to the ML Metadata Store etc. This CI/CD step also includes testing the integration between the Machine Learning Training pipeline steps themselves, e.g. does the pipeline succeed in passing validation data from the training step to the evaluation step.
Step 3: Delivery - the pipeline is delivered to a pre-prod or prod environment depending on which stage of GitFlow you are in. If it is the production environment, the pipeline is ready to be used for Continuous Training. You can trigger the training or retraining of your ML Model ad hoc, periodically, or when the deployed model starts showing signs of Feature/Concept Drift (a trigger sketch follows below).
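To make Steps 1 and 2 of the ML variant concrete, here is a minimal pytest sketch of testing one Training pipeline step in isolation and the hand-off between two steps; `split_data` and `train` are hypothetical placeholder steps, not any specific framework's API.

```python
import pytest


def split_data(rows: list[dict], eval_fraction: float = 0.2) -> tuple[list[dict], list[dict]]:
    # Hypothetical Training pipeline step: split extracted features into train and eval sets.
    cut = int(len(rows) * (1 - eval_fraction))
    return rows[:cut], rows[cut:]


def train(train_rows: list[dict]) -> dict:
    # Hypothetical Training pipeline step: a trivial mean baseline stands in for real training.
    mean_target = sum(r["target"] for r in train_rows) / len(train_rows)
    return {"type": "mean-baseline", "prediction": mean_target}


# Step 1: Unit Test - a single pipeline step with predefined inputs.
def test_split_data_proportions():
    rows = [{"target": i} for i in range(10)]
    train_rows, eval_rows = split_data(rows, eval_fraction=0.2)
    assert len(train_rows) == 8 and len(eval_rows) == 2


# Step 2: Integration Test - the hand-off between pipeline steps.
@pytest.mark.integration
def test_train_step_consumes_split_output():
    rows = [{"target": i} for i in range(10)]
    train_rows, _ = split_data(rows)
    model = train(train_rows)  # the evaluation step would consume the eval set the same way
    assert "prediction" in model
```

Integration with a Feature Store or Model Registry follows the same pattern, just pointed at test instances of those systems.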
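And here is a minimal sketch of the Continuous Training trigger mentioned in Step 3, assuming a monitoring job that produces drift scores and a client object that can launch the delivered pipeline; both interfaces are hypothetical placeholders, not a specific tool's API.

```python
from dataclasses import dataclass


@dataclass
class DriftReport:
    # Produced by a (hypothetical) monitoring job comparing live data against training data.
    feature_drift_score: float
    concept_drift_score: float


def should_retrain(report: DriftReport, threshold: float = 0.3) -> bool:
    # Decide whether the deployed model shows enough drift to warrant retraining.
    return max(report.feature_drift_score, report.concept_drift_score) >= threshold


def maybe_trigger_training(report: DriftReport, pipeline_client) -> bool:
    # Ad-hoc or scheduled job: launch the delivered ML Training pipeline when drift is detected.
    # pipeline_client is a stand-in for whatever runs the delivered pipeline,
    # e.g. an orchestrator SDK call or an HTTP request to its API.
    if should_retrain(report):
        pipeline_client.run(pipeline_name="training-pipeline")
        return True
    return False
```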
Organisational Structure for Effective MLOps Implementation.