👋 I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data-related concepts in a simple and easy-to-digest way. My mission is to help you upskill and keep you updated on the latest news in Data Engineering, MLOps, Machine Learning and the overall Data space.
MLOps brings to Machine Learning what DevOps did to Software Engineering. Today I will lay down my thoughts on how MLOps maturity evolves in organisations. This time the focus will be on the technical and process side rather than the organisational one (you can find more of my thoughts on effective organisational structures for MLOps implementation here).
There have been many articles written about MLOps maturity levels; the most popular ones are by GCP and Azure. While these articles bring a lot of clarity to the processes, they do leave the reader with the belief that the extremely complicated final levels of MLOps maturity are necessary to succeed at Machine Learning (this is to be expected, as these platforms provide consultancy and end-to-end tooling to achieve said maturity and hence benefit from it).
I have been involved in the evolution of Machine Learning processes in companies of different sizes, starting from a startup with a team of 5 people, moving on to a scale-up with hundreds of engineers, and eventually to corporations with thousands of employees. Today, I want to take a simpler approach and look into the evolution of MLOps maturity from my own point of view and experience.
Let’s move on.
The Beginning of the Machine Learning Journey.
I have been part of multiple initiatives trying to bring Machine Learning into a product from the very early stages. The story is simple and will usually take one of two directions: you are hired into an organisation as a Data Scientist or Machine Learning Engineer and are tasked either with finding problems that could be solved by leveraging ML, or with solving a specific business problem that the management has decided is important.
The System is very simple and lacks any kind of automation. This is what you would see in most organisations that are just starting to run ML in Production.
No MLOps processes are in place, and for a good reason: while MLOps practices bring a lot of benefits, they also slow down the progress of small projects that do not yet need them.
In this article I assume that Data Engineering pipelines are already in place. What does that mean?
Raw data from external sources is ingested into either a Data Warehouse or a Data Lake.
The data has been curated and its quality ensured in a curated data layer. It is exposed via data marts in the case of a Data Warehouse, or via golden datasets in the case of a Data Lake. Both Raw Data and Curated Data are available for a Data Scientist to explore.
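To make this setup concrete, here is a minimal sketch of what such ad-hoc exploration of the raw and curated layers might look like. It assumes the layers are exposed as parquet datasets in object storage; the paths and dataset names are hypothetical and will differ per organisation.

```python
# Ad-hoc exploration of the raw and curated data layers (illustrative only;
# assumes parquet datasets in S3 and that pandas + s3fs are installed).
import pandas as pd

# Raw data as it was ingested from the external source.
raw_events = pd.read_parquet("s3://data-lake/raw/events/")

# Curated "golden" dataset, with quality checks already applied upstream.
curated_events = pd.read_parquet("s3://data-lake/curated/events/")

# Typical first look: compare volumes and spot the quality gaps curation fixed.
print(len(raw_events), len(curated_events))
print(raw_events.isna().mean())
```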
Before even thinking about MLOps, the Data Scientist works in the Experimentation environment, analysing (4.) Raw and Curated data and running basic Machine Learning Training pipelines (5.) ad hoc. In this article, for visual simplicity, the diagrams use a very basic ML Training pipeline: Preprocessing -> Model Training -> Model Validation. In real-world scenarios these pipelines can and will become a lot more complicated: there will most likely be branching, additional quality gates, and training of multiple models at once, conditional on some variables.
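As a rough illustration, the basic Preprocessing -> Model Training -> Model Validation pipeline could be sketched as a single scikit-learn script like the one below. The dataset path, the target column and the 0.8 quality gate are all illustrative assumptions, not a prescription.

```python
# A minimal sketch of the basic training pipeline described above
# (Preprocessing -> Model Training -> Model Validation), using scikit-learn.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical curated training data with a binary "label" column.
df = pd.read_parquet("s3://data-lake/curated/training_data/")
X, y = df.drop(columns=["label"]), df["label"]
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Preprocessing + Model Training bundled into one pipeline object.
model = Pipeline([
    ("preprocess", StandardScaler()),
    ("train", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# Model Validation: a simple quality gate on a held-out set.
auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
assert auc > 0.8, f"Validation failed: AUC {auc:.3f} below threshold"
```

In a real project this single script would be split into separate pipeline steps with their own quality gates, which is exactly where the later maturity levels come in.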