SAI #03: Machine Learning Deployment Types, Spark - Architecture and more...
Machine Learning Deployment Types, ML Experiment/Model Tracking, Spark - Architecture, Kafka - Reading Data (Basics)
👋 This is Aurimas. I write the weekly SAI Newsletter, where my goal is to present complicated data-related concepts in a simple, easy-to-digest way and to help you upskill in Data Engineering, MLOps, Machine Learning and Data Science.
In this episode we cover:
Machine Learning Deployment Types.
Experiment/Model Tracking.
Spark - Architecture.
Kafka - Reading Data (Basics).
MLOps Fundamentals or What Every Machine Learning Engineer Should Know
Machine Learning Deployment Types.
There are many ways you could deploy a Machine Learning Model to serve production use cases. Even if you will not be working with them day to day, the following are the four ways you should know and understand as a Machine Learning Engineer.

➡️ Batch:

👉 You apply your trained models as a part of an ETL/ELT Process on a given schedule.
👉 You load the required Features from a batch storage, apply inference and save the inference results to a batch storage.
👉 It is sometimes wrongly assumed that you can't use this method for Real Time Predictions.
👉 In fact, inference results can be loaded into a real time storage and used for real time applications.
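
The Batch flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear `predict` function stands in for a trained model, the CSV strings stand in for batch storage, a plain dict stands in for the real time storage, and every name and number here is made up.

```python
import csv
import io

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = {"orders_last_30d": 0.1, "avg_basket_value": 0.01}
BIAS = -0.5


def predict(features: dict) -> float:
    """Score one row of features with the stand-in linear model."""
    return BIAS + sum(WEIGHTS[name] * float(features[name]) for name in WEIGHTS)


def run_batch_job(feature_csv: str) -> str:
    """One scheduled run: read features, apply inference, write results."""
    reader = csv.DictReader(io.StringIO(feature_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["user_id", "score"], lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({"user_id": row["user_id"], "score": round(predict(row), 4)})
    return out.getvalue()


def load_into_realtime_store(results_csv: str) -> dict:
    """Copy batch results into a key-value store for low-latency lookups."""
    rows = csv.DictReader(io.StringIO(results_csv))
    return {row["user_id"]: float(row["score"]) for row in rows}


# CSV text stands in for files in batch storage.
features = "user_id,orders_last_30d,avg_basket_value\nu1,5,20\nu2,0,10\n"
results = run_batch_job(features)
realtime_store = load_into_realtime_store(results)
```

A real time application would then read `realtime_store["u1"]` instead of calling the model, which is how precomputed batch predictions end up serving real time use cases.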

➡️ Embedded in a Stream Application:

👉 You apply your trained models as a part of a Stream Processing Pipeline.
👉 While Data is continuously piped through your Streaming Data Pipelines, an application with a loaded model continuously applies inference on the data and returns the results to the system - most likely to another Streaming Storage.
👉 This deployment type will most likely involve a real time Feature Store API to retrieve additional Static Features for inference purposes.
👉 Predictions can be consumed by multiple applications subscribing to the Inference Results Stream.
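
As a rough sketch of the stream-embedded pattern, the Python below scores events one at a time as they arrive. The assumptions are loud ones: a list stands in for the input stream, a dict stands in for the real time Feature Store API, `predict` is a toy rule rather than a trained model, and all names are hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical static features, standing in for a real time Feature Store API.
FEATURE_STORE: Dict[str, Dict[str, float]] = {
    "u1": {"account_age_days": 400.0},
    "u2": {"account_age_days": 3.0},
}


def get_static_features(user_id: str) -> Dict[str, float]:
    """Look up static features; unknown users get a default value."""
    return FEATURE_STORE.get(user_id, {"account_age_days": 0.0})


def predict(amount: float, account_age_days: float) -> float:
    """Stand-in model: flag large payments coming from young accounts."""
    return 1.0 if amount > 100.0 and account_age_days < 30.0 else 0.0


def score_stream(events: Iterable[dict]) -> Iterator[dict]:
    """Consume events one by one and emit an inference result for each."""
    for event in events:
        static = get_static_features(event["user_id"])
        score = predict(event["amount"], static["account_age_days"])
        yield {"user_id": event["user_id"], "score": score}


# A list stands in for the input stream; results would go to an output stream.
input_stream = [
    {"user_id": "u1", "amount": 250.0},
    {"user_id": "u2", "amount": 250.0},
    {"user_id": "u3", "amount": 20.0},
]
inference_results = list(score_stream(input_stream))
```

Because `score_stream` is a generator, it never needs the whole input in memory; in a real pipeline each yielded record would be published to the Inference Results Stream for any number of subscribers.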

➡️ Request - Response:

👉 You expose your model as a backend Service.
👉 It will most likely be a REST or gRPC Service.
👉 The API service retrieves the Features needed for inference from a Real Time Feature Store API.
👉 Inference can be requested by any application in real time, as long as it is able to form a correct request that conforms to the API Contract.
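
The request-response pattern boils down to "validate the request against the contract, fetch features, predict, respond". Here is a framework-free Python sketch of that handler. The contract (a JSON object with a string `user_id`), the status codes chosen, the dict used as a feature store and the toy `predict` function are all assumptions made for illustration.

```python
import json
from typing import Tuple

# Hypothetical contents of a real time feature store.
FEATURE_STORE = {"u1": {"orders_last_30d": 5.0}, "u2": {"orders_last_30d": 0.0}}


def predict(orders_last_30d: float) -> float:
    """Stand-in model: a simple capped linear score."""
    return min(1.0, 0.1 * orders_last_30d)


def handle_request(body: str) -> Tuple[int, str]:
    """Validate the request against the contract, then run inference."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body must be valid JSON"})
    # Assumed contract: a JSON object with a string field "user_id".
    if not isinstance(payload, dict) or not isinstance(payload.get("user_id"), str):
        return 400, json.dumps({"error": "field 'user_id' (string) is required"})
    features = FEATURE_STORE.get(payload["user_id"])
    if features is None:
        return 404, json.dumps({"error": "no features for this user"})
    score = predict(features["orders_last_30d"])
    return 200, json.dumps({"user_id": payload["user_id"], "score": score})


status, response = handle_request('{"user_id": "u1"}')
```

In a real service this function would sit behind a REST or gRPC endpoint; keeping the contract check separate from the model call makes it easy to reject malformed requests before any inference work is done.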

➡️ Edge:

👉 You embed your trained model directly into the application that runs on a user device.
👉 This method provides the lowest latency and improves privacy.
👉 The data most likely has to be generated on the device and live inside of it.

Tune in for more about each type of deployment in future episodes!
Experiment/Model Tracking.
A good Model Tracking System should be composed of two integrated parts: an Experiment Tracking System and a Model Registry.
Where you track ML Pipeline metadata from will depend on the MLOps maturity of your company.
If you are at the beginning of the ML journey you might be:
1️⃣ Training and Serving your Models from an experimentation environment - you run ML Pipelines inside of your Notebook and do that manually at each retraining.
If you are beyond Notebooks you will be running ML Pipelines from CI/CD Pipelines and on Orchestrator Triggers.
In any case, the ML Pipeline will not be too different and a well designed System should track at least:
2️⃣ Datasets used for Training Machine Learning Models in Experimentation or Production ML Pipelines. Here you should also track your Train/Test Splits. At this stage you should also save all important metrics that relate to the Datasets - Feature Distribution etc.
3️⃣ Model Parameters (e.g. model type, hyperparameters) together with Model Performance Metrics.
4️⃣ Model Artifact Location.
5️⃣ The Machine Learning Pipeline is an Artifact itself - track who triggered it and when, the Pipeline ID etc.
Code: Everything is code - you should version and track it.
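
To make the list above concrete, here is one possible shape for a tracked run, written as a Python dataclass. The schema, the field names and all the values are hypothetical; a real tracking system will have its own format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class RunRecord:
    """Metadata tracked for one ML Pipeline run (hypothetical schema)."""

    pipeline_id: str                       # the pipeline run is an artifact itself
    triggered_by: str                      # who triggered the run
    triggered_at: str                      # when, as an ISO 8601 timestamp
    code_version: str                      # e.g. a git commit hash
    dataset_uri: str                       # dataset used for training
    train_test_split: Dict[str, float]     # share of rows per split
    dataset_metrics: Dict[str, float]      # e.g. feature distribution summaries
    model_params: Dict[str, object]        # model type, hyperparameters
    performance_metrics: Dict[str, float]  # model performance metrics
    artifact_location: str                 # where the trained model was saved


# All values below are made up for illustration.
record = RunRecord(
    pipeline_id="run-001",
    triggered_by="ci-pipeline",
    triggered_at="2022-11-01T08:00:00+00:00",
    code_version="abc1234",
    dataset_uri="s3://example-bucket/datasets/churn/v1",
    train_test_split={"train": 0.8, "test": 0.2},
    dataset_metrics={"age_mean": 41.3, "age_std": 12.7},
    model_params={"model_type": "gradient_boosting", "max_depth": 6},
    performance_metrics={"auc": 0.87},
    artifact_location="s3://example-bucket/models/run-001/model.bin",
)

# Serialize the record so it can be stored next to the other run metadata.
serialized = json.dumps(asdict(record), sort_keys=True)
```

The record is frozen on purpose: once a run has finished, its tracked metadata should not change.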
When a Trained Model Artifact is saved to a Model Registry, there should always be a 1 to 1 mapping between the previously saved Model Metadata and the Artifact that was output to the Model Registry.
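
One way to picture the 1 to 1 mapping is a registry that refuses to accept an artifact unless exactly one metadata record already exists for the same run. The toy Python class below is a sketch of that rule only; the class, its methods and the identifiers are hypothetical and do not describe any particular Model Registry product.

```python
from typing import Dict


class ModelRegistry:
    """Toy registry that enforces one metadata record per artifact."""

    def __init__(self) -> None:
        self._metadata_by_run: Dict[str, dict] = {}
        self._artifact_by_run: Dict[str, str] = {}
        self._run_by_artifact: Dict[str, str] = {}

    def log_metadata(self, run_id: str, metadata: dict) -> None:
        """Save run metadata; a run can only be logged once."""
        if run_id in self._metadata_by_run:
            raise ValueError(f"metadata for run {run_id!r} already exists")
        self._metadata_by_run[run_id] = metadata

    def register_artifact(self, run_id: str, artifact_uri: str) -> None:
        """Register an artifact only if it maps to exactly one tracked run."""
        if run_id not in self._metadata_by_run:
            raise KeyError(f"no tracked metadata for run {run_id!r}")
        if run_id in self._artifact_by_run:
            raise ValueError(f"run {run_id!r} already has an artifact")
        if artifact_uri in self._run_by_artifact:
            raise ValueError(f"artifact {artifact_uri!r} is already registered")
        self._artifact_by_run[run_id] = artifact_uri
        self._run_by_artifact[artifact_uri] = run_id

    def metadata_for(self, artifact_uri: str) -> dict:
        """Return the single metadata record behind a registered artifact."""
        return self._metadata_by_run[self._run_by_artifact[artifact_uri]]


registry = ModelRegistry()
registry.log_metadata("run-001", {"auc": 0.87})
registry.register_artifact("run-001", "models/run-001/model.bin")
```

With this rule in place, every artifact in the registry can be traced back to exactly one set of datasets, parameters and metrics, and an untracked model can never be registered.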