A few things I found interesting from the keynote interview of @faisalzs by @samcharrington at #twimlcon 2021, day 2. https://twitter.com/twimlai/status/1351948281635504129
3 types of users of the ML platform
- Algorithm Engineers: SW engineering + good understanding of ML. Capable of taking an idea & productizing it
- ML Researchers: Explore cutting-edge techniques
- Data Scientists: Build applications for decision support
Here is a snip of the ML Platform for personalization at Netflix
- Loosely coupled platform of composable building blocks
- The higher in the stack, the more differentiated the ML services
4 ways to solve any problem in the platform
- Build it in house
- Leverage other Netflix internal platforms
- Go open source
- Buy commercial solutions
The Netflix platform does all four, balancing the resources spent against people's time to minimize opportunity cost.
A few open source tools used at Netflix: training (TensorFlow, PyTorch, scikit-learn, XGBoost, Vowpal Wabbit), Spark (distributed feature computation)
Netflix open sourced: Polynote, container management systems
Transitioned from in-house products to better open source alternatives (e.g. XGBoost)
Experiment to production
- Alignment between data scientists & engineers doesn't work as the problem scales
so..
- Algo engineers understand ML & productionize
- Data scientists are enabled with APIs to benchmark their research offline. They don't take things to production
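A minimal sketch of what an offline benchmarking API of this kind could look like. Nothing here is Netflix's actual API: the names `precision_at_k` and `benchmark_offline`, the choice of precision@k as the metric, and the shape of the held-out data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

# Hypothetical type: a ranker maps a user id to a ranked list of item ids.
Ranker = Callable[[str], List[str]]


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked items that the user actually engaged with."""
    top_k = ranked[:k]
    if not top_k:
        return 0.0
    return sum(1 for item in top_k if item in relevant) / k


def benchmark_offline(
    rankers: Dict[str, Ranker],
    holdout: Dict[str, Set[str]],
    k: int = 3,
) -> Dict[str, float]:
    """Score every candidate ranker on the same held-out interactions.

    `holdout` maps each user id to the set of items they engaged with
    in a period the rankers did not see (an assumed data layout).
    """
    results = {}
    for name, rank in rankers.items():
        scores = [
            precision_at_k(rank(user), relevant, k)
            for user, relevant in holdout.items()
        ]
        results[name] = sum(scores) / len(scores) if scores else 0.0
    return results
```

The point of the design is that every research idea is scored against the same held-out data with the same metric, so results are comparable without anything being deployed.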
The ML platform provides a lot of guard rails throughout the product pipeline to validate assumptions.
These give the confidence to automate things (such as model deployment) more freely.
Example guard rails cover the offline pipeline, the serving system, model drift, and tracking feature importance. These are good SW engineering principles. These checks and balances keep changes from hurting the user experience.
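As a rough illustration of a drift guard rail, here is a sketch that blocks an automated deployment when the live distribution of a feature or score has moved too far from a reference sample. The talk did not say how Netflix measures drift; the Population Stability Index (PSI), the 0.2 threshold (a common rule of thumb), and the function names are assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample and a live sample of one feature or score."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def deployment_allowed(
    reference: Sequence[float], live: Sequence[float], threshold: float = 0.2
) -> bool:
    """Guard rail: allow automated deployment only while drift stays below the threshold."""
    return population_stability_index(reference, live) < threshold
```

A check like this would run before each automated rollout, with a failure routing the model to a human for review instead of to users.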
Business metrics vs technical metrics
- Key metrics for Netflix relate to user experience and retaining users
- Technical metrics are proxies to the user experience metrics