I'll be hosting the apply conference again. This time it's a half-day virtual event all about recommender systems, with talks from Slack and ByteDance followed by a panel on the steps to get recsys into production.
Sign up here if you haven't already. Be prepared for bad music from yours truly.
Coffee Session: Streaming ML
A legendary podcast with Chip Huyen, CEO at Claypot AI. We chatted about her thought process, her book "Designing Machine Learning Systems," and the work she's doing with streaming and ML.
Streaming Ecosystem
With ML going real-time, the first concern that pops up is: do we really need that extra complexity?
Chip noted that it's a common assumption that streaming (real-time) adds extra complexity. However, technology gets easier over time as we understand it better, and the future will tell how complex real-time use cases can get.
In which use cases is streaming not necessary? In cases where you do not need to access data as it comes. Plain and simple.
On the bright side, the adoption of any concept goes hand in hand with the availability of tools, and at the current state of innovation they make implementing real-time much easier.
Latency is one topic that is well covered in the book, and we talk about it in the pod. Latency is the time from when a prediction request is received until the prediction is returned.
The implications of prediction latency depend on the type of deployment (i.e., batch or streaming).
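To make the definition concrete, here is a minimal sketch (with a dummy `predict` standing in for a real model) that times each request from receipt to response and reports the tail:

```python
import time

import numpy as np

def measure_latency_ms(predict, requests):
    """Time each prediction request from receipt to returned prediction, in ms."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        predict(request)  # stand-in for the real model call
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

# Dummy model standing in for a real predictor.
def predict(x):
    time.sleep(0.01)  # simulate ~10 ms of inference work
    return x * 2

latencies = measure_latency_ms(predict, range(100))
# For online serving, tail latency usually matters more than the mean.
print(f"mean: {np.mean(latencies):.1f} ms, p99: {np.percentile(latencies, 99):.1f} ms")
```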
At the previous Scotland meetup, Dr. Adam Sroka, Director of Hypercube Consulting, shared his thoughts and ideas on the concept and definition of MLOps.
The definition of MLOps varies depending on whom you ask, but at its core, the concept is to bring DevOps and ML together.
Accelerate Metrics are the metrics that surround the practice of DevOps. They help you build better software that delivers value. Some of the KPIs they measure include (a rough sketch of computing them from a deployment log follows the list):
Cycle Time - time from the first commit to full delivery of a feature
Deployment Frequency - deployments per unit of time
Change Failure Rate - percentage of deployments that break production
Mean Time to Recovery - mean time to fix production after a failed deployment
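As a rough illustration (the log format and the 30-day window are invented for the example), three of these KPIs can be computed directly from a deployment log:

```python
from datetime import timedelta

# Invented deployment log: whether each deploy broke production and how long recovery took.
deployments = [
    {"failed": False, "recovery": None},
    {"failed": True,  "recovery": timedelta(hours=2)},
    {"failed": False, "recovery": None},
    {"failed": True,  "recovery": timedelta(minutes=30)},
]
window_days = 30  # observation window covered by the log

deployment_frequency = len(deployments) / window_days                 # deployments per day
failures = [d for d in deployments if d["failed"]]
change_failure_rate = len(failures) / len(deployments)                # share of deploys breaking prod
mean_time_to_recovery = sum((f["recovery"] for f in failures), timedelta()) / len(failures)

print(f"Deployment frequency: {deployment_frequency:.2f} per day")
print(f"Change failure rate:  {change_failure_rate:.0%}")
print(f"Mean time to recovery: {mean_time_to_recovery}")
```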
How many of these are relevant to MLOps?
MLOps Iceberg
The things we care about in MLOps can be viewed at two levels.
On the surface level, we view the components of MLOps based on service uptime, model metrics, and monitoring.
Below the surface, this breaks down into bias, data drift, model drift, concept drift, failure modes, pipeline errors, schema changes, etc.
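As one example of a below-the-surface check, a minimal data-drift test might compare a production feature's distribution against a training-time reference window; the data and threshold below are purely illustrative:

```python
import numpy as np
from scipy import stats

def drift_check(reference, production, alpha=0.05):
    """Flag drift when the two samples come from noticeably different distributions."""
    statistic, p_value = stats.ks_2samp(reference, production)
    return {"ks_statistic": statistic, "p_value": p_value, "drift": p_value < alpha}

# Illustrative feature values: production has shifted away from training.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)    # training-time values
production = rng.normal(loc=0.4, scale=1.2, size=5_000)   # recent serving-time values

print(drift_check(reference, production))
```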
Wallaroo is a platform designed to be a control room for production ML, facilitating deployment, management, observability/monitoring, and optimization of models in production.
Wallaroo caters to AI teams small or large, working on projects organized in a way that works for them. Data scientists, ML engineers, DevOps, and business analysts can work in an integrated fashion in your environment via SDK, UI, or API.
You can install Wallaroo on your choice of AWS, Azure, GCP, or on-premises and deploy your models and pipelines to your environment in seconds. Once in production, you can keep your models running and up to date through capabilities such as hot model switching, A/B testing, Shadow Deploy, anomaly detection, and drift detection monitoring.
You can get hands-on and grow your skills by downloading and installing the free Community Edition and going through the tutorials in the link. There is also a community Slack channel where you can get help and share your projects and ideas with like-minded folks.
Once a model is released into production, you may notice that its performance degrades over time.
This blog shows you how to automatically surface and troubleshoot the cause of that degradation by analyzing the embedding vectors associated with the input images, so you can take the right action: retrain your model and clean your data, saving the time and effort of wrangling and visualizing the datasets by hand.
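The blog walks through a tool-assisted workflow; as a rough, generic sketch of the underlying idea (the function and array names here are illustrative), you can score each production input by how far its embedding sits from the training set and inspect the worst offenders:

```python
import numpy as np

def embedding_drift_scores(reference_embeddings, production_embeddings):
    """Score production embeddings by distance from the reference centroid."""
    centroid = reference_embeddings.mean(axis=0)
    baseline = np.linalg.norm(reference_embeddings - centroid, axis=1).mean()
    distances = np.linalg.norm(production_embeddings - centroid, axis=1)
    return distances / baseline  # values well above 1 sit far outside the training distribution

# Illustrative use: surface the 20 images whose embeddings drifted furthest.
# scores = embedding_drift_scores(train_embeddings, prod_embeddings)
# most_drifted = np.argsort(scores)[-20:]
```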
Michał Oleszak is a Machine Learning Engineer with a statistics background. He has worn all the hats, having worked for a consultancy, an AI startup, and a software house.
Deploying a machine learning model to production is just the first step in the model's lifecycle. After go-live, you need to continuously monitor the model's performance to make sure the quality of its predictions stays high.
This is relatively simple if you know the ground-truth labels of the incoming data. Without them, the task becomes much more challenging. In one of his previous articles, he showed how to monitor classification models in the absence of ground truth.
This time, we’ll see how to do it for regression tasks.
Thanks for reading. This issue was written by Nwoke Tochukwu and edited by Demetrios Brinkmann and Jessica Rudd. See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.