Faster AI Inference
That survey we asked you to fill out last week? Here are the results in a spreadsheet! Tell your friends to fill out the survey if they haven't already.
 
Coffee Session
  • Edge MLOps // Jason McCampbell

In this wonderful session, Jason McCampbell, chief architect at Wallaroo, schooled us on the challenges of doing MLOps at the Edge.

Semiconductors are the fundamental building blocks of Edge devices. High-performance computing that uses that hardware optimally and efficiently is one place where semiconductor optimization algorithms and ML models overlap.

Depending on whom you ask, an Edge device can mean anything from a microcontroller to a server in a data center. What makes a deployment "Edge" is how it is executed, not the device itself.

Edge ML Challenges
Data for EdgeML comes with tight restrictions and strict security requirements: it may be sensitive (IP-protected data, personally identifiable information (PII), etc.), and it often lives across different regions or servers around the world.

This makes continuous training and monitoring of ML models on the edge a major frustration. Federated learning is one technique for learning from this kind of distributed data without centralizing it (a minimal sketch follows below).
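To make the idea concrete, here is a minimal federated-averaging (FedAvg) sketch with made-up linear models and data shards: each edge site trains on its own private data, and only weight updates ever travel to the aggregator, never the raw records.

    import numpy as np

    def local_update(weights, shard, lr=0.1):
        # One local step: an edge site refines the shared weights
        # using only its own private data.
        X, y = shard
        grad = X.T @ (X @ weights - y) / len(y)  # least-squares gradient
        return weights - lr * grad

    def federated_average(client_weights):
        # FedAvg aggregation: the server simply averages client models;
        # raw data never leaves the edge devices.
        return np.mean(client_weights, axis=0)

    # Toy setup: three edge sites, each holding a private data shard.
    rng = np.random.default_rng(0)
    shards = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]
    global_w = np.zeros(3)

    for _ in range(10):  # communication rounds
        updates = [local_update(global_w, s) for s in shards]
        global_w = federated_average(updates)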

MLOps at the Edge is a bit more complicated than traditional MLOps, particularly for the Ops folks. There are a lot of constraints to think about depending on the implementation and application, like limited network connectivity, efficiently updating the models on the device, data transmission from the device, etc.


 
Weekly Book Highlights
  • Designing Machine Learning Systems by Chip Huyen.

Machine Learning in Research vs Production

Coming to ML from research or from traditional software engineering gives you two different sides of the same coin, and understanding both is critical to designing and developing ML systems. That's because the challenges of ML in production differ in major ways from those in research.

How the two settings differ:
  • Requirements: state-of-the-art model performance on benchmark datasets (research) vs. different requirements from different stakeholders (production)
  • Computational priority: fast training / high throughput vs. fast inference / low latency
  • Data: static vs. constantly shifting
  • Fairness and interpretability: often not a focus vs. must be considered

Machine Learning Systems vs Traditional Software

For more than half a century, software engineering (SWE) has made running traditional software in production a success. If ML practitioners adopted those software engineering skills, ML in production would be in much better shape.

However, ML systems pose challenges that traditional software does not. With traditional software, there's an underlying assumption that code and data are separated; in fact, things are kept as modular and separate as possible. ML systems, on the contrary, are part code, part data, and part artifacts created from the two.

Also, with traditional software, you only need to focus on testing and versioning the code. With ML, we have to test and version our data too, and that's the hard part (a toy illustration follows below).
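One common way to "version" data is to fingerprint a dataset with a content hash, so every training run can be tied to the exact data it saw. Here is a minimal sketch (the directory name is hypothetical):

    import hashlib
    from pathlib import Path

    def dataset_fingerprint(root: str) -> str:
        # Hash every file under the dataset directory so any change
        # to the data yields a new version identifier.
        digest = hashlib.sha256()
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                digest.update(str(path.relative_to(root)).encode())
                digest.update(path.read_bytes())
        return digest.hexdigest()[:12]

    # Log this alongside the code's git commit for each training run:
    # print(dataset_fingerprint("data/train"))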
 
Past Meetup
  • OpenVINO toolkit // Adrian Boguszewski

    Adrian Boguszewski, AI Evangelist at Intel, was our guest host at the meetup. He showed us how to make ML deployment easier with the Open Visual Inference and Neural Network Optimization (OpenVINO) toolkit.

    OpenVINO is an open-source toolkit for optimizing and deploying AI inference. It makes efficient use of whatever hardware it runs on, enables seamless deployment of models to production without having to build API servers or worry about hosting, and is a great fit for running models locally and efficiently on a CPU without a GPU.

  • Model Optimizer
    OpenVINO uses its Model Optimizer to convert already-trained models from most ML frameworks (TensorFlow, PyTorch, ONNX, Caffe, etc.) to an Intermediate Representation (IR). The IR consists of two files: an XML file with the model architecture and a binary file with the weights and biases. The Model Optimizer also performs optimizations, like graph pruning and operation fusion, to increase the performance of models running on Intel hardware.
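    As a rough sketch of that flow (the model file name and input shape here are hypothetical, and the API follows recent OpenVINO releases), converting an ONNX model to IR and running it on the local CPU looks something like this:

        import numpy as np
        from openvino.runtime import Core, serialize
        from openvino.tools.mo import convert_model

        # Model Optimizer step: convert a trained ONNX model to OpenVINO IR.
        ov_model = convert_model("model.onnx")
        serialize(ov_model, "model.xml")  # writes model.xml (graph) and model.bin (weights)

        # Runtime step: compile the IR for the local CPU and run inference.
        core = Core()
        compiled = core.compile_model(core.read_model("model.xml"), "CPU")
        dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)  # hypothetical input shape
        result = compiled([dummy])[compiled.output(0)]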

 
Blog Post
  • Improve MLflow Experiment // Stefano Bosisio

Stefano Bosisio is a Machine Learning Engineer at Trustpilot, based in Edinburgh. Stefano helps data science teams have a smooth journey from model prototyping to model deployment.

Robust experiment tracking makes you more efficient as a data scientist. In this article, Stefano shows how to improve MLflow experiments by tracking historical metrics.
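The heart of that workflow is MLflow's tracking API; as a rough sketch (the metric name and values are illustrative), you log a metric at each step and later pull back its full history:

    import mlflow
    from mlflow.tracking import MlflowClient

    # Log a metric at every step so the whole training curve is stored,
    # not just the latest value.
    with mlflow.start_run() as run:
        for step, loss in enumerate([0.9, 0.6, 0.4, 0.3]):
            mlflow.log_metric("loss", loss, step=step)

    # Later, retrieve the historical values of that metric for the run.
    client = MlflowClient()
    for m in client.get_metric_history(run.info.run_id, "loss"):
        print(m.step, m.value)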


Read Here
 
Sponsored Post
  • MLOps for Data Scientists // Wallaroo

A data scientist's job is not to eke every last bit of "accuracy" out of a model. Their job is to achieve business goals while meeting operational constraints.

One reason why data scientists might struggle with the model deployment process is that production considerations run counter to many data scientists' training, habits, and interests.

Some best practices to overcome this are: strive for well-structured code, be mindful of the production environment, "simpler is better than better," and "faster is better than better."

The Wallaroo platform helps data scientists be more self-sufficient at working with their models in production. Data scientists can easily upload their models and specify modeling pipelines via the Wallaroo SDK, with just a few lines of Python, using the notebook environment they are most comfortable with. Once uploaded, ML Engineers can also monitor and manage the models and model pipelines via the Wallaroo API.

Using Wallaroo, data scientists continue to have visibility into model behavior via pipeline inference logs and advanced observability features like Wallaroo's drift detection. By enabling more data scientist self-sufficiency and providing an intuitive space for data scientist/ML Engineer collaboration, Wallaroo supports an efficient and effective low-ops ML environment.
 
We Have Jobs!!
There is an official MLOps community jobs board. Post a job and get featured in this newsletter!

IRL Meetups
Toronto, ON - March 14, 2023
Madrid - March 28, 2023
Bristol - March 29, 2023
Switzerland - March 30, 2023

Thanks for reading. This issue was written by Nwoke Tochukwu and edited by Demetrios Brinkmann and Jessica Rudd. See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.



