|
|
|
|
If you weren't able to make it, the conference videos are being posted here.
|
|
|
|
|
|
|
|
|
- Spark's Translational Gap // Matei Zaharia
Matei Zaharia, co-founder and Chief Technologist at Databricks, let us pick his brain on this one. He shared how Spark went from an idea to a full-fledged product used by a huge number of people.
Spark started out as a result of a keen interest in the kind of datacenter-scale computing that was happening mostly at web companies like Google, Yahoo, Microsoft, and so on. At the time, those companies were mostly indexing the whole web and then building things on top of it. Around the same time, open-source projects like Hadoop had started up to do MapReduce-style processing.
Since collecting data isn't expensive and the cost of storage is pretty low, companies and scientific labs were interested in working with large-scale data at an efficient and optimal trade-off. However, they were limited in the kinds of applications they could run: while these systems were good for building a web index, they weren't good for running more interesting algorithms like machine learning.
Spark was developed to address the use cases that people couldn't handle well with Hadoop. Its engine was primarily built for workloads like machine learning algorithms, and over time it expanded to other things, like large-scale on-disk batch processing.
Another reason Spark became so closely tied to machine learning is that scientists and researchers at Berkeley used it for large-scale machine learning.
|
|
|
|
|
|
|
|
|
Rahul Parundekar, the founder of AI Hero, talked about streamlining model serving on Kubernetes by leveraging the concept of declarative MLOps.
The declarative paradigm means defining what needs to be accomplished without defining how it gets done. This is the way Kubernetes works: Kubernetes orchestrates workloads such as jobs, servers, databases, etc.
When developing an ML solution with Kubernetes, a target layout is defined in a YAML file, which is then applied to the cluster through the Kubernetes API. Kubernetes automatically takes care of all the orchestration and serving of the models, so end users get access to the backends, model servers, and any other services.
It starts by storing the layout in etcd; the control plane then schedules servers inside pods on virtual machines (nodes) via the controller manager, or the cloud provider's controller manager. Each node runs a container runtime as well as a networking interface.
In a nutshell, "you define what you want and the system takes care of it". This makes it a useful paradigm for MLOps.
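As a rough illustration of the "define what you want" idea, a declarative spec for serving a model might look like the Deployment below. The image name, labels, and port are hypothetical placeholders, not from the talk:

```yaml
# Hypothetical Deployment: declares WHAT we want (2 replicas of a model
# server), not HOW to place or restart them. Kubernetes continuously
# reconciles the cluster toward this desired state.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      containers:
        - name: server
          image: registry.example.com/my-model:1.0.0  # placeholder image
          ports:
            - containerPort: 8080
```

Applying this with `kubectl apply -f model-server.yaml` stores the desired state in etcd; the control plane then schedules the pods and keeps them running.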
YouTube
|
|
|
|
|
|
|
|
|
- Traceability & Reproducibility
This blog was written by Vechtomova Maria
In the context of MLOps, traceability is the ability to trace the history of the data, the code used for training and prediction, the model artifacts, and the environment used in development and deployment. Reproducibility is the ability to reproduce the same results by tracking the history of data and code versions.
Machine learning models running in production can fail in different ways: they can return wrong predictions or produce biased results. Those unexpected behaviors are often difficult to detect, especially when the pipeline appears to be running successfully.
This blog explains how traceability allows us to identify the root cause of a problem and take quick action, making it easier to find the code versions responsible for training and prediction, as well as the data used.
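The core idea can be sketched in a few lines of Python: record a content hash of the training data alongside the code version, so any model can be traced back to exactly what produced it. The function names and JSON layout here are illustrative, not from the blog:

```python
import hashlib
import json
from pathlib import Path


def fingerprint(path: str) -> str:
    """Content hash of a data file, so the exact dataset can be traced later."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def log_lineage(model_name: str, data_path: str, code_version: str) -> dict:
    """Write a lineage record with everything needed to reproduce a run."""
    record = {
        "model": model_name,
        "data_sha256": fingerprint(data_path),
        "code_version": code_version,  # e.g. a git commit hash
    }
    Path(f"{model_name}_lineage.json").write_text(json.dumps(record, indent=2))
    return record
```

Real setups typically delegate this to an experiment tracker, but the principle is the same: every artifact carries pointers to its data and code versions.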
|
|
|
|
|
|
|
|
|
This blog was written by Médéric Hurier (Fmind)
If you work on MLOps, you must navigate an ever-growing landscape of tools and solutions. This is both an intense source of stimulation
and fatigue for MLOps practitioners.
Vendors and users face the same problem: How can we combine all these tools without the combinatorial complexity of creating custom integrations?
In this article, he proposes a solution analogous to POSIX to address this challenge. First, he motivates the creation of common protocols and schemas for combining MLOps tools. Second, he presents a high-level architecture to support the implementation. Third, he concludes with the benefits and limitations of standardizing MLOps.
|
|
|
|
|
|
|
|
|
|
|
- The Buyer’s Guide to Evaluating ML Feature Stores & Feature Platforms
If you’re looking to adopt a feature store or platform, but don’t know where or how to start your research, then this guide is for you.
Download this free guide to:
- Access a comprehensive framework for understanding the capabilities of different feature stores and feature platforms
- Get examples and tips on how to use a data-driven approach to evaluate vendors so you can find the right solution for your
organization’s needs
- Learn how the right solution can improve ML model accuracy and unlock new real-time ML use cases using streaming or real-time data.
|
|
|
|
|
|
|
From now on we will highlight one awesome job per week! Please reach out if you want your job featured.
- MLOps Engineer at VMO2: If you're looking for a team that puts data at the heart of what they do to deliver a best-in-class digital experience for customers, with machine learning (ML) foundational to this mission, this is the role for you.
|
|
|
|
|
|
|
|
|
Thanks for reading. This issue was written by Nwoke Tochukwu and edited by Demetrios Brinkmann and Jessica Rudd. See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.
|
|
|
|
|