Thank you all for making this community great!

There is a first time for everything. After starting this community online during a pandemic, when everyone was confined to their houses, we had our first official gathering last week in San Francisco. (OK, OK, I know Berlin had gatherings already, but it doesn't count because Alexey Grigorev organized it...)

It was amazing to actually hang out with many of you I have only been seeing through a screen for the past year and a half. Finally, we got to get together in the flesh! Huge thanks to Simba and the Feature Form team for picking up the tab. I'm looking forward to doing more of these around the globe. Check out the pictures, and if you have more, share them with us!

Want to organize an MLOps happy hour in your city? Reach out! No talks. No venue headache. Just drinks and conversations. (I am also open to other ideas)

Meetup
Data Centric AI

This past week, we had Alberto Rizzoli from V7 Labs. In this conversation we talked about the Training Data Value Chain, what is missing from it, and what V7 Labs is creating: a Dataset Management System with easy MLOps integration.

We talked about how V7 has tackled this problem, what the MLOps community needs, and how to standardize our work to enable further collaboration.


It was interesting to hear Alberto's thoughts on why they chose Elixir for the platform, the MLOps behind the scenes, and the importance of sometimes taking shortcuts with third-party tools, because you can't do it all. Decide what you want to be #1 at and focus on that.

P.S. Stop worrying about creating training data, and let's make data sexy again.
Coffee Sessions
Pyt🔥rch
Talking PyTorch is always interesting, as the Facebook ML OSS project is one of the most important parts of the machine learning tooling ecosystem. This week, we talked to Dmytro Dzhulgakov, a tech lead for PyTorch, and boy, was it an interesting conversation!

We started off talking about Dmytro's journey to becoming an engineer and tech lead at Facebook, and what his role entails. Dmytro has been at Facebook for 10+ years, so he gave some very interesting advice on how to manage a career in software engineering for machine learning. After that, we got deep into the present and future of PyTorch and the improvements the project is making to support MLOps workflows. PyTorch is a large project, and Dmytro shared the valuable lessons he learned from confronting multifaceted scaling challenges while working on it. Finally, we talked about the future of machine learning engineering, especially how it compares to the way software engineers work.

Dmytro is a world expert on how to build tools that bridge the gap between ML research and production, and it's well worth listening to this podcast to learn from him!

Till next time,
Vishnu
Guest Wisdom
Production Oriented
The coffee session we had with Niall was fire!

He talked about his years of experience as an SRE at Google and Microsoft. If you don't know Niall, he literally wrote the book on being an SRE and was at Google when the discipline was forming.

In this session, we talked with Niall about how machine learning organizations need to start to take advantage of SRE best practices like SLOs. Production machine learning depends on high-quality software engineering, and we get Niall's take on how to ensure that in a machine learning context.

Sponsored
Introducing Array Type Features
Array Features in Operational ML

Arrays are a feature data type that can be used across a number of applications. Consider a retailer that serves product recommendations to users based on their current search query and purchase history. Our retailer might build the following kinds of features:

Lists of categorical variables.
  • product_categories: a list of categories a product belongs to, e.g. [shoes, women, outdoors] for a pair of women’s hiking boots.
  • user_last_10_purchased_products: a list of the last ten product IDs purchased by a user. Using our streaming capabilities, Tecton can keep this feature extremely fresh.
Dense embeddings.
  • product_embedding: a precomputed embedding based on each product’s description and metadata.
  • search_text_embedding: a query-time embedding computed from the user’s search text, e.g. "5-piece knife set". This embedding can be provided to the Tecton API to be combined with precomputed features.

In this article, we’ll go through (1) how arrays are commonly used in operational ML systems and (2) an example of how a user can compute a similarity score between a product and a query using embeddings in real time with Tecton.
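To make the similarity-score idea concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the scoring function, which the text above does not specify, and it uses made-up four-dimensional vectors. The names product_embedding and search_text_embedding are borrowed from the feature list above; nothing here calls the Tecton API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: an all-zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical values: a real product embedding would be precomputed by a
# model, and the search embedding would be computed at query time.
product_embedding = [0.1, 0.8, 0.3, 0.0]
search_text_embedding = [0.2, 0.7, 0.4, 0.1]

score = cosine_similarity(product_embedding, search_text_embedding)
print(f"similarity score: {score:.3f}")
```

A score close to 1 means the product and the query point in nearly the same direction in embedding space, so it can be used directly as a ranking feature.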
Current Meetup
Great Expectations
The CTO of Superconductive, the company behind the OSS data observability tool Great Expectations, will be joining us to talk about Durable Data Discovery. The whole idea here is how to make exploratory analysis stick.

Building an effective ML pipeline requires understanding the data available to you and how it's changing. Exploring a new dataset is often an iterative, interactive process that gives the engineer doing it tremendous insight into the underlying data generating processes and the pipelines that have touched it. Yet too often, those insights are lost when a system goes into production or after internal handoff between teams.

We'll talk about how to capture Exploratory Data Analysis done when first working with a dataset. With a clear understanding of what data characteristics were important in crafting a dataset, it becomes possible to collaborate on and share clear expectations about the true differentiator in ML pipelines -- the data that fuels them.
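As a rough illustration of what "capturing" exploratory findings could look like, here is a minimal sketch in plain Python. It does not use the Great Expectations API; the Expectation class, the validate function, the column names, and the sample batches are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Expectation:
    """One finding from exploration, written down as a reusable check."""
    column: str
    description: str
    check: Callable[[Sequence], bool]


def validate(batch: Dict[str, Sequence], expectations: List[Expectation]) -> List[str]:
    """Return a description of every expectation the batch fails."""
    failures = []
    for exp in expectations:
        values = batch.get(exp.column)
        # A missing column counts as a failure of every check on it.
        if values is None or not exp.check(values):
            failures.append(f"{exp.column}: {exp.description}")
    return failures


# Hypothetical findings from a first look at a dataset.
expectations = [
    Expectation("price", "values are non-negative", lambda v: all(x >= 0 for x in v)),
    Expectation("user_id", "no missing values", lambda v: all(x is not None for x in v)),
    Expectation("country", "only known codes", lambda v: set(v) <= {"US", "DE", "GB"}),
]

good_batch = {"price": [9.99, 0.0], "user_id": [1, 2], "country": ["US", "DE"]}
bad_batch = {"price": [9.99, -1.0], "user_id": [1, None], "country": ["US", "FR"]}

good_failures = validate(good_batch, expectations)
bad_failures = validate(bad_batch, expectations)
print(good_failures)
print(bad_failures)
```

Because the checks live in code rather than in someone's notebook or memory, they can be rerun on every new batch and handed to another team along with the dataset.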

Sub to our public calendar or click the button below to jump into the meetup on Wednesday at 10am PST/5pm BST
Reading Group
Object Design Style
We will have a panel discussion for this week's reading group with two guests who are valuable members of the MLOps Community. We will be discussing the impact of OOP and Design Patterns in ML projects and how they can improve our delivery as ML practitioners. Link: https://lu.ma/psof09tp

Laszlo Sragner is the founder of Hypergolic, an ML consulting company that helps startups and enterprises get the most out of their data and ML operations.

Previously, Laszlo worked as a Senior Data Scientist at King and built the Data Science team from the ground up at Arkera as the Head of Data Science.

Tim Blazina is a Data Scientist / ML Engineer working at Migros-Genossenschafts-Bund, formerly at Felfel and Ascarix AG. Tim has a PhD from ETH Zurich, one of the most prestigious universities in the world, and has worn many hats, from ML model development to putting models into production.

Relevant sources for the reading group: 1, 2, 3 and 4.
Best of Slack
Jobs
See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


