We turned one year old 🎊🍾🎉

It looks like you have decided the next Engineering Labs topic, and the winner is... Feature Stores! We are looking forward to 8 teams getting some hands-on experience creating a feature store. We will hold the intro meeting next week to explain all the details of the lab. If you want to follow from the sidelines, jump into the #engineering-labs channel.

Reading Group
Cliff Notes
“A word after a word after a word is power” - Margaret Atwood

Another incredible turnout at the reading group last Friday. This is quickly becoming one of the most interesting initiatives happening in the community. The engagement and knowledge sharing are really cool to see.

Lucky for us, two amazing community members, Ishan and Daniel, left their notes in Slack. As a reminder, we were reading this paper from booking.com.

The author highlights the disconnect between domain-informed folks and data practitioners. I would like to think that having domain-informed people who are also adept in CS/statistics would be of incredible value.
  • Could having individuals who have both the domain knowledge and CS/statistical knowledge allay some of the issues highlighted by the author?
  • Let's say for the sake of argument we can have hybrid individuals like that; which would be the better approach: domain-informed people learning statistics and computer science, or CS/data-informed people learning domain knowledge?
  • Another thought that also piqued my interest is the incentive structure proposed by the author. I think that is a good idea.
- Ishan

As for Daniel's notes, here we go:
  • had a great note on modeling 2nd order effects based on 1st order inputs -- love that idea
  • The idea of using an ML classifier to probe at differences between training data and online data is something I got really excited about -- I see a ton of opportunities to integrate this idea into our monitoring stack
  • One point made is that as ML becomes more of an engineering discipline, something we can lose is the statistical expertise/rigor that comes from fields like econometrics or other more traditional data science backgrounds. It definitely motivated me to get a little more serious about proper statistics and experimentation design related to ML deployment. I also wonder about the role of technology/infrastructure in enabling stats noobs like me to be successful here
  • We talked about problem formulation, and I mentioned how keeping things modular has helped me... led to a great "after-hours" chat around pros/cons of multi-task networks and if the juice is worth the squeeze
  • The idea of building models to make predictions about user/visitor attributes... and then these predictions being consumed by other resources (models, analysts, etc.) and the leverage they're able to produce... super interesting to think about how data science / ML teams can best enable the business
  • talked about how one of the levers we have is to choose how / when we respond to predictions. He gave the example of adding some randomness in time, rather than acting immediately... can reduce the "creepy factor" for the uncanny valley hill
  • I take something new away from this paper every time I go back and read it... after the meeting today I already feel like I missed a ton in my last read through, and will revisit over the next week.
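
For anyone curious what Daniel's second point (a classifier that probes for differences between training data and online data) could look like in practice, here is a minimal sketch. It is not taken from the paper or from Daniel's monitoring stack: the function names (train_logistic, auc, drift_auc), the tiny pure-Python logistic regression, and the 50/50 fit/held-out split are all assumptions made for illustration. The idea is to label training rows 0 and online rows 1, fit a classifier to tell them apart, and look at its AUC on held-out rows. An AUC near 0.5 suggests the two samples look alike; an AUC well above 0.5 suggests the online data has drifted.

```python
import math
import random


def train_logistic(rows, labels, epochs=100, lr=0.5):
    """Fit a plain logistic regression with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def drift_auc(train_rows, online_rows, seed=0):
    """Hypothetical helper: held-out AUC of a 'training vs. online' classifier."""
    rows = list(train_rows) + list(online_rows)
    labels = [0] * len(train_rows) + [1] * len(online_rows)
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    half = len(order) // 2
    fit_idx, test_idx = order[:half], order[half:]
    # Assumes both halves end up containing rows from both sources.
    weights, bias = train_logistic(
        [rows[i] for i in fit_idx], [labels[i] for i in fit_idx]
    )
    scores = [
        bias + sum(w * xi for w, xi in zip(weights, rows[i])) for i in test_idx
    ]
    return auc(scores, [labels[i] for i in test_idx])


if __name__ == "__main__":
    rng = random.Random(42)
    training = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(400)]
    online = [[rng.gauss(2, 1), rng.gauss(0, 1)] for _ in range(400)]  # first feature shifted
    print("drift AUC:", round(drift_auc(training, online), 3))
```

In a real monitoring stack you would likely swap in a proper library model and pick the alert threshold by experiment; both choices are left open here.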

+ As a reminder, if you would like to follow along, jump in the #reading-group channel and/or sign up below.
Past Meetup
Life Experience
In-Depth Monitoring

At the last meetup, none other than the author of Machine Learning in Action showed some serious expertise. We started by hearing about the human element of ML. Ben echoed the view of many of our past guests, saying that MLOps is absolutely an organizational problem, not a tooling one.

Ben later spoke about the fragility of ML projects and what metrics to use while monitoring. He got a bit more granular and spoke about how to weigh metrics when you are dealing with hundreds of them. My inevitable next question any time we talk about monitoring is: how can we stay out of alert hell?

The meetup finished with my personal favorite: War Stories! If you want to hear about how to lose hundreds of millions of dollars, check that part out.

As always, excellent questions coming from you all. One I am still laughing about: "I have worked with data scientists who feel like their job is producing the model. There is little to no regard for the “in production” aspect of the model. Have you any thoughts on how we can bridge the gap between data science and the wider engineering team so models are produced with production in mind? Should I just add them to the on-call rota when the project is released?"

Short answer = Yes, add them to the on-call rotation.

+ Ben hooked us up with a 35% discount on his book. Get your copy here and use the code: podmlops21.

Check out the video here and podcast here for the full story.
Coffee Session
MLOps Investments
Data Scientist Turned VC Shares Her Thoughts

Many of you have discussed the role of VC in the MLOps ecosystem on Slack. Recently, we chatted with Sarah Catanzaro, an investor at Amplify Partners, who gave us her take on trends in MLOps.

Catanzaro gave us insight into how a former head of data turns her experience into investments. It's truly a small world--Sarah invested in the company run by last week's meetup guest Josh, Flywheel ML!


We had a wide-ranging discussion with Sarah, and three takeaways stood out:

  1. The relationship between unstructured data and structured data is due for change. In most settings, you have some form of structured data (e.g. a metadata table) and unstructured data (e.g. images, text, etc.). Managing the relationship between these forms of data can constitute the bulk of MLOps. Because of this difficulty, Sarah forecasted that new tooling will arise to make data management easier.
  2. Academic benchmarks suffer from a lack of transparency about production/industry use cases. Sarah shared a lesson from her conversation with Andrew Ng: for all the blame industry professionals place on academics for narrowly optimizing to benchmarks with little practical meaning, industry shares the blame for making it difficult to create meaningful benchmarks. Companies are loath to share realistic data and the true context in which ML has to operate.
  3. MLOps is due for consolidation, especially as companies adopt platform-driven strategies. As many of you know, there are tons and tons of MLOps tools out there. As more companies address these challenges, Sarah predicted that many of the point solutions would start to be consolidated into larger platforms.

Check out the video and podcast.

Till next time,
Vishnu
Current Meetup
1 Year Later
It's A Party 🎉🥳🎊

It's been a year of doing these meetups and coffee sessions. Because of this exercise, we have been able to talk with people way out of our league. Talking to all these different experts has given us so many insights that have helped us mature our understanding of what MLOps is.

For this meetup, David, Vishnu and I are going to be talking about some of our highlights from all these conversations over the last year. You know what they say though, talk is cheap, so I'll also be asking Vishnu and David what specific actions they have taken after getting advice from the experts.

Story time: We want this to be a bit of a sharing session, so if you have learned something and taken action because of the past year of meetups, please come and tell us. We will open up the floor to anyone who wants to participate.


+ As always, see you tomorrow, Wednesday, at 5pm GMT / 9am PST; just click the link below. I'll bring enough cake for all of us.
Conference
apply()
Mark Your Calendars

You may have heard the community is a partner in the upcoming MLOps conference, apply(). There are some incredible speakers from Netflix, Google, DoorDash, Lemonade, Stitch Fix, Etsy, Pinterest, and Spotify. Not to mention, Ilnardo and I will be talking about the MLOps community and the engineering labs in a lightning talk! Oh yeah, and I'll be MCing the whole thing, so wish me luck on that one!

It's free, and it's gonna be fun. I will even play some guitar and lead a meditation during the breaks.
Best of Slack
Jobs
See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


