🕵️‍♀️ ... and make sure they are working

I'll be an MC at 3 of the sessions this week at the TMLS conference, so if you are around, feel free to drop me a line! On that note, you can still get tickets to the conference here and see the full program here. We are currently giving away 3 free tickets in Slack, so jump in the thread and get involved!

Tuesday's Theme
All You Need To Know About Monitoring
Background: I realize that I have the benefit of hearing about many articles that tackle certain topics. This week I wanted to shine some light on one of my favorite topics: monitoring! There are great articles being written by all the vendors in the space, so I thought I would do a round-up here of the articles that have helped my understanding of monitoring, for those of you who got your models out the door and are wondering.... now what ¯\_(ツ)_/¯

Breakdown
I think this article by Two Sigma Ventures is a great place to start when it comes to monitoring. I also recommend this short video from when I talked with Shubhi Jain about how they did monitoring at SurveyMonkey, and this deep dive we had with Lina Weichbrodt on all the different bugs you can encounter in your models and the ways to monitor for them.

The Articles

These are obviously just a taste of everything that is out there, yet since I have been seeing so much chatter around it lately I thought some of you would enjoy having all of them in one place. It is also really nice to look at how each one of these companies is tackling the problem.

Honorable mention goes to this article, where Tecton.ai talks about how monitoring should be one of the main components of a feature store. And from the Tecton alma mater, here is Uber's take on data monitoring. Last but not least, I couldn't talk about ML monitoring without mentioning this article from our new best friend, Mr. ML in Production, Luigi Patruno.
Past Meetup
MLOps At The UN
The How: Mark gave us an extensive view of how he and his colleagues at the UN used Wardley Maps to come up with the MLOps platform. I find this way of analyzing and looking at a gigantic problem absolutely fascinating and feel like there is so much to learn from it.

If you are interested in Wardley Maps like I am, you can have a look at all their community resources here. For the full conversation between Mark and me, you can have a listen here or watch the video here.

More Good News: During the meetup, Mark spoke about creating MLOps training and certification, and it got me thinking that maybe we as a community could do something to help. If you are interested, please reach out and I can put you in touch with the right people.

Data Privacy
Round 2 of Privacy: Regulations
Episode 2 of our 8-part series on data privacy is a deep dive into regulations and how they affect ML. There is so much fuss around regulations these days: whether we have too many or not enough, and how much they differ around the globe. We aim to explore and demystify what these regulations are, what they say, and how they affect our machine learning initiatives.

Special Guest: Our lovely host Fabiana is joined in this episode by Cat Coode, a specialist in data privacy, hailing from BlackBerry, where she used to design application architecture with a security-first focus. She dove into data privacy and regulation shortly after, as GDPR and similar regulations became more important for organizations. Currently, Cat helps companies understand how to apply regulations and what they need to do to become compliant. In this episode, Cat explains how regulations are more closely related to technical solutions than we would think.

*This series is brought to you by YData. YData offers a dataset experimentation platform with synthetic data generation that makes the process of building datasets take a fraction of the time and cost that they used to.

Blog
F.Acc.T. Part 2
My Bad: As it turns out, I was a bit mistaken. The powers that be changed the name of the Fair, Accountable, Transparent acronym from FAT to FAccT, so as not to shame anyone. I apparently was doing loads of self-shaming last time we wrote about this topic, but it's all good now. I have been through sufficient therapy to recognize that I am enough!

Anyway, on to the new article: community member Nick Ball does a deep dive into what it means to create a transparent ML model. Some considerations for transparency include:

  • Do you need machine learning?
  • Do you need nonlinear ML, and in particular, do you need deep learning?
  • Is a model-agnostic explanation acceptable?
  • Do you need human-interpretable features?
  • What are the regulatory requirements?
  • Do you need reason codes?
  • Do you need to visualize your model?
  • Is the setup that you have thus come up with in fact OK?

Let us know what you think in the comments and please share with a friend to help expand the reach of our little blog.
Current Meetup
Our Current MLOps Landscape
As you know, I am keen to hear everyone's perspective on the current MLOps landscape, so for this week's meetup I convinced community members Timothy Chen and Nathan Benaich to come on and share their views on MLOps through the lens of investors who are pitched hundreds of deals a year.

Some themes we plan to discuss:
  • Greater trends in tooling landscape
  • Tools that have surprised them
  • Best in class or one e2e tool?
  • Where do they see the landscape in 5 years?

Tim's Bio: Tim is the Managing Partner at Essence VC, with a decade of experience leading engineering in enterprise infra and open source communities/companies. Prior to Essence, Tim was the SVP of Engineering at Cosmos, a popular open source blockchain SDK. Prior to Cosmos, Tim cofounded Hyperpilot with Stanford Professor Christos Kozyrakis, which later exited to Cloudera. Prior to Hyperpilot, Tim was an early employee at Mesosphere and Cloud Foundry. Tim is also active in the open source space as an Apache member.

Nathan's Bio: Nathan Benaich is the Founder and General Partner of Air Street Capital, a venture capital firm investing in early-stage AI-first technology and life science companies. The team’s investments include Mapillary (acq. Facebook), Graphcore, Thought Machine, Tractable, and LabGenius. Nathan is Managing Trustee of The RAAIS Foundation, a non-profit with a mission to advance education and open-source research in common good AI. This includes running the annual RAAIS summit and funding fellowships at OpenMined. Nathan is also co-author of the annual State of AI Report. He holds a Ph.D. in cancer biology from the University of Cambridge and a BA from Williams College.

Meetup Details: As always, we will be meeting on Wednesday (aka tomorrow) at 5pm GMT / 9am PST. Some of you have asked me about calendar invites for the recurring event. Follow these links to add it to Google Calendar, Outlook, or Yahoo Calendar.
Best of Slack
Recommended Long Read
Invisible Women by Caroline Criado Perez, which won the 2019 FT-McKinsey book prize, examines the long legacy of women being overlooked in data science and the very real (and sometimes dangerous) implications.
Keep on rocking in the free world 🤘 Check out our Slack, YouTube, and podcasts if you haven't already.


