How does Pinterest power image search across billions of images with low latency?

If you value your time. If you enjoy being productive. DO NOT join the #production-code channel. I have gone waaaay too far down the rabbit hole.

Past Meetup
Load Testing
Locus Swarm

We spoke with Emmanuel last week about his new book! He had all kinds of insights around ONNX, using ML to cross check your ML, and what you need to build robust CI/CD pipelines for ML.

I love how he explained the trend of shifting from building just a model to building a pipeline. Another point about robust ML pipelines is to have a proper quality assurance process. How? By having tests as part of the CI/CD pipeline. I feel like I am having deja vu!

Emmanuel is the second person to come on our meetup in the past month to talk about the importance of tests in MLOps. Ok universe, I'm listening.
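To make "tests as part of the CI/CD pipeline" a bit more concrete, here is a minimal sketch of a model quality gate that a CI job could run. It is not taken from Emmanuel's book. The names `accuracy`, `quality_gate`, `toy_model`, and the tiny holdout set are all made up for illustration, and the 0.9 threshold is an arbitrary assumption.

```python
from typing import Callable, Sequence


def accuracy(predict: Callable[[float], int],
             inputs: Sequence[float],
             labels: Sequence[int]) -> float:
    """Fraction of holdout examples the model labels correctly."""
    correct = sum(1 for x, y in zip(inputs, labels) if predict(x) == y)
    return correct / len(labels)


def quality_gate(predict: Callable[[float], int],
                 inputs: Sequence[float],
                 labels: Sequence[int],
                 min_accuracy: float = 0.9) -> bool:
    """Return True only if the model clears the accuracy floor."""
    return accuracy(predict, inputs, labels) >= min_accuracy


# Hypothetical stand-in for a trained model: flags positive numbers.
def toy_model(x: float) -> int:
    return 1 if x > 0 else 0


# Hypothetical holdout data; a real pipeline would load a versioned dataset.
HOLDOUT_INPUTS = [-2.0, -0.5, 0.3, 1.7, 4.0]
HOLDOUT_LABELS = [0, 0, 1, 1, 1]


def test_model_meets_accuracy_floor() -> None:
    # A test runner such as pytest would collect this function;
    # a failed assertion fails the build before the model ships.
    assert quality_gate(toy_model, HOLDOUT_INPUTS, HOLDOUT_LABELS)
```

The idea is simply that the model is treated like any other build artifact: if it falls below the agreed bar, the pipeline stops.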

Here is a link to the video and the podcast.
Blog
MLOps vs DevOps Part II
Complexity

I am back with part II of the MLOps vs DevOps theme. We've had so many amazing conversations on this topic that I couldn't just end it where I did. I am now contemplating a part 3 because going back and watching these videos with people like Ryan Dawson or Damian Bradly has been super useful for me. Their words take on new meaning now that my understanding has grown.

Here is a quote from Ryan on the nature of the beast. DevOps vs MLOps.

‘MLOps can be tricky to explain if colleagues and managers are used to traditional software engineering and DevOps. Being able to explain it to a complete stranger would be ideal, but we need to at least make it clear to the other IT professionals and internal stakeholders we work with. We have to answer questions like ‘isn't that just DevOps?’ clearly, otherwise the challenges of MLOps will continue to be underestimated.’
System Design Review
How Pinterest Powers Image Similarity
Something New

How does Pinterest power image search across billions of images with low latency? How does a top engineer at one of Silicon Valley's biggest companies architect cutting edge ML systems?

David and Vishnu have started a new series called System Design Reviews. Party time! This is a super technical deep dive into some of the best papers, with live commentary from the authors.

Shaji Chennan Kunnumel, a software engineer at Pinterest, sat down with them to talk about the blog post Detecting Image Similarity in (Near) Real-time Using Apache Flink. In the episode, Shaji walks through the system design for Pinterest’s near real-time architecture for detecting similar images.

The guys discuss Pinterest's usage of Kafka, Flink, RocksDB and much more. Starting with the high level requirements for the system, they go into Pinterest’s focus on debuggability. Next, they examine an easy transition from their batch processing system to stream processing.

Shaji describes the different system interfaces and components involved, such as Manas (Pinterest’s custom search engine), and how it all ends up in their custom graph database, downstream Kafka streams, and Pinterest’s feature store, Galaxy.

With Shaji’s expert knowledge of the system, we were able to do a deep dive into the system’s architecture and some of its main components.

This is a master class for anyone interested in engineering production machine learning systems and understanding how system design works at one of the premier engineering organizations in the world!
Current Meetup
Security
DevSecMLOps or MLSecDevOps or MLDevSecOps?

One of the areas that has been most transformed by ML in past years is cybersecurity. Traditionally, SIEM (Security Information and Event Management) is performed by human analysts. However, as the cyber powers and tools of the world grow, we need more and more of these specialists.

The entire area of cybersecurity is experiencing a shortage of talent. This is where ML comes in to help us. Cybersecurity ML systems require expertise from specialists as well as unique ways of handling user-sensitive data. These requirements call for particular architectural solutions.

In this talk, Monika will introduce us to the ways of using ML in cybersecurity and the unique challenges we face.

See you on Wednesday aka tomorrow at 5pm UK / 9am California. Click the button below to jump into the event, or subscribe to our public Google Calendar.

Best of Slack

  • MLOps Team Structure: Yet another fantastic thread in the #production-code channel (seeing a theme here?). This time, members discussed how to think through the "topology", structure, leadership, and psychological safety of DS/ML teams.
  • Dud, a smaller DVC: Shout out to community member Kevin Hanselman for releasing a very cool open source project, Dud. It aims to be what Flask is to Django; a simpler, more focused version of a larger framework. Check it out!
  • White Paper on MLOps: H/t to community member Satish Gupta for sharing this great whitepaper from the folks at Google Cloud on the various components of MLOps! Very thorough and practical.
  • Property Based Testing of ML Models: Interesting article from community member Daniel Angelov about how to apply property based testing, a common practice in software engineering, to the exploding domain of data system and ML model testing.
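
For anyone new to property based testing, here is a rough sketch of the idea applied to a preprocessing step. It is not code from Daniel's article. The `min_max_scale` function, the hand-rolled `check_property` helper, and both properties are assumptions made up for this example, and a dedicated library such as Hypothesis would generate and shrink inputs far better than this random loop does.

```python
import random
from typing import Callable, List, Optional


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def check_property(prop: Callable[[List[float]], bool],
                   trials: int = 200,
                   seed: int = 0) -> Optional[List[float]]:
    """Try random inputs; return a counterexample, or None if all pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randint(1, 20)
        values = [rng.uniform(-1e6, 1e6) for _ in range(size)]
        if not prop(values):
            return values
    return None


def output_in_unit_range(values: List[float]) -> bool:
    # Property 1: every scaled value lies in [0, 1].
    return all(0.0 <= s <= 1.0 for s in min_max_scale(values))


def order_preserved(values: List[float]) -> bool:
    # Property 2: scaling never reorders the inputs.
    pairs = sorted(zip(values, min_max_scale(values)))
    return all(a[1] <= b[1] for a, b in zip(pairs, pairs[1:]))


if __name__ == "__main__":
    for prop in (output_in_unit_range, order_preserved):
        counterexample = check_property(prop)
        status = "ok" if counterexample is None else f"failed on {counterexample}"
        print(f"{prop.__name__}: {status}")
```

Instead of asserting on a handful of hand-picked examples, you state a rule that should hold for any input and let the machine hunt for a case that breaks it.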
Jobs

See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


