
I'll be hosting the apply() conf this Wednesday and Thursday. I made a little video about it to set the tone early. You can already win some free swag by registering and showing your best '80s attire here.

Coffee Session
Video is Hard
For this past Coffee Session, we had Brannon Dorsey, who has gone from art student to working in cybersecurity to leading the backend team at Runway.

What's Runway? Runway is a company whose mission is to make content creation accessible to all by leveraging machine learning and advances in computer graphics.

Browser-based model - As we can imagine, working with video is hard! By abstracting away the hardware, Brannon and team allow creators to edit in their own browser. This means the product is easy to use, has a low barrier to entry, and is optimized for collaboration. As Brannon said: “we want you to be able to use our product with a Chromebook that's 5 years old”. This was no small engineering feat. Check out this episode to hear all about how they managed it.

You can also check out Brannon's blog post: Distributing Work: Adventures in Queuing

Past Meetup
DataOps Fundamentals
Last week we spoke with Micha from Maersk, who taught us about DataOps and how to think about the space as a whole.

DataOps is software engineering
Throughout his talk, Micha illuminated the parallels between DataOps as a practice and Software Engineering, and how treating the former as the latter will accelerate your development and lead to more trustworthy data:

Test your data pipelines like you test your code - your pipeline tests should run quickly, easily, and often

Local > dev - your local environment is your first line of defense. You should be able to run your code, pipelines, and tests there before pushing to dev (or prod!)

To unit test, or not to unit test - When should unit tests be used? When should they not? When fundamental components of pipeline code are reused frequently and changed infrequently, they should be unit tested. But for small components of a larger pipeline that change constantly due to new data and requirements, you may be better off testing the pipeline as a whole, rather than writing and maintaining unit tests for each component.
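
As a rough illustration of that trade-off, here is a minimal sketch in plain Python. The names (`normalize_country`, `run_pipeline`) and the sample records are hypothetical and not from Micha's talk. The idea is only that the stable, reused helper gets its own unit test, while the fast-changing steps are covered by a single test of the pipeline as a whole.

```python
def normalize_country(code):
    """Stable, widely reused helper (hypothetical): a good unit-test candidate."""
    return code.strip().upper()


def run_pipeline(records):
    """Hypothetical pipeline whose small inner steps change often."""
    # Drop records with a missing amount.
    cleaned = [r for r in records if r.get("amount") is not None]
    # Sum amounts per normalized country code.
    totals = {}
    for r in cleaned:
        country = normalize_country(r["country"])
        totals[country] = totals.get(country, 0) + r["amount"]
    return totals


def test_normalize_country():
    # Unit test: pins down the reused component in isolation.
    assert normalize_country(" dk ") == "DK"


def test_pipeline_end_to_end():
    # Pipeline test: checks only the final output, so the inner steps
    # can be rewritten without touching this test.
    records = [
        {"country": "dk", "amount": 10},
        {"country": "DK ", "amount": 5},
        {"country": "us", "amount": None},
    ]
    assert run_pipeline(records) == {"DK": 15}


if __name__ == "__main__":
    test_normalize_country()
    test_pipeline_end_to_end()
    print("all tests passed")
```

Both tests run locally in well under a second, which also fits the "Local > dev" point above.
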
New Tool Tuesday
Production Ready?
I found a new toy the other day. This time it was modelkit, a Python framework meant to make ML models reusable, robust, performant, and easy to deploy in all kinds of environments.

So, I reached out to Cyril Le Mat, the creator of Modelkit. I wanted to learn more about the story behind why he built it and what the vision for the project is. If you want to skip ahead and check out a tutorial for the tool, click here.

The story - Cyril and his data team of 10 felt they were mature and experienced enough to handle the difficult problems they ran into while putting models into prod. They had some standard ML libraries that they leaned on, but the majority of their logic was custom Python code or Python-wrapped.

Guiding principles - The team had a few key philosophies they used while building.
  • Production inference should be separate from the training process
  • Model logic should be separate from ETLs
  • Little notebook usage, especially not in prod
  • All config and model development processes are in Git
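
To make the first principle a bit more concrete, here is a minimal sketch in plain Python. It does not use Modelkit's API; the function names, the JSON artifact format, and the file path are all assumptions made for illustration. Training writes an artifact, and the inference code only ever reads that artifact, so it needs none of the training code or data.

```python
import json
import tempfile
from pathlib import Path


def train(samples, artifact_path):
    """Training side (hypothetical): fit a trivial threshold and save it."""
    threshold = sum(samples) / len(samples)
    artifact = {"version": 1, "threshold": threshold}
    Path(artifact_path).write_text(json.dumps(artifact))


class Predictor:
    """Inference side (hypothetical): depends only on the saved artifact."""

    def __init__(self, artifact_path):
        artifact = json.loads(Path(artifact_path).read_text())
        self.threshold = artifact["threshold"]

    def predict(self, value):
        # True when the value is above the threshold learned at training time.
        return value > self.threshold


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "model.json"
        train([1.0, 2.0, 3.0], path)
        print(Predictor(path).predict(2.5))
```

Because the artifact is a plain file, it can be versioned and tested on its own, in the spirit of the last principle in the list.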

Some things I enjoyed about Modelkit were that it's a simple pip install, it versions and tests all your artifacts, and it is framework agnostic. Have a gander and let me know what you think.
Sponsored
Build Your Online Feature Store With Redis
When it comes to real-time machine learning, it’s important to ensure your data is fresh and fast. Milliseconds matter and can make the difference in delivering fast online predictions, whether it’s personalized recommendations, detecting fraud, or figuring out the optimal food delivery route.

Redis is trusted by developers and ML engineers in many organizations, and more are rediscovering Redis for its potential in harnessing real-time data for online feature serving. With support on-premises, in the cloud, and on the edge, Redis Enterprise enables enterprise-grade deployment for globally distributed ML feature stores with five nines of availability, linear scaling, active-active replication, and support for Redis on Flash.

Ensure your ML data is Live. Fresh. Fast. Build your online feature store on Redis for real-time serving, continuous re-training, and augmented vector predictions at ludicrously low latency.
We Have Jobs!!
There is an official MLOps community jobs board now. Post a job and get featured in this newsletter!

Best of Slack
Best of Slack is its own newsletter now. Sign up for it here.
See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


