New Tool Tuesday, Weekly round-up, and what's next in MLOps

We started making 60-second clips from the meetups, which you can check out here. Byte-sized wisdom for you to enjoy on the go. What's next for us? TikTok?

New Tool Tuesday
Kites and Boxes
Another ML Monitoring Solution?

There is something very special about today's "New Tool Tuesday". My old boss, and an MLOps community founder, Luke Marsden is a co-creator.

I spoke with Luke over the weekend about the tool and why they created it. Before we talked about the tool, though, the first thing he said to me was "I saw your LinkedIn post... yeah, sorry about that" 😆

Anyway, I asked him why the hell he would make a monitoring solution now, when the space is already so crowded.

"BasisAI have a product called Bedrock which is an MLOps platform. It has a monitoring component in it - to create boxkite we extracted the code from the proprietary platform and open sourced it."

Sounds a bit like cheating, go on.

"Under the hood, it's a simple Python library: you pass it your training data, training labels, production data and production labels, and it creates Prometheus histograms automatically from them. You ship the training-time histogram with the model and feed it back into the boxkite component that runs in production. Boxkite then knows how to compare the training-time histogram with what it's seeing in production, and exposes a Prometheus endpoint."
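To make the histogram idea concrete, here is a minimal sketch in plain Python. It is not boxkite's API; the function name `cumulative_buckets`, the bucket bounds, and the sample values are all made up for illustration. The one thing it borrows from Prometheus is the bucket convention: each bucket counts every value less than or equal to its upper bound, with a final `+Inf` bucket that counts everything.

```python
import bisect
import math


def cumulative_buckets(values, upper_bounds):
    """Count values into Prometheus-style cumulative buckets.

    Hypothetical helper for illustration only, not part of boxkite.
    Each bucket holds the number of values <= its upper bound.
    """
    bounds = sorted(upper_bounds) + [math.inf]
    per_bucket = [0] * len(bounds)
    for value in values:
        # Find the first bound that is >= value.
        per_bucket[bisect.bisect_left(bounds, value)] += 1

    # Turn per-bucket counts into running (cumulative) totals.
    cumulative = {}
    running = 0
    for bound, count in zip(bounds, per_bucket):
        running += count
        cumulative[bound] = running
    return cumulative


# Made-up values for a single training feature.
training_ages = [22, 35, 41, 58, 63, 29]
hist = cumulative_buckets(training_ages, [30, 50])
print(hist)
```

Running the same function over production values with the same bounds gives two histograms that can be compared bucket by bucket.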

Ok, nice. But I'm still not convinced. Sounds like I can just do this with Prometheus and Grafana. Why would I need boxkite?

"One of the hardest parts was figuring out how to show the difference between the distributions in Grafana. Normally, to compare two distributions you use a technique called Kullback–Leibler divergence, or KL divergence for short. There are loads of implementations of KL divergence in Python. One of the really clever things the BasisAI team did was to port KL divergence to Grafana.

So, there's a really hairy PromQL expression hiding in the Grafana dashboard which shows you the KL divergence purely in Prometheus & Grafana.

None of the ML monitoring tools we've seen work natively with Prometheus & Grafana. We don't believe you should be using one stack to monitor your ML and another stack to monitor your software. There should be unification between the MLOps teams & DevOps teams - and using the same tools is a key part of breaking down that wall."
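For anyone who hasn't met KL divergence before, here is a rough Python sketch of the calculation on two histograms. It is not the PromQL expression from the boxkite dashboard; the function names, the smoothing constant, and the example counts are assumptions for illustration. It expects per-bucket counts (not cumulative ones) over the same buckets, and the choice of training as the first argument is just one convention.

```python
import math


def to_probabilities(bucket_counts, smoothing=1e-9):
    """Normalize per-bucket counts into probabilities.

    The small smoothing term (an assumption of this sketch) keeps
    empty buckets from producing a division by zero or log(0).
    """
    smoothed = [count + smoothing for count in bucket_counts]
    total = sum(smoothed)
    return [count / total for count in smoothed]


def kl_divergence(p_counts, q_counts):
    """KL(P || Q) = sum over buckets of p * ln(p / q)."""
    if len(p_counts) != len(q_counts):
        raise ValueError("both histograms must use the same buckets")
    p = to_probabilities(p_counts)
    q = to_probabilities(q_counts)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Made-up per-bucket counts for one feature.
training_counts = [2, 2]
production_counts = [1, 3]

drift = kl_divergence(training_counts, production_counts)
no_drift = kl_divergence(training_counts, training_counts)
print(f"drift={drift:.4f} no_drift={no_drift:.4f}")
```

Identical histograms score zero, and the score grows as production drifts away from training, which is what makes it a handy single number to put on a dashboard.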

Interesting point. MLOps and DevOps working together in unison on a single stack.

"We've also seen a lot of monitoring tools that are pretty heavyweight: Seldon's Alibi, for example, uses an event bus to ship all of the production inferences to a central server, which then runs statistical techniques like KL divergence in Python.

Instead, boxkite is really lightweight - it's just instrumenting the training and exporting a Prometheus-compatible histogram. That's what's shipped with the model. It then exposes a simple Prometheus endpoint on your service, which gets scraped just like the rest of your microservices.

Props to the BasisAI team for making something actually quite difficult seem pretty simple."

Luke set up a demo to check it out and kick the tyres if you feel so inclined.

Past Meetup
Greater Than Its Parts
Expectation Management

Welp, I was officially blown away by the simplicity of explanations Oguzhan Gencoglu gave in this last meetup. Unassuming and straightforward, I thoroughly enjoyed the chat.

Nuggets of Wisdom - I took quite a few notes from our chat. Ouz had a quotable moment at the beginning of our conversation, and he didn't turn off the quote machine for the full hour.

Backlog of Problems. This idea came towards the end of our chat and struck me hard. Very hard. It echoed what we heard from Luigi not more than a few weeks ago about really choosing the right problems to apply machine learning to. Ouz talked about how important it is to have a list of problem sets that Machine Learning could be used to solve. Luigi talked about how important it is to solve the right problems for the business.

"You can work on problems where you have huge benefits, you have huge gains but that translates into relatively small benefits for the company as a whole because you're just focused on an incremental win rather than something that could have transformational power."

Start with the end in mind. As we joked in this meetup, Ouz has seen many a company get stuck in a PoC loop, or as I lovingly call it, PoC Hell. One antidote he gave is being clear from the beginning about what productionization will take. Doing this makes each decision align with the greater goal. As Ouz put it, the sum is greater than its parts.

Full conversation on YouTube and in podcast land.
Coffee Session
The Gift of Gab
A Medium Star Rising

If you have yet to read any of Adam Sroka's Medium articles, do yourself a favor and check them out now.

Adam writes persuasively about how to manage data scientists and machine learning engineers in today's hyper-competitive technology economy. If you're a data scientist or MLE, read his articles to understand how to be effective. If you manage data scientists or MLEs, read his articles to understand how to get the most out of data professionals in a realistic way!

I took away two key insights from him, on attitude and leadership. As an experienced hiring manager and team lead, Adam shared a great perspective on how to think about the attitude of data scientists coming onto the job market and how it impacts their effectiveness.

You can find the full conversation in video format or in podcast land.

Till next time,
Vishnu
Current Meetup
Live Stream MLOps
Can Live Streams Actually Help?

Can one learn anything useful by creating content online? The usual answer is a resounding YES. But what about live coding an MLOps project on Twitch? Can anything good come out of it?

On Wednesday we will talk with Felipe Campos Penha, Senior Data Scientist at Cargill, about live coding on Twitch and why it's more realistic than normal tutorials.

Bio
Felipe Penha creates content about Data Science regularly on the Data Science Bits channel on YouTube and Twitch. He has 8+ years of experience with hands-on data-related work, starting with his doctorate in Astroparticle Physics. His career in the private sector has been devoted to bringing value to various segments of the Food and Beverages Industry through the use of Analytics and Machine Learning. Check out his Twitch channel for a preview of our meetup.

See you on Wednesday at 5pm UK / 9am California. Click the button below to jump into the event, or subscribe to our public Google Calendar.
Blog
From Hacksaw To Power Tools
Learning Data Engineering As A Data Scientist

One pattern that many have spoken about in meetups and coffee sessions is how useful it can be for a data scientist to learn a bit of data engineering. I know this may spark a "should a data scientist know k8s" kind of debate, but don't shoot me, I'm just the messenger. At the end of the day, I think it comes down to how much value a bit of DE can add for you. If you are a lone soldier in a small startup, that might be quite significant.

The other side of the coin is what our latest coffee session guest says. "A data scientist can do anything, usually slower and more expensively than everyone else - but the stress is on the anything."

Let me know what you think: is it worthwhile for a data scientist to pick up some DE skills, or should they focus on the problems where they can add the most value? Or maybe they aren't mutually exclusive? I'm not sure there is a catch-all answer to this one, but I would love to hear your thoughts!
Best of Slack
Jobs
See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


