Principle of avoiding undifferentiated heavy lifting
A strong second half of the month is coming up, with IRL meetups and virtual events. Stay up to date with what's going on in the community by subscribing to our public calendar.
Coffee Session
Tool of the Trade
We got to see the inner workings of the mind of Xiangrui Meng, the very first ML engineer at Databricks.

He has probably touched your life in some way through the Databricks universe, be it MLflow, Spark, or MLlib.

Databricks
Databricks clearly understood, and accurately predicted, that the future of machine learning was cloud computing.

Amongst other things, its unified platform tries to align the multiverse of the various players in the data team.

Keeping it Fresh for the future
From the perspective of an ML tool builder, the way we think about ML has evolved from the scalability of algorithms to model implementation, and it is currently at the productionization crossroads... pretty much the MLOps station.

MLflow Pipelines
MLflow Pipelines tries to reinvent the pipeline wheel for data scientists in an opinionated way that enhances their productivity.

Abstracting away the engineering work that has to happen at some point lets them focus on what they enjoy doing... whatever that is.

This also simplifies the collaboration between data scientists and production engineering.

Coffee Session
Composable database
Call it a cake and coffee party with Samuel Partee to talk Redis vector similarity search.

Redis has been quickly evolving from the caching database it became known as. Sam explains how the team is committed to making it a useful tool in your ML toolkit.

Redis Stack is essentially a bundle (layers) of modules that live within Redis. These layers let you customize what its API can do inside the database. Technically, that makes Redis a composable database: you plug in modules that change its functionality.
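If "vector similarity search" still sounds abstract, here is a rough sketch of the idea in plain Python. It is not Redis code and makes no claims about how Redis implements search; the item names, vectors, and the `top_k` helper are all made up for illustration. It simply scores every stored vector against a query by cosine similarity and keeps the best matches.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Brute-force search: score every vector, return the k closest keys."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]

# Hypothetical toy index: item name -> embedding (values are invented).
index = {
    "espresso": [0.9, 0.1, 0.0],
    "latte": [0.8, 0.3, 0.1],
    "cheesecake": [0.1, 0.9, 0.2],
    "brownie": [0.0, 0.8, 0.4],
}

nearest = top_k([1.0, 0.2, 0.0], index, k=2)
print(nearest)
```

A real vector database avoids scanning every item like this, but the question it answers is the same: which stored vectors point in roughly the same direction as the query?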

Redis Narrative
Changing any product’s narrative, especially when it comes to adapting that narrative to enable ML capabilities, isn’t always an easy recipe to bake.

The idea of a “BYOR” (Bring Your Own Redis) implementation plays a huge role in supporting Redis’s narrative of handling complex use cases like ML.

This concept, coupled with the layered structure, adds more flexibility to the way Redis can be used or implemented.

You are probably thinking what I am thinking, and yes, you betcha.
GO WILD WITH THOSE THOUGHTS!!!

Redis ML Solution
As mentioned above, when we hear “Redis” we hear database. So the question becomes: how does a “mundane in-memory database” improve ML products?

Just remember, with caching comes speed. I'll leave you with that.
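For a feel of what that means in practice, here is a minimal sketch, assuming a slow model call that is worth caching. `InMemoryCache` is a dict-backed stand-in for an in-memory store, not a Redis client, and `slow_predict` is a hypothetical placeholder for an expensive inference step.

```python
import time

class InMemoryCache:
    """Tiny dict-backed stand-in for an in-memory key-value store."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

call_count = 0  # counts how often the slow path actually runs

def slow_predict(user_id):
    """Hypothetical expensive model call (simulated with a short sleep)."""
    global call_count
    call_count += 1
    time.sleep(0.05)
    return f"recommendations-for-{user_id}"

def cached_predict(cache, user_id):
    """Serve from the cache when possible; otherwise compute and store."""
    key = f"pred:{user_id}"
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = slow_predict(user_id)
    cache.set(key, result)
    return result

cache = InMemoryCache()
first = cached_predict(cache, 42)   # cache miss: runs the slow model call
second = cached_predict(cache, 42)  # cache hit: skips the model entirely
```

The second call returns the same answer without touching the model, which is the whole trick: repeated requests get served from memory instead of being recomputed.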

Coffee Session
Data Team Spectrum
We got a first-class tour of data team building (pun intended) from Leanne Fitzpatrick, the Director of Data Science at the Financial Times.

Data Teams Architecture
The foundation of a data team should have enough reinforcement to actually support the business initiative.

The approach is to start from the data team's constraints and limitations. Think negative engineering?

Then shore up those shortcomings early in the build process, and equip yourself with the right aggregates to design the right team for the business.



Organizational Deficiencies
In reality, it is still a little bit difficult to articulate the need for data expertise in a business.

Probably because of the friction between the data roles that exist.


The notion that the data team might not be able to determine what they need to do with the engineering resources introduces a lack of trust in the system.

But to be fair, that is just where we are in our journey as data nomads.


More often than not, there is a willingness to compromise on the data function of the business in favor of core engineering functions.

This could be due to a lack of understanding of how to measure the success of the data functions and the maturity level of the entire data ecosystem.

Blog post
Embedding Pitfalls
Let’s say that you have read a very helpful post demystifying embeddings and you’re really excited. Your social media company can certainly use them, so you fire up your notebook and start typing away. As the clock ticks, excitement turns to frustration and you wonder: how do people even do this?

There are a few gotcha moments with embeddings. No post could ever cover every scenario, but this blog will attempt to give you some practical advice in three areas:

  • Changing your model’s architecture
  • Using another extraction method
  • Retraining your model

These are some of the main reasons embeddings change, and why working with them gets frustrating.
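One way to picture why this hurts is the toy sketch below. It assumes, purely for illustration, that retraining rotates the whole embedding space by 90 degrees; real retraining changes the space in far less tidy ways, and the words and vectors here are invented. The point is that similarities inside each version still look fine, while comparing a vector from the old model with one from the new model gives nonsense.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def rotate(v, degrees):
    """Rotate a 2-D vector counterclockwise by the given angle."""
    t = math.radians(degrees)
    x, y = v
    return [x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t)]

# Hypothetical embeddings from the "old" model.
old = {"cat": [1.0, 0.0], "dog": [0.9, 0.1]}

# Assumption for illustration: the "retrained" model produces the same
# geometry, just rotated by 90 degrees.
new = {word: rotate(vec, 90) for word, vec in old.items()}

within_old = cosine(old["cat"], old["dog"])  # cat vs dog, old model
within_new = cosine(new["cat"], new["dog"])  # cat vs dog, new model
across = cosine(old["cat"], new["cat"])      # cat (old) vs cat (new)

print(within_old, within_new, across)
```

Within each version, "cat" and "dog" are just as close as before. Across versions, "cat" no longer even resembles itself, so stored embeddings from an old model generally need to be recomputed rather than mixed with new ones.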
Sponsored
Production-Ready AI Infrastructure
Today, the vast majority of organizations running AI workloads are using containers and cloud-native technologies, but Kubernetes, the de facto standard for container orchestration, lacks core capabilities for scaling AI workloads in production. Whether harnessing machine learning for business intelligence or to build AI-powered products and services, MLOps teams will need to deploy a purpose-built infrastructure stack that accelerates (rather than constrains) AI initiatives.

Join Run:ai on August 16th at 11:00 am ET, for a webinar to help you build a cloud-native AI platform that delivers value and ROI, all the way from model build right through to deployment.

We’ll borrow AI orchestration concepts from the world of HPC to better manage your expensive compute resources, look at GPU scheduling concepts that keep your data scientists happy, and share an open-source tool you can use to ensure maximum utilization of your GPU cluster.


We Have Jobs!!
There is an official MLOps community jobs board now. Post a job and get featured in this newsletter!

IRL Meetups

London: August 11
Denver: August 16
LA: August 17
Utah: August 23
NYC: August 31
Best of Slack
Best of Slack is its own newsletter now. Sign up for it here.
Thanks for reading. This issue was written by Nwoke Tochukwu and edited by Demetrios Brinkmann. See you in Slack, on YouTube, and in podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


