The Ugly Truth About LLMs
I had a conversation with Phillip Carter, the author of that infamous blog post "All the Hard Stuff Nobody Talks About when Building Products with LLMs".
🔑 Here are 3 key takeaways from the chat:
- Collaboration between ML engineers and product managers: Lotta room to grow here. As a PM, can you speak the lingo? (Do you know what a vector is and why it's important?) As an MLE, are you thinking about the customer and how they will ultimately use the new features or product you are creating? Cross-functional pollination is key in these times of API-driven AI.
- The right metrics are essential for measuring AI success: Phillip broke down how the new AI features he introduced at Honeycomb are judged as successful or not. In his case, they looked at whether the new natural language ability in their tool increased activation among trials. Once they saw that it did, they had a clear story for how the feature was ROI positive.
Paraphrasing/oversimplifying his words: "If the total cost of our OpenAI API calls is $100k but we sign 2 new customers because the product is easier to use and more sticky, then it makes it all worth it."
- The best evaluation is testing in production: You don't really know how users will use these new AI features. You can spend ages trying to cover all the edge cases, or you can put the tool in the hands of the customer and see what they do with it. As long as your iteration loops are fast enough to patch up the rough edges, you're good.
Phillip is also looking to chat with others who are using LLMs to create a special interest group (SIG) around bringing observability practices to AI. We are setting up a channel in Slack!
Job of the week: Senior Cloud Infrastructure Engineer (AI, Kubernetes) // SuperDuperDB - Open Source, Core Founding Team; Fully Remote + Office in Berlin.
A small team growing quickly, building an open-source system to integrate AI and databases. SuperDuperDB enables developers to easily implement next-generation AI models and applications on top of their existing data store.
You will work closely with the product, software, and research teams to architect and implement a secure, scalable cloud infrastructure to host an AI database service using the latest best practices and tooling.
Strong LLM Foundations
Large Language Models (LLMs) have been developing at an exponential pace over the last six months. Every day, new research is presented to the community. It can be easy to feel overwhelmed, as though you've missed out on the latest technological advancements. However, these updates represent only a tiny fraction of change relative to the underlying foundations of how LLMs work. To help members of the community build those foundations, we are introducing our second LLM MOOC on edX: Large Language Models: Foundation Models from the Ground Up.
This course dives into the details of foundation models in natural language processing (NLP). You will learn about the innovations that led to the proliferation of transformer-based models, including encoder models such as BERT, decoder models such as GPT, and encoder-decoder models like T5, as well as the key breakthroughs that led to applications such as ChatGPT. You will also learn about popular parameter-efficient fine-tuning (PEFT) methods, plus practical hardware and deployment optimizations that make it easier to produce and deploy high-quality large language models (LLMs). The course concludes with an overview of new multi-modal LLM developments, looking toward the future in this ever-changing, fast-paced landscape.
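If PEFT is new to you, here is roughly what its most popular flavor, LoRA, looks like in code. This is just a sketch and not course material: it assumes the Hugging Face transformers and peft libraries, and the model name and hyperparameters are arbitrary choices for illustration.

    # Rough sketch of LoRA-style parameter-efficient fine-tuning (PEFT).
    # Not course material; assumes the Hugging Face `transformers` and `peft`
    # packages are installed. Model and hyperparameters are arbitrary.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

    # LoRA freezes the original weights and injects small trainable
    # low-rank matrices into the chosen attention projections.
    lora_config = LoraConfig(
        r=8,                                  # rank of the low-rank update
        lora_alpha=16,                        # scaling applied to the update
        target_modules=["q_proj", "v_proj"],  # OPT's attention projections
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base_model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of the weights

You then train the wrapped model with your usual loop, but only the tiny adapter weights get updated and saved, which is the whole appeal of PEFT.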
Enroll Today
MLOps World
My good friend Dave let me crash at his house last June for MLOps World in Toronto. I ended up eating all his food and sweating off a few pounds in his infrared sauna. So this year he decided to move the conference to Austin, TX. Now I won't be anywhere near his family or his home gym.
If you were looking for an excuse to get your company to pay for a trip to the heart of ‘merica, search no more. Make sure to check out friend of the pod Hien Luu’s talk if you do!
The call for speakers is still open, so if you feel like you have something to say, fill out this form! Or register for the event, which takes place October 25-26th, here. Use the promo code 'mlopscommunity' to get $50 off your ticket price. (Not sponsored, I just owe him for all the home-cooked meals he made for me last year.)
Code Smells in Data Science: What can we do?
It's always a good day when we have the opportunity to listen to Laszlo's rants. During the latest in-person meetup in Bristol, he discussed his usual topic of clean code for data scientists. But this time, he went a bit deeper, mainly talking about:
- Testing and Clean Architecture: Data scientists shouldn't get away with not testing their code. While testing has long been a necessary practice in software engineering, it has historically been overlooked in the realm of data science. Because of people like Laszlo and Matt ranting on Twitter and LinkedIn, the culture is shifting. There is a growing recognition that testing and clean architecture are essential for reliable code deployment.
- Challenges in Team Dynamics: Always a philosopher, Laszlo talks through the challenges data science teams face when it comes to code delivery and synchronization. Disagreements on prioritization, time constraints, and conflicting code styles can hinder effective problem-solving and workflow. Laszlo proposes that by understanding the psychological theory of self-determination, we can gain insights into resolving team conflicts and fostering a healthy work environment.
- Communicating the Importance of Testing: Conveying the importance of spending time on tests to product managers and business stakeholders can be challenging. Concepts such as total cost of ownership and culture change may not be seen as tangible features. Successful communication requires finding effective ways to translate these concepts into measurable metrics that resonate with the relevant stakeholders.
Watch Here
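If "testing your code" still sounds abstract for data science work, here is a minimal sketch of what it can look like with pytest. The clean_prices function and its behavior are hypothetical, invented purely for illustration, and not something from Laszlo's talk.

    # test_cleaning.py -- a hypothetical example of testing a data transformation.
    # Assumes pandas and pytest are installed; `clean_prices` is illustrative only.
    import pandas as pd


    def clean_prices(df: pd.DataFrame) -> pd.DataFrame:
        """Drop rows with missing prices and clip negative prices to zero."""
        out = df.dropna(subset=["price"]).copy()
        out["price"] = out["price"].astype(float).clip(lower=0)
        return out


    def test_clean_prices_drops_missing_and_clips_negatives():
        raw = pd.DataFrame({"price": [10.0, None, -5.0]})
        cleaned = clean_prices(raw)
        assert len(cleaned) == 2              # the missing-price row is gone
        assert (cleaned["price"] >= 0).all()  # negatives are clipped to zero

Run it with pytest and it either passes or tells you exactly which assumption about your data broke. That fast feedback loop is the whole argument.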
Evaluating and Debugging Diffusion Models
Let's discuss the challenges of evaluating and debugging diffusion models. Agata recently wrote a blog post on wandb's new LLM course. It's a deep dive on module two, which covers:
- How to monitor the loss curve during training
- How to sample from the model regularly to check the quality of the generated images
- How to use wandb to track the progress of your diffusion model training and to debug any problems that you encounter
The post also includes a hands-on tutorial that walks you through the process of training a diffusion model on the sprites dataset.
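For a concrete picture of that tracking pattern, here is a minimal sketch, assuming you have wandb installed and are logged in. The training_step and sample_images functions are random-number stand-ins for your real model code, not anything from Agata's post.

    # Sketch of tracking a diffusion training run with wandb (illustrative only).
    # `training_step` and `sample_images` are placeholders for your real code.
    import numpy as np
    import wandb

    def training_step():
        """Stand-in for one denoising training step; returns a scalar loss."""
        return float(np.random.rand())

    def sample_images(n=8):
        """Stand-in sampler; returns n fake RGB images as uint8 arrays."""
        return [(np.random.rand(64, 64, 3) * 255).astype(np.uint8) for _ in range(n)]

    run = wandb.init(project="sprites-diffusion",
                     config={"epochs": 3, "steps_per_epoch": 50})

    for epoch in range(run.config.epochs):
        for _ in range(run.config.steps_per_epoch):
            wandb.log({"train/loss": training_step()})  # builds the loss curve

        # Sample regularly and log the images so you can eyeball quality over time.
        wandb.log({"samples": [wandb.Image(img) for img in sample_images()],
                   "epoch": epoch})

    run.finish()

Everything logged this way shows up in the wandb dashboard, which is where the debugging part of the post happens.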
Read Now
Add your profile to our jobs board here
Thanks for reading. This issue was written by Demetrios and edited by Jessica Rudd. See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.