When the cluster costs more per day than you do per year...

I didn't know Mick and Keith were into gardening, but apparently the Rolling Stones gather no moss. And just like them, I'm keeping busy.

Next Wednesday, 3 April, I'll be back at it, hosting apply() with some pretty spectacular speakers from Vanta, Vanguard, Pinterest, LangChain, Meta, and Tecton, who will share best practices on LLMs, RAG, RecSys, feature engineering, and more.

You can't always get what you want. Unless you want an incredible line-up of speakers to help you master AI and ML in production. In that case, check out the whole line-up and register for free here.

As ever, expect some shenanigans (though not as much as Keith).

The Art and Science of Training LLMs // Bandish Shah and Davis Blalock // MLOps Podcast #219

I don’t think Alanis Morissette is a subscriber to the newsletter, but I think she’d appreciate how ironic it is that I’m struggling to write about a podcast that starts with a chat about the difficulty of writing a newsletter.

Davis was explaining why he’s taking a break from writing his newsletter, Davis Summarizes Papers. Looking forward to reading it again when he’s ready, but in the meantime, it was great to have him and Bandish on to chat about training LLMs and the pains they go through on a day-to-day basis.

They chatted about the challenges of data engineering and tokenization, data quality, managing and processing data at scale, and the need for flexibility in model training. They also stressed the importance of well-configured data engineering processes, mature software stacks that can handle large-scale training demands, and combining good defaults with deep organizational learning.

And I suppose you oughta know that I know the opening line might not be considered particularly ironic. Unless Alanis thinks it is...

Video || Spotify || Apple

Introducing GenAI studio, a new experience built on Determined AI that empowers you to build custom AI models for real-world uses.

Quickly tailor open-source foundation models like Llama2, Falcon, and MPT on your domain data in the no-code GUI or Jupyter notebook.

GenAI studio’s unique “security-first” experience supports both on-prem and cloud infrastructures and includes self-hosting options, allowing users to easily upload custom datasets, or utilize Hugging Face datasets. Save your configurations as a 'Snapshot' for consistent reproducibility and use the fine-tuning wizard to tailor your favorite model 'Snapshot' with ease.

As a GenAI studio user, you also get the full suite of enterprise-grade ML training benefits from the Determined AI platform like experiment tracking and GPU resource management.

Register for our deep dive on April 2nd and get early access to GenAI studio.

Want to find out more?

Demo // Blog // Docs // Slack

Looking Back on 4 Years of the MLOps Community // Demetrios Brinkmann // MLOps Podcast #220

How to introduce this week's guest?

Well, he’s known for his extensive work in nurturing and expanding the understanding and implementation of machine learning operations, and embodies the spirit of exploration, collaboration, and education within the MLOps Community.

Not my words, but those of ChatGPT.
Making me re-think the t-shirt now...

But, 4 years into this, it felt like a good time to stop and reflect. So, a big shout-out to Mihail for agreeing to host and let me chat about the start and growth of the community. We go into how it grew from a Slack channel to IRL events, and all the other stuff we do now. There's a look to the future too, with our first in-person conference happening in June, so keep an eye out for that!

Plus, we talk about the importance of you, the community members: why you join, why you stay, and some of the success stories that have come out of the community.

So, whether you're an active volunteer or a quiet consumer—thanks for being here.

Video || Spotify || Apple

💡Job of the week

Software Engineer, AI/ML // Conveyor (US, remote)

Conveyor is a customer trust platform powered by generative AI. As one of the first engineers on the team, you'll have a unique opportunity to shape Conveyor’s core AI capabilities and make a huge impact on a critical pillar of the product strategy.

Responsibilities:

Define and implement metrics to evaluate and improve AI quality.
Design and improve the full AI lifecycle for Conveyor products, including experimenting with new technologies.
Autonomously manage and own individual project priorities, deadlines, and deliverables.
Inform the product roadmap and implementation decisions based on feasibility and maintainability.

Requirements:

4-6 years in software engineering, with a recent focus on AI/ML technologies and experience building and maintaining scalable products.
Adept at both research and production development, up to date with AI/ML innovations and able to apply them to use cases.
Can quickly iterate on projects, has a keen product sense, and can propose improvements to enhance user experiences.
Able to collaborate effectively, mentor peers, and prioritize user needs in product development.

Evaluating Generative AI Systems // Jineet Doshi // IRL #70 Silicon Valley

If there's anyone who knows about evaluation, it's gotta be the guy who helped with the MLOps Community Evaluation Survey, right?

Jineet covers different approaches to evaluating gen AI systems, like traditional NLP techniques, human evaluators, and using LLMs. He goes through the pros and cons of each approach, and even sprinkles in some insights from our Community evaluation survey. Nice!

He also looks at the need for evaluating safety and security aspects of LLMs, such as toxicity, bias, and potential security threats. He emphasizes the importance of evaluating larger generative AI systems, such as RAG, and provides an overview of the open-source tools and initiatives available in the space.

Jineet says in his intro that this is especially interesting to him because it's an open problem: there's no right or wrong. Except, you'd be wrong not to watch it.

Watch it here!

7 Methods to Secure LLM Apps from Prompt Injections and Jailbreaks

Jailbreaking makes me think of the Thin Lizzy song, 🎶Tonight there's gonna be a jailbreak, somewhere in this town 🎸

But for the more professional of you, it might make you think of embarrassing TechCrunch headlines - or just give you headaches.

Well, this blog will help with two of the three things above by going through the types of prompt attack and rating their severity. It then gives 7 ways to mitigate these risks, including red-teaming pre-launch, limiting user input, applying the least privilege principle, and using Rebuff to add in canary words. Plenty there to keep you secure!
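
For the curious, here's roughly what the canary word idea looks like in plain Python. This is a minimal sketch and not Rebuff's actual API: the function names `add_canary_word` and `is_canary_leaked` are made up for illustration, and the comment-style marker format is an assumption. The idea is to hide a random token in your prompt and then check whether it ever shows up in the model's output, which would suggest the prompt has leaked.

```python
import secrets
from typing import Tuple


def add_canary_word(prompt_template: str, length: int = 8) -> Tuple[str, str]:
    """Prefix a prompt with a random canary word (hypothetical helper).

    Returns the guarded prompt and the canary so the caller can check for it later.
    """
    # token_hex(n) returns 2 * n hex characters, so halve the requested length.
    canary = secrets.token_hex(length // 2)
    # Assumed marker format: the canary sits in a comment-like header line.
    guarded_prompt = f"<!-- {canary} -->\n{prompt_template}"
    return guarded_prompt, canary


def is_canary_leaked(completion: str, canary: str) -> bool:
    """Return True if the canary word appears anywhere in the model output."""
    return canary in completion


if __name__ == "__main__":
    guarded, canary = add_canary_word("You are a helpful assistant.")
    # Pretend the model was tricked into echoing its instructions back.
    fake_completion = f"Sure! My instructions are: {guarded}"
    if is_canary_leaked(fake_completion, canary):
        print("Canary word found in output: possible prompt leak.")
```

In a real app you'd send the guarded prompt to your model, run the check on every response, and log or block anything that trips it.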

Also, 'Somewhere in this town'? I'd guess it's going to be at the jail, Phil.

With thanks to Sahar Mor for their contribution.