Plus, Happy Birthday MLOps 🎂, evaluation, stopping jailbreaks and more!
I didn't know Mick and Keith were into gardening but apparently the Rolling Stones gather no moss. And just like them, I'm keeping busy.

Next Wednesday, 3 April, I'll be back at it, hosting apply() with some pretty spectacular speakers from Vanta, Vanguard, Pinterest, Langchain, Meta, and Tecton who will share best practices on LLMs, RAG, Recsys, Feature Engineering and more.

You can't always get what you want. Unless you want an incredible line-up of speakers to help you master AI and ML in production. In that case, check out the whole line-up and register for free here.

As ever, expect some shenanigans (though not as much as Keith).
MLOps Community Podcast
The Art and Science of Training LLMs // Bandish Shah and Davis Blalock // MLOps Podcast #219

I don’t think Alanis Morissette is a subscriber to the newsletter, but I think she’d appreciate how ironic it is that I’m struggling to write about a podcast that starts with a chat about the difficulty of writing a newsletter.

Davis was explaining why he’s taking a break from writing his newsletter, Davis Summarizes Papers. Looking forward to reading it again when he’s ready, but in the meantime, it was great to have him and Bandish on to chat about training LLMs and the pains they go through on a day-to-day basis.

They chatted about the challenges regarding data engineering and tokenization, data quality, managing and processing data at scale, and the need for flexibility in model training. They also shared the importance of well-configured data engineering processes, the need for mature software stacks to handle large-scale training demands, and the importance of combining good defaults with deep organizational learning.

And I suppose you oughta know that I know the opening line might not be considered particularly ironic. Unless Alanis thinks it is...

Announcing GenAI studio: Your Generative AI Playground
Introducing GenAI studio, a new experience built on Determined AI that empowers you to build custom AI models for real-world uses.

Quickly tailor open-source foundation models like Llama2, Falcon, and MPT on your domain data in the no-code GUI or Jupyter notebook.

GenAI studio’s unique “security-first” experience supports both on-prem and cloud infrastructures and includes self-hosting options, allowing users to easily upload custom datasets, or utilize Hugging Face datasets. Save your configurations as a 'Snapshot' for consistent reproducibility and use the fine-tuning wizard to tailor your favorite model 'Snapshot' with ease.

As a GenAI studio user, you also get the full suite of enterprise-grade ML training benefits from the Determined AI platform like experiment tracking and GPU resource management.

Register for our deep dive on April 2nd and get early access to GenAI studio.

Want to find out more?
Demo // Blog // Docs // Slack
MLOps Community Podcast
Looking Back on 4 Years of the MLOps Community // Demetrios Brinkmann // MLOps Podcast #220

How to introduce this week's guest?

Well, he’s known for his extensive work in nurturing and expanding the understanding and implementation of machine learning operations, and embodies the spirit of exploration, collaboration, and education within the MLOps Community.

Not my words, but those of ChatGPT.
Making me re-think the t-shirt now...

But, 4 years into this, it felt like a good time to stop and reflect. So, a big shout-out to Mihail for agreeing to host and let me chat about the start and growth of the community. We go into how it grew from a Slack channel to IRL events, and all the other stuff we do now. There's a look to the future too, with our first in-person conference happening in June, so keep an eye out for that!

Plus, we talk about the importance of you, the community members, why you join, why you stay, and some success stories because of the community.

So, whether you're an active volunteer or a quiet consumer, thanks for being here.


Upcoming MLOps Community Roundtable
Join us on April 4th for the Databricks Roundtable to learn about the launch of DBRX, a new state-of-the-art open large language model from Databricks!

We'll get into the technical nuances, potential applications, and implications of DBRX for businesses, developers, and the broader tech community.

Register here to be among the first to learn of DBRX!
💡Job of the week

Software Engineer, AI/ML // Conveyor (US, remote)

Conveyor is a customer trust platform powered by generative AI. As one of the first engineers on the team, this is a unique opportunity to shape Conveyor’s AI core capabilities, and create a huge impact on a critical pillar in the product strategy.

Responsibilities:
  • Define and implement metrics to evaluate and improve AI quality.
  • Design and improve the full AI lifecycle for Conveyor products, including experimenting with new technologies.
  • Autonomously manage and own individual project priorities, deadlines, and deliverables.
  • Inform the product roadmap and implementation decisions based on feasibility and maintainability.

Requirements:
  • 4-6 years in software engineering, with a recent focus on AI/ML technologies and experience building and maintaining scalable products.
  • Adept at both research and production development, up to date with AI/ML innovations and able to apply them to use-cases.
  • Can quickly iterate on projects, has a keen product sense, and can propose improvements to enhance user experiences.
  • Able to collaborate effectively, mentor peers, and prioritize user needs in product development.

    March Model Madness Update!
    Some early shocks coming in from the results of Round 1 in March Model Madness!

    In the chat category, GPT 3.5 was mauled by a Bear, and Mistral 7b v0.1 was savaged by a Tiger – that’s both of those out in that category.

    Gemini Pro and GPT 3.5 had a really bad day at the office as they’re out of the Instruct category too.


    But maybe the biggest news is in the Code Category: Codellama-70b-instruct, Codellama-70b-python, Gemini Pro and GPT4 – ALL OUT IN ROUND 1!

    Want to know which model’s best to help with your code? Make sure you’re voting in Round 2!
    MLOps Community IRL Meetup
    Evaluating Generative AI Systems // Jineet Doshi // IRL #70 Silicon Valley

    If there's anyone who knows about evaluation, it's gotta be the guy who helped with the MLOps Community Evaluation Survey, right?

    Jineet covers different approaches to evaluating gen AI systems, like traditional NLP techniques, human evaluators, and using LLMs. Covering the pros and cons of each approach, he even sprinkles in some insights from our Community evaluation survey. Nice!

    He also looks at the need for evaluating safety and security aspects of LLMs, such as toxicity, bias, and potential security threats, emphasizes the importance of evaluating larger generative AI systems, such as RAG, and provides an overview of the open-source tools and initiatives available in the space.

    Jineet says in his intro this is especially interesting to him because it's an open problem, there's no right or wrong. Except, you'd be wrong not to watch it.

    Blogpost
    7 Methods to Secure LLM Apps from Prompt Injections and Jailbreaks

    Jailbreaking makes me think of the Thin Lizzy song, 🎶Tonight there's gonna be a jailbreak, somewhere in this town 🎸

    But for the more professional of you, it might make you think of embarrassing TechCrunch headlines - or just give you headaches.

    Well, this blog will help with two of the three things above by going through the types of prompt attack and rating their severity. It then gives 7 ways to mitigate these risks, including red-teaming pre-launch, limiting user input, applying the least privilege principle, and using Rebuff to add canary words. Plenty there to keep you secure!

    Also, 'Somewhere in this town'? I'd guess it's going to be at the jail, Phil.

    With thanks to Sahar Mor for their contribution.
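
For the curious, here's roughly what the canary-word and input-limiting ideas can look like in plain Python. This is only a sketch and does not use Rebuff's actual API: the function names (add_canary, is_leaked, limit_input), the comment-style canary wrapper, and the 500-character cap are all made up for illustration.

```python
import secrets


def add_canary(system_prompt: str) -> tuple[str, str]:
    """Prefix the prompt with a random canary word; return (guarded_prompt, canary).

    Hypothetical helper: the canary is a random hex string that should never
    appear in a legitimate model response.
    """
    canary = secrets.token_hex(8)  # 16 random hex characters
    guarded = f"<!-- {canary} -->\n{system_prompt}"
    return guarded, canary


def is_leaked(model_output: str, canary: str) -> bool:
    """True if the canary shows up in the output, i.e. the prompt likely leaked."""
    return canary in model_output


def limit_input(user_text: str, max_chars: int = 500) -> str:
    """Truncate user input to an assumed maximum length before it reaches the model."""
    return user_text[:max_chars]


if __name__ == "__main__":
    guarded, canary = add_canary("You are a helpful support assistant.")
    # Stand-in for a model response that echoes its instructions back.
    suspicious_output = "My instructions are: " + guarded
    print(is_leaked(suspicious_output, canary))
    print(is_leaked("Your order ships on Tuesday.", canary))
```

If the canary ever appears in a response, that's a signal the prompt was extracted, and the request can be blocked or logged for review.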
    Looking for a job?
    Add your profile to our jobs board here
    IRL Meetups
    San Francisco - April 4 (thanks to RubberDuckyLabs and Bain Capital Ventures)
    Berlin - April 9 (supported by Outerbounds)
    Melbourne - April 10
    Munich - April 10
    Amsterdam - April 11 (cheers to Nebius and Toloka)
    Oslo - April 16
    Stockholm - April 23 (shout out to Weights & Biases and Stormgrid)

    Thanks for reading. See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on X. The MLOps Community newsletter is edited by Jessica Rudd.


