|
|
|
|
Welcome to your weekly dose of just about the only newsletter on the internet not generated by ChatGPT.
Prove it, you say?
hadfhjskalfhdksalhfiwo; <- Would AI write that?
|
|
|
|
|
|
|
Maxime Beauchemin, the creator of Airflow and founder of Preset, joined us to talk about his newest creation, Promptimize.
So what is Promptimize?
A tool that aims to evaluate and benchmark LLM prompts.
In classic Max fashion, it's fully open source (we get into why on the pod).
Here are three key takeaways from the episode:
- Prompt Engineering for Better Model Performance: Maxime emphasizes the importance of prompt engineering for controlling and taming AI models.
- Prompts can be used to ask for structured output from AI systems, enabling specific use cases like requesting SQL queries in a JSON format.
- By structuring questions and providing context in the prompt, AI systems can be put to work effectively on specialized tasks.
- The Power of Test Suites for Model Evaluation: You know test suites in traditional software development? Max draws parallels between prompt suites and test suites 🤯
- He proposes that test sets and prompt cases serve as valuable anchors for evaluating model performance amidst the ever-changing AI landscape.
- Developing a comprehensive test suite highly optimized for a specific use case lets you quickly identify the best model for that scenario (a minimal sketch of the idea follows this list).
- Embracing User Feedback and Iteration: have we ever talked about this?
- Utilizing feature flags and conducting user interviews during the beta phase can provide valuable insights into the usefulness of the added features.
- User research techniques such as logging data, thumbs up/down ratings, and interviews can help evaluate how effective the AI assist features really are (a small logging sketch is below too).
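To make the test-suite analogy concrete, here is a minimal plain-Python sketch of the idea (not Promptimize's actual API, just the shape of it): each prompt case pairs a prompt with an evaluator that scores the response, so you can re-run the same suite every time the model or the prompt changes.

```python
# A minimal sketch of the prompt-case idea, not Promptimize's actual API.
# `call_llm` is a placeholder for whatever model client you use.
import json
from dataclasses import dataclass
from typing import Callable, List


def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")


@dataclass
class PromptCase:
    prompt: str
    evaluate: Callable[[str], float]  # returns a score between 0 and 1


def sql_in_json(response: str) -> float:
    """Scores the structured-output use case: a SQL query returned inside JSON."""
    try:
        payload = json.loads(response)
    except json.JSONDecodeError:
        return 0.0
    return 1.0 if str(payload.get("sql", "")).lower().startswith("select") else 0.0


suite: List[PromptCase] = [
    PromptCase(
        prompt=(
            "Return JSON with a single key 'sql' whose value is a SQL query "
            "that counts the rows in the table 'orders'."
        ),
        evaluate=sql_in_json,
    ),
]


def run_suite(cases: List[PromptCase]) -> float:
    """Average score across the suite; re-run whenever the model or prompts change."""
    scores = [case.evaluate(call_llm(case.prompt)) for case in cases]
    return sum(scores) / len(scores)
```

Swap out call_llm for a different model and the suite score tells you whether the change helped, which is exactly the anchor Max is talking about.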
|
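And for the feedback takeaway, a tiny sketch of logging thumbs up/down ratings behind a feature flag (every name in it is made up):

```python
# A tiny sketch of thumbs up/down feedback logging behind a feature flag.
# AI_ASSIST_ENABLED, log_feedback, and the JSONL path are all made-up names.
import json
import time

AI_ASSIST_ENABLED = True  # stand-in for a real feature-flag lookup


def log_feedback(user_id: str, response_id: str, thumbs_up: bool,
                 path: str = "feedback.jsonl") -> None:
    """Append one feedback event so it can be analyzed alongside usage logs."""
    if not AI_ASSIST_ENABLED:
        return
    event = {
        "ts": time.time(),
        "user_id": user_id,
        "response_id": response_id,
        "thumbs_up": thumbs_up,
    }
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")
```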
|
|
Job of the week: Sr. Data Scientist, AI // Wex - a payments and technology company leading the way in a rapidly changing environment.
Lead the effort to design, develop, and program methods, processes, and systems that consolidate and analyze unstructured, diverse “big data” sources to generate actionable insights and solutions using third-party and custom-developed LLMs.
|
|
|
|
|
|
|
Since interviewing Sammy Sidhu, one of the creators of Daft, I've been subscribed to their newsletter.
I got my first dose of quality from them last week (unlike my hack job of writing this newsletter).
We all know Parquet files leave much to be desired, but how much exactly?
Welp, Sammy and his cofounder Jay go really deep into the main gotchas you can hit when working with Parquet, and they touch on how exactly the file format works under the hood (there's a small snippet below the link if you want to poke at those internals yourself).
Blog
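If you do want to see those internals for yourself, here's a minimal sketch using pyarrow (assuming it's installed; "events.parquet" is a hypothetical local file) that surfaces the footer metadata, row groups, and column chunks the post digs into:

```python
# A minimal sketch of inspecting Parquet internals with pyarrow.
# "events.parquet" is a hypothetical local file.
import pyarrow.parquet as pq

pf = pq.ParquetFile("events.parquet")

# Footer metadata is read without scanning any data pages
meta = pf.metadata
print(meta.num_rows, meta.num_row_groups, meta.created_by)

# Each row group holds a chunk of every column, with its own stats and compression
rg = meta.row_group(0)
for i in range(rg.num_columns):
    col = rg.column(i)
    stats_min = col.statistics.min if col.statistics and col.statistics.has_min_max else None
    print(col.path_in_schema, col.compression, col.total_compressed_size, stats_min)

# Reading a single row group (instead of the whole file) is one of the few
# levers Parquet gives you for partial reads
table = pf.read_row_group(0)
print(table.num_rows)
```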
|
|
|
|
|
|
|
Entering MLOps through Model Cards // Javier López Peña // Meetup IRL #42 Madrid
Reproducibility: The Holy Grail of Model Development.
Anyone who has been in the MLOps game for a while knows that reproducibility is a question that comes up constantly.
In fact, the company I was working at when I started the community was trying to tackle the data versioning problem in ML (before they went out of business…but that's a story for another day).
Today, Javier discusses the reproducibility challenges he faced when working with certain models, and the role MLflow and DVC play in keeping his ML initiatives reproducible (there's a tiny tracking example below the link).
Video
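For anyone who hasn't touched it, here's a minimal sketch of the kind of experiment tracking MLflow brings to the reproducibility story (all values and file names are made up): every run records the params, metrics, and artifacts needed to recreate it later.

```python
# A minimal MLflow tracking sketch; all values and file names are made up.
import mlflow

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 0.01)         # hyperparameters for this run
    mlflow.log_param("train_data_version", "v1.2")  # e.g. a DVC tag for the dataset
    mlflow.log_metric("val_accuracy", 0.87)         # placeholder result
    mlflow.log_artifact("model_card.md")            # attach the model card itself
```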
|
|
|
|
|
|
|
LLM in Prod 2 Recap
We are working overtime to get a massive number of videos out!
Today we released 15 new ones. Here is the playlist if you want to watch them all. Otherwise, pick and choose from the new batch below.
|
|
|
|
|
|
|
|
|
|
|
Add your profile to our jobs board here
|
|
|
|
|
|
|
|
|
|
|