If you *start* with a problem, is that working backwards from it...? 🤔

GenAI in Production - Challenges and Trends // Verena Weber // MLOps Podcast #224

Eat more fruit and veg.
Don’t look at your phone after 10 pm.
Cancel unused subscriptions.

All things that are easy to say but hard to do.

Verena had another one: “Start with a problem, not with a solution.”

It’s a recurring theme on the podcast, and how often it gets repeated shows how much it matters. To illustrate, she shared an example from her time at Amazon: starting with customer problems and only then looking to research for solutions. This led to positive congruent training, an approach that prevents negative flips in model updates (cases where the updated model gets wrong something the previous version got right), which means a more consistent user experience. We also chatted about using synthetic data, AI model comparisons, transitioning to multi-modal models, and creating supportive environments for folk within the tech industry.
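
For the curious, here is a rough sketch of how you might count negative flips between two model versions. It assumes a negative flip means an example the old model got right and the new model gets wrong; the function name and the toy data are made up for illustration and are not taken from the episode.

```python
def negative_flip_rate(labels, old_preds, new_preds):
    """Fraction of examples the old model got right but the new model gets wrong."""
    if not (len(labels) == len(old_preds) == len(new_preds)):
        raise ValueError("all inputs must have the same length")
    if not labels:
        return 0.0
    flips = sum(
        1
        for y, old, new in zip(labels, old_preds, new_preds)
        if old == y and new != y
    )
    return flips / len(labels)


# Toy data, invented for this example.
labels    = ["cat", "dog", "cat", "dog", "cat"]
old_preds = ["cat", "dog", "dog", "dog", "dog"]  # old model: 3 of 5 correct
new_preds = ["cat", "cat", "cat", "dog", "cat"]  # new model: 4 of 5 correct

rate = negative_flip_rate(labels, old_preds, new_preds)
print(f"negative flip rate: {rate:.2f}")
```

Note that the new model is more accurate overall yet still breaks one example the old model handled, which is exactly the kind of regression a user notices.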

Click below to watch – easy to say, and to do!

Video || Spotify || Apple

Overview:
Explore deploying a RAG-based chatbot in production using NVIDIA® H100 Tensor Core GPUs and open-source technologies. This session covers the integration of Kubernetes, CUDA, Triton Server, TensorRT, Milvus, PyTorch, and Llama 2.

Event Details:
• Date/Time: May 16th, 17:00 (GMT+2)
• Location: Online (link provided post-registration)

Agenda:
• Techniques and tools for deploying RAG in production environments.
• Deep dive into RAG's architecture tailored for scalability.
• Live demo showcasing practical deployment and operational strategies.

Who Should Attend?
CTOs, technical managers, product managers, ML engineers, MLOps engineers, and anyone looking for alternative solutions to classic LLMs.

Speaker
• Boris Popov, CSA at Nebius AI.

Register here for free!

What’s a T-Rex’s favorite tool? A dino-saw!

Although that might change now - at the start of this chat we settle on pronouncing DBRX as D-BRX, like T-Rex.

And if they get a mascot dinosaur, they may have to call it Moe, after the mixture-of-experts (MoE) architecture they chose. In an MoE model, each transformer block holds several feedforward networks (the 'experts'), and a router sends each token only to its relevant experts, which reduces the computational work per token.
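
If the routing idea sounds abstract, here is a toy sketch of generic top-k expert routing. It is not DBRX's actual implementation: the experts are plain scalar functions standing in for feedforward networks, and the router scores, `top_k` value, and function names are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=2):
    """Pick the top_k experts and renormalise their weights to sum to 1."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, router_scores, experts, top_k=2):
    """Run only the selected experts and combine their outputs by weight."""
    return sum(w * experts[i](token) for i, w in route_token(router_scores, top_k))


# Four toy "experts"; a real model would use feedforward networks here.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
router_scores = [0.1, 2.0, -1.0, 2.0]  # hypothetical scores for one token

print(route_token(router_scores))
print(moe_layer(3.0, router_scores, experts))
```

Only two of the four experts run for this token, which is where the saving in work per token comes from.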

We also talk about using PyTorch for model flexibility and about operational challenges such as optimizing inference web servers.
Of course, these things are never pain-free, so they also shared the challenges they faced and why monitoring at the GPU, Kubernetes cluster, and network levels is critical to the training process.

Be sure to give it a listen because this chat was dino-mite!

Video || Spotify || Apple

If AI is truly going to change the world, it needs to get past just helping you draft a less passive-aggressive email about microwaving fish in the office.

This talk looks at two use cases beyond chatbots and text.

Diana focused on using LLMs for molecule discovery, outlining the process of selecting databases, configuring the LLMs, and crafting prompts to extract chemical knowledge using JSON output. Nick, on the other hand, detailed improvements made to gen AI for a banking call center, emphasizing the challenges of regulatory compliance, data cleaning, and computational limitations. Both projects highlighted the key phases of data processing: diarization, transcription, and analysis.

Maybe they'll be able to use the molecule discovery for smell-free fish.

Catch it here!

💡Job of the week
Principal ML Engineer - LLMs & Generative AI // Truveta (US, Remote)

Truveta is looking for machine learning experts who can build Foundation Models.

Responsibilities:

• Develop and refine LLMs and generative models for various applications, including healthcare.
• Create and implement novel algorithms in ML and NLP.
• Optimize GPT-like models for performance and accuracy on large datasets.

Requirements:

• Strong communication skills, demonstrated leadership in mentoring ML teams, and excellent problem-solving abilities.
• Advanced knowledge in NLP and LLM architectures, ideally with a Ph.D. in Computer Science, Electrical Engineering, or related fields.
• Proficient in Python and deep learning frameworks like PyTorch and TensorFlow.

The Power of Combining Analytics & ML on One Platform // Rebecka Storm // IRL #73 Stockholm

Ever been so frustrated that you started your own company?

Well, Rebecka did. Frustrated by teams spending too much time on foundational work instead of solving problems, she co-founded Twirl to build a unified platform for analytics and machine learning. She highlights the benefits of this integration, such as increased efficiency, consistency, and improved data quality, which make the work easier for non-ML experts. Rebecka also touches on the importance of supporting transformations in both SQL and Python, and the need for system ownership to maintain team autonomy.

Don’t stay frustrated, watch it and be inspired!

Watch it here!

Stanford HAI Releases 2024 AI Index Report

Flashbacks to my school reports: has potential, must try harder.
The 7th edition of the report introduces new estimates on AI training costs, analyses of the AI landscape, and a chapter about the impact on science and medicine.
Do any of the Top Takeaways surprise you?

MLOps vs. Eng: Misaligned Incentives and Failure to Launch?
What can happen when the stars are aligned?
A look at the data science/engineering divide and how misaligned incentives can stall deployments, with some expert advice on aligning teams for success.

Deceiving to Enlighten: Coaxing LLMs to Self-Reflection for Enhanced Bias Detection and Mitigation
Who couldn't benefit from some self-reflection?
Proposing a method for bias detection and mitigation utilizing LLMs in a multi-role debate scenario, employing a ranking system to pinpoint and rectify biases.

Detecting AI Generated Text Based on NLP and Machine Learning Approaches

Please don't use this on this newsletter.
Presenting an AI detection model using XGB Classifier, SVM, and BERT deep learning methods to identify AI-written text. Results indicate BERT outperforms other models, achieving 93% accuracy.