We are having another Virtual Meetup this week. Come check out live demos with code walkthroughs of creative projects using open-source MusicVAE, GAN Art, Neural Style Transfer, SketchRNN, Stable Diffusion, OpenAI Whisper, and transformer-based models for joke generation. Register Here.
Coffee Session: Unicorn Investor
I had an insightful chat with George Mathew, an investor and managing director at Insight Partners. He shared the thesis behind his investments in the MLOps ecosystem.
Investment Thesis
Timing is a key factor driving investment decisions. For Insight Partners, the timing to invest in MLOps worked out perfectly: investments began when the firm noticed a trend toward disaggregation in the MLOps toolchain and the emergence of the modern data stack, both fundamentally important areas of MLOps.
Different Areas in the MLOps Space
The messy landscape of the MLOps toolchain makes it hard to clearly group tools into the different areas of MLOps.
George derived a formula that guides his decisions when investing in ML products (MLOps tools) within the MLOps space.
This formula breaks down the different areas of MLOps along two axes that together define where a product sits on the MLOps toolchain. On one axis, the tool is placed under the major subcategories of MLOps components (i.e., data preparation, model preparation, and model governance and security). On the other axis lie the tool's nuances based on its value proposition. The value prop a tool offers could be at the algorithm layer, like Deci or OctoML, or at the application layer, like Weights & Biases or Fiddler.
Evaluating KPIs for Deals
In the case of an open-source project, a good level of community-led growth shows a clear path to success. It also defines a natural path to commercialization based on where the project gets the most traction.
Non-financial metrics for determining a good deal go beyond GitHub stars. Deeper metrics, like the number of pull requests, the number of external and internal contributors, and the overall level of contributor activity, also need to be looked at critically before considering investment and commercialization.
The best road to building ML solutions reliably is to promote a positive, quality-driven culture.
Operations & Reward
Running distributed teams comes with trade-offs, whether operating fully remote or entirely on-site. Constantly developing strategies to resolve the shortcomings of each setup is vital to building an efficient and productive distributed team.
Sometimes these strategies can be costly for the more technically experienced team members, who try to carry everyone along and end up at risk of burnout.
Luckily, rewards can be used to keep the balance and are a good way to incentivize these efforts. Process, reward, and persuasion are huge drivers of change in any organization adopting ML.
Positive Culture
In the ML development process, data always finds a way of slipping into every area of an organization. That makes it much tougher to take the traditional software development approach of siloing teams or developers.
Therefore, the logical organizational approach is to enable "all kinds of people" within the organization to "know all kinds of things" about the goals and requirements of ML.
Data contracts are API-based agreements between software engineers, who own services, and data consumers, who understand how the business works, in order to generate well-modeled, high-quality, trusted data.
In this blog, Chad Sanderson and Adrian Kreuziger take a comprehensive look at implementing data contracts as a technology using open-source components.
This article is part one of a three-part series on the technical implementation of data contracts.
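The idea can be sketched in a few lines of Python. This is a minimal, illustrative example only; the schema, field names, and `validate` helper below are assumptions for illustration, not the implementation from the article:

```python
# A hypothetical data contract: the producing service agrees to emit
# events matching this schema, and downstream consumers validate against it.
ORDER_CREATED_CONTRACT = {
    "name": "order_created",
    "version": 1,
    "fields": {
        "order_id": str,      # required: unique order identifier
        "user_id": str,       # required: references the users service
        "amount_cents": int,  # required: integer amount in cents
    },
}

def validate(event: dict, contract: dict) -> list:
    """Return a list of contract violations (empty means the event passes)."""
    errors = []
    for field, expected_type in contract["fields"].items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    return errors

# A conforming event passes; a malformed one surfaces explicit violations.
good = {"order_id": "o-1", "user_id": "u-7", "amount_cents": 1299}
bad = {"order_id": "o-2", "amount_cents": "12.99"}
print(validate(good, ORDER_CREATED_CONTRACT))  # []
print(validate(bad, ORDER_CREATED_CONTRACT))   # missing user_id, wrong amount type
```

The point of the contract is that violations are caught at the producer/consumer boundary, before bad data slips into downstream models and dashboards.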
Sponsored Post
genv - AI GPU Environment Management
For Deep Learning scientists, allocating AI resources, specifically GPUs used to develop and train their models, is an integral part of their day-to-day work.
GPUs are scarce resources, and often there aren't enough of them to go around. This compute power shortage leads to data scientists spending too much of their day allocating GPUs and scheduling their turns on them.
Seeing this pain first-hand, Raz Rotenberg, a member of Run:ai's Core Team, decided to build a free, open-source GPU environment management tool named genv.
genv helps teams and individual GPU users easily control, configure, and monitor GPU resources. Developers can now access their GPUs straight from popular tools such as Jupyter Notebook, VS Code, and the CLI.
Kevin co-founded Tecton after leaving Uber, where he built deep expertise in operational ML platforms. As part of the team that created the Michelangelo platform, he helped scale ML-driven applications from 0 to 1000+.
Willem is a tech lead at Tecton, where he spearheads the Feast open-source feature store team. Feast is Willem's brainchild, created while he was at Gojek, where he honed his expertise in building data and ML platforms.
This is your opportunity to ask questions about:
Real-time machine learning
Feature stores
Scaling ML applications
Building and scaling ML platforms
How Feast came to be
Uber’s ML platform
and any other questions you have!
Feel free to submit your questions in advance if you can't attend live.
Also, please share this event with your colleagues/friends. Let’s get this thing poppin’!
Thanks for reading. This issue was written by Nwoke Tochukwu and edited by Demetrios Brinkmann and Jessica Rudd. See you in Slack, YouTube, and podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.