
The AI Infrastructure Alliance MLOps Day 2 Summit (aka everything you need to know about monitoring models in prod) is happening on June 23. You can register for free here.

Also, who is in Toronto for the MLOps World Conference? We are organizing a community happy hour on June 8th. Come join us!

And don't forget...You are awesome.

Coffee Session
CPU vs GPU
CPUs are what we're used to. They're native to our computers and even if my mom doesn't know it, she uses them. But now, let's take a trip into the world of GPUs.

If you are looking for a modern historical phenomenon, look no further than the amazing story of NVIDIA and its outsized bets on the future. The company bet everything on CUDA and the emergence of general-purpose GPU computing, and its timing was perfect. When ImageNet became a thing around 2010, GPU hardware was in a prime position to become ML's workhorse. Since deep learning resurged from its AI-winter hibernation, we have seen many companies cater to its compute-intensive demands.

Where are we now?
GPUs have come a long way since the birth of AlexNet in 2012, but operating them could still use a bit of TLC. Enter Ronen and the work his team is doing at Run:ai.

The Vision - On your computer, the operating system allocates CPU cores to the programs that need them, controls the networking, and hands out memory to give us a seamless user experience. This is nothing new; we hardly even think about it. Now think about your ML infra. You have the same ingredients: compute, storage, memory, and networking. You need an operating system to manage everything.

But where is that?

Right now you have Linux, which plays that role within a single node. Then you have this amazing technology every data scientist loves called Kubernetes, which brings operating-system capabilities to clusters of nodes. But when we look at the way GPUs are allocated, the allocation is static and exclusive.
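
To make "static and exclusive" concrete, here is a minimal sketch using the official kubernetes Python client of how a pod typically requests a GPU today; the pod name, image, and namespace are placeholders. Note that nvidia.com/gpu can only be requested in whole units, and the device stays reserved for that container for the pod's lifetime.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig pointing at the cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job"),  # placeholder name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="pytorch/pytorch:latest",  # placeholder image
                # GPUs are requested in whole units via limits; fractional
                # requests are rejected, and the device is reserved for this
                # container whether or not it is actually busy.
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```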

The problem - When a GPU is allocated to a container today, it can only serve the workload inside that container, whether or not that application is actually using it. These pains bring inefficiency to a landscape where we have seen so much advancement: a GPU can be tied up by an application yet spend most of its time doing nothing. As we all know, idle GPUs are expensive.
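
You can see the waste for yourself: NVIDIA's management library exposes per-device utilization counters. A minimal monitoring sketch with the pynvml bindings (installable as nvidia-ml-py), assuming the node has at least one NVIDIA GPU:

```python
import time

import pynvml  # pip install nvidia-ml-py

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU on the node

# Sample utilization once per second for a minute; an "allocated" GPU
# often reports 0% for long stretches while its container sits idle.
samples = []
for _ in range(60):
    samples.append(pynvml.nvmlDeviceGetUtilizationRates(handle).gpu)
    time.sleep(1)

print(f"Mean GPU utilization over 60s: {sum(samples) / len(samples):.1f}%")
pynvml.nvmlShutdown()
```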

Check out how Ronen and the team are trying to tackle this in our most recent coffee session.

Current Meetup
On Juggling
Real-time ML-based applications are on the rise, but deploying them at scale with low latency and high throughput over large datasets is challenging. This talk will discuss the important role feature stores play in deploying these applications.

By exploring a number of production use cases, we will see how the choice of online data store and the feature store's data architecture play important roles in determining performance and cost. Throughout the presentation, we will illustrate key points by connecting them to juggling and Dr. Seuss! Here are some key takeaways you can expect:

  • Real-time AI/ML relies on super-fast data stores to serve features, i.e. the data inputs for online model predictions/inference
  • These data stores are often called ‘online stores’ or ‘online feature stores’ and are the critical component of the real-time data layer for AI/ML (see the lookup sketch after this list)
  • There are significant differences in the performance and cost of feature stores, depending on the architecture, the supported feature types, and the components selected (such as the feature server and the online store)
  • Online stores together with offline stores make up an emerging logical component called the feature store, which is becoming a cornerstone of MLOps
  • Microsoft Azure SQL DB, Google BigQuery, AWS Redshift, and Snowflake are examples of excellent choices for the offline store
  • There are different architectures and implementation options for a feature store: open source, building from scratch, buying, or subscribing
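
For a feel of what serving features from an online store looks like in code, here is a minimal lookup sketch using Feast, a popular open-source feature store (the talk itself is not tied to Feast); it assumes a feature repo with a driver_hourly_stats feature view has already been applied:

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # assumes feature_store.yaml lives here

# Low-latency point lookup against the online store at prediction time.
feature_vector = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()

print(feature_vector)
```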

See you tomorrow at 9 am in California / 5 pm in London.

Blog
Dataset Improvements
The key to unlocking robust models is clean, well-formed datasets.

Although data quality issues are prevalent in industry-grade production datasets, most modern tools aren't built to address this problem effectively. Instead, these tools focus on changing the model configuration or gathering more data, hoping for improved generalization.

With critical flaws left unaddressed in the dataset, this paradigm can bias the model towards certain situations, producing inaccurate performance metrics and even worse consequences in production.

So how do we fix it? Our friends at Galileo joined us for the most recent blog post to show how to tackle these challenges head-on.

  1. Clean empty samples
  2. Clean garbage samples
  3. Fix labeling errors
  4. Find class overlap

The Galileo team shows us how to do all this within their tool and gives us the performance results after these actions are taken; a rough open-source sketch of the first three steps follows.
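
Outside Galileo's tool, the first three steps can be roughed out with open-source libraries. Here is a minimal pandas + cleanlab sketch for a text-classification dataset; the train.csv file and its text/label columns are hypothetical stand-ins, and this is not Galileo's implementation:

```python
import pandas as pd
from cleanlab.filter import find_label_issues
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

df = pd.read_csv("train.csv")  # hypothetical dataset with text/label columns

# 1. Clean empty samples: drop missing or whitespace-only text.
df = df[df["text"].notna() & (df["text"].str.strip() != "")]

# 2. Clean garbage samples: a crude length heuristic standing in for
#    smarter signals (encoding junk, boilerplate, near-duplicates, ...).
df = df[df["text"].str.len() > 3].reset_index(drop=True)

# 3. Fix labeling errors: rank likely mislabeled rows for human review,
#    using out-of-sample predicted probabilities from a simple baseline.
labels = pd.factorize(df["label"])[0]
pred_probs = cross_val_predict(
    LogisticRegression(max_iter=1000),
    TfidfVectorizer().fit_transform(df["text"]),
    labels,
    cv=5,
    method="predict_proba",
)
suspects = find_label_issues(
    labels=labels,
    pred_probs=pred_probs,
    return_indices_ranked_by="self_confidence",
)
print(f"{len(suspects)} samples flagged for label review")
```

Step 4, finding class overlap, is harder to approximate in a few lines; the blog post shows how Galileo surfaces it.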
Sponsored
Feature Stores Everywhere
Last week, apply(conf) brought together practitioners from across the industry who have built production ML applications. A common theme was that every team has adopted some version of a feature store.

Teams from Faire, Walmart Labs, Square, Gojek, Adyen, Abnormal Security, Billie and Vital all have feature platforms at the center of their applications.

But when two people say "feature store", they might not mean the same thing. Chip Huyen described this best: basic feature stores allow teams to create a catalog to reuse features across use cases, and to persist those features for offline training and online serving.

Tecton is a complete feature platform. In addition to the capabilities above, Tecton automates feature computation. It registers transformation logic, executes and orchestrates the data pipelines that transform raw data into features, provides accurate backfills, and allows users to define features against batch and streaming data sources.

In a nutshell, Tecton enables every data engineer to support more data scientists by automating one of the most challenging aspects of production machine learning: managing data pipelines.
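
To illustrate what registering transformation logic means in the abstract, here is a toy sketch; this is not Tecton's SDK, just a hypothetical decorator showing the pattern of declaring a feature's source, schedule, and transform so that a platform, rather than the user, can execute and orchestrate the pipeline:

```python
from datetime import timedelta

# Toy registry, NOT Tecton's API: the point is that the platform, not the
# user, owns executing and backfilling each registered transform on schedule.
FEATURE_VIEWS = {}

def feature_view(name, source, schedule):
    def register(transform):
        FEATURE_VIEWS[name] = {
            "source": source,        # raw batch/stream data the pipeline reads
            "schedule": schedule,    # how often the platform materializes it
            "transform": transform,  # logic turning raw rows into feature values
        }
        return transform
    return register

@feature_view(name="user_txn_count", source="transactions", schedule=timedelta(days=1))
def user_txn_count(df):
    # Pandas-style transform: count each user's transactions.
    return df.groupby("user_id")["amount"].count().rename("txn_count")
```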

To learn more about Tecton, check out their site.
We Have Jobs!!
There is an official MLOps community jobs board now. Post a job and get featured in this newsletter!

Best of Slack
Best of Slack is its own newsletter now. Sign up for it here.
See you in Slack, on YouTube, and in podcast land. Oh yeah, and we are also on Twitter if you like chirping birds.


