| Hamilton is a framework that helps a team of data scientists manage the creation of a complex dataframe in a shared code base by writing specially shaped functions.
Last week, community member Stefan Krawczyk and his team finally did what they had been threatening to do for ages: open-source Hamilton.
So why was Hamilton created? Stitch Fix had a data science team that kept running into problems with understanding code dependencies, testing, and documentation. These problems compounded as their code base grew.
A quick aside on the origin of the tool, from the release article: "we first want to mention that, while we explored a variety of offerings, we did not find any open-source tooling that would dramatically improve our capability to solve the aforementioned problems. Second, we’re not solving a big data challenge here, so a base assumption is that all data can fit in memory."
What was the result? For one, code reviews are simpler and more streamlined thanks to tighter encapsulation of business logic into functions.
Lastly, by handling the how, Hamilton allows data scientists to focus on the what. It was the result of a successful cross-functional collaboration between the platform and data science teams at Stitch Fix, and has been running in production since November 2019.
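To give a rough feel for what "specially shaped functions" can look like, here is a minimal plain-Python sketch. It is not Hamilton's actual API. It only assumes the general idea that a function's name is the column it produces and its parameter names are the columns it needs. The column names, the sample values, and the `resolve` helper are all hypothetical, and plain lists stand in for dataframe columns.

```python
import inspect


# Each function is named after the column it produces; its parameter
# names are the columns it depends on. (All names here are made up.)
def spend_per_signup(spend, signups):
    """Marketing spend per signup, row by row."""
    return [s / n for s, n in zip(spend, signups)]


def spend_zero_mean(spend):
    """Spend shifted so that its mean is zero."""
    mean = sum(spend) / len(spend)
    return [s - mean for s in spend]


def is_above_average(spend_zero_mean):
    """True where spend is above its mean; depends on a derived column."""
    return [value > 0 for value in spend_zero_mean]


def resolve(name, functions, columns):
    """Hypothetical helper: compute column `name`, computing its inputs first."""
    if name not in columns:
        func = functions[name]
        params = inspect.signature(func).parameters
        args = [resolve(p, functions, columns) for p in params]
        columns[name] = func(*args)
    return columns[name]


functions = {
    f.__name__: f for f in (spend_per_signup, spend_zero_mean, is_above_average)
}
# Made-up raw inputs, small enough to fit in memory.
columns = {"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 5]}

# Ask only for the columns we want (the "what"); the helper works out
# the order in which to call the functions (the "how").
for wanted in ("spend_per_signup", "is_above_average"):
    resolve(wanted, functions, columns)
```

Because every column lives in one small, named function, each piece of business logic can be unit tested and documented on its own, and its dependencies can be read straight off the signature.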
Check out the full article to see whether Hamilton fits your use case!
|