“We’ve moved MLflow into the Linux Foundation as a vendor-neutral non-profit organization to manage the project long-term”
The open source tool from the US company (whose founders created analytics engine Apache Spark) sees an eye-popping 2.5 million downloads a month and has 200 contributors from 100 organisations.
MLflow allows organisations to package their code for reproducible runs and execute hundreds of parallel experiments across platforms. It integrates closely with Apache Spark, scikit-learn, TensorFlow and other open source ML frameworks.
You can learn more about it in our launch write-up here…
It is the second open source project, after Delta Lake, that the California-based firm has donated to the Linux Foundation, an organisation that supports open source communities with financial and intellectual resources and oversees open source projects in almost every sector, including government, education and film production.
The news caps a busy week for Databricks, which also announced the acquisition of data visualisation and querying specialist Redash.
David Wyatt, SVP & GM EMEA at Databricks, told Computer Business Review:
“Unlike traditional software development that is only concerned with versions of code, ML models need to also track versions of data sets, model parameters, and algorithms, which creates an exponentially larger set of variables to track and manage.
“In addition, ML is very iterative and relies on close collaboration between data teams and application teams.
“MLflow keeps this process from becoming overwhelming for organizations by providing a platform to manage the end-to-end ML development life cycle from data preparation to production deployment, including experiment tracking, packaging code into reproducible runs, and model sharing and collaboration”.
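To make Wyatt's point about tracking concrete, here is a minimal sketch of what recording an experiment involves. It uses only the Python standard library and is not MLflow's API; the `RunRecord` class, the `fingerprint` helper, the commit string and the sample data are all hypothetical names and values chosen for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(data: bytes) -> str:
    """Return a short SHA-256 digest identifying one exact version of some bytes."""
    return hashlib.sha256(data).hexdigest()[:12]


@dataclass
class RunRecord:
    """Hypothetical record of the things that distinguish one ML run from another."""
    code_version: str   # e.g. a source control commit id
    data_version: str   # fingerprint of the training data
    params: dict        # model parameters used for this run
    metrics: dict = field(default_factory=dict)

    def run_id(self) -> str:
        # The same code, data and parameters always produce the same id;
        # metrics are results, so they are left out of the identity.
        payload = json.dumps(
            {"code": self.code_version, "data": self.data_version, "params": self.params},
            sort_keys=True,
        )
        return fingerprint(payload.encode("utf-8"))


# Two made-up versions of a data set: the second has one extra row.
dataset_v1 = b"age,income\n34,52000\n"
dataset_v2 = b"age,income\n34,52000\n41,61000\n"

run_a = RunRecord("abc123", fingerprint(dataset_v1), {"learning_rate": 0.1})
run_b = RunRecord("abc123", fingerprint(dataset_v2), {"learning_rate": 0.1})

# Same code and parameters, different data: the runs get different ids.
print(run_a.run_id() != run_b.run_id())  # prints True
```

Even in this toy version, a change to the data alone yields a distinct run, which is why the number of combinations to track grows so quickly once code, data and parameters all vary.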
Other News This Week for Databricks
Databricks took the opportunity at the online summit to announce a hosted version of the recently acquired Redash, to facilitate a “rich visualisation and dashboarding experience”. A spokesperson for the data engineering platform explained the thinking behind buying Redash in a recent statement:
“We first heard about Redash a few years ago through some of our early customers. As time progressed, more and more of them asked us to improve the integration between Databricks and Redash.
“Our acquisition of Redash was driven not only by the great community and product they’ve developed, but also the same core values we share. Both our organizations have sought to make it easy for data practitioners to collaborate around data, and democratize its access for all teams.
“Most importantly though, has been the alignment of our cultures to help data teams solve the world’s toughest problems with open technologies”.