Pentaho finally seems to have breathed some life into the data mining component idling in its open source business intelligence platform, with the aim of simplifying the process of incorporating predictive analytics into companies' data warehousing environments.
The Orlando, Florida-based company, which is peppered with executives from commercial BI software vendors like SAS Institute and Business Objects, has effectively taken its Weka open source data mining project and integrated it with its open source ETL tool based on the Kettle project. The aim is to help companies jump-start predictive analytics application development for areas like customer prospecting and marketing.
The key, according to Pentaho, is to streamline the data preparation stage, which is traditionally the most expensive and time-intensive activity in building advanced predictive analysis and data mining applications.
Pentaho’s Data Integration (Kettle) module speeds up the process by transforming data into Weka's native .ARFF file format, an analytics-ready plain-text format that can be used directly by the Weka data mining tools.
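For readers unfamiliar with ARFF, a file consists of a header that declares the relation and its attributes, followed by comma-separated data rows, with `?` marking a missing value. The Python sketch below illustrates that layout only; it is not Pentaho's or Kettle's code. The function name `to_arff` and the sample customer fields are hypothetical, and the sketch assumes values contain no spaces, commas, or quotes (which would need quoting in a real file).

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text (illustrative sketch, not Kettle's implementation).

    attributes: list of (name, kind) pairs, where kind is "numeric",
    "string", or a list of allowed nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, list):
            # Nominal attributes list their allowed values in braces.
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        # ARFF uses "?" for a missing value.
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical customer-prospecting data.
arff_text = to_arff(
    "customers",
    [("age", "numeric"),
     ("region", ["north", "south"]),
     ("responded", ["yes", "no"])],
    [(34, "north", "yes"), (51, None, "no")],
)
print(arff_text)
```

Because the header carries the attribute types, a mining tool can load such a file without a separate schema, which is the sense in which the format is analytics-ready.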
The process also includes new data sampling tools to help users root out trends and patterns in large data volumes without having to analyze every single record.
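One common way to sample from a data stream too large to hold in memory is reservoir sampling, which keeps a fixed-size uniform random sample in a single pass. The article does not say which method Pentaho's sampling tools use, so the Python sketch below is offered only as an illustration of the general idea; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random


def reservoir_sample(records, k, seed=None):
    """Return a uniform random sample of up to k items from an iterable.

    Single pass, constant memory in k (the classic "Algorithm R").
    """
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (i + 1),
            # evicting a uniformly chosen existing entry.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = record
    return sample


# Draw 1,000 rows from a stream of one million without storing the stream.
picked = reservoir_sample(range(1_000_000), 1000, seed=42)
print(len(picked))
```

A model can then be trained on the sample rather than on the full table, trading some precision for a much shorter preparation step.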
Pentaho says the data mining-ETL integration also lets users integrate analytic models directly into their data warehousing processes.
Pentaho offers support and indemnification, as well as commercial licenses for its open source data mining software.