Allows processing of both structured and unstructured data within single, seamless offering
EMC, a provider of information infrastructure offerings, has introduced a new data co-processing Hadoop appliance, the Greenplum HD Data Computing Appliance. It combines Hadoop with the EMC Greenplum Database, allowing the co-processing of both structured and unstructured data within a single, seamless offering.
Apache Hadoop is integrated with the Greenplum Database in the Greenplum HD Data Computing Appliance. The offering supports Hadoop external tables, which let users query data residing on the Hadoop Distributed File System (HDFS) from the database without first materialising it there, said the company.
The facility allows administrators to read and write files in parallel from Greenplum to HDFS, enabling rapid and simple data sharing.
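As a rough illustration of what a Hadoop external table looks like in practice, the Python sketch below builds the kind of DDL statement an administrator might issue. It assumes the `gphdfs` protocol used in Greenplum Database external table definitions; the helper function, host name, port, path, table names and columns are all hypothetical and do not come from EMC's announcement.

```python
def hdfs_external_table_ddl(table, columns, hdfs_uri, writable=False, delimiter=","):
    """Build DDL for a Greenplum external table backed by files on HDFS.

    Hypothetical helper for illustration only: it returns a SQL string
    and does not connect to a database.
    """
    # Render the column list as "name type, name type, ..."
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    # A writable external table exports rows to HDFS; a readable one scans them.
    kind = "WRITABLE EXTERNAL" if writable else "EXTERNAL"
    return (
        f"CREATE {kind} TABLE {table} ({cols})\n"
        f"LOCATION ('{hdfs_uri}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )


# Assumed schema and HDFS locations, chosen only for the example.
columns = [("user_id", "bigint"), ("url", "text"), ("visited_at", "timestamp")]

# Readable table: queries read the HDFS files in place.
read_ddl = hdfs_external_table_ddl(
    "clickstream_ext",
    columns,
    "gphdfs://namenode.example.com:8020/data/clickstream",
)

# Writable table: rows inserted into it are written out to HDFS.
write_ddl = hdfs_external_table_ddl(
    "clickstream_export",
    columns,
    "gphdfs://namenode.example.com:8020/export/clickstream",
    writable=True,
)

print(read_ddl)
print(write_ddl)
```

Once such a table is defined, an ordinary `SELECT` against the readable table reads the HDFS files where they sit, and an `INSERT` into the writable table pushes rows out to HDFS. In both cases the work is spread across the database's segments rather than funnelled through a single node.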
The Hadoop-based EMC Greenplum HD Community edition and EMC Greenplum HD Enterprise edition software are combined with product certification by other partners, which will enable real-time data interaction, said the company.
The EMC Greenplum HD product family is available in two editions: Community and Enterprise. Greenplum HD software provides a complete platform, including installation, training, global support and added value beyond simple packaging of the Apache distribution.
EMC said that the Greenplum HD Enterprise Edition is a 100% interface-compatible implementation of the Apache Hadoop stack. By maintaining Hadoop interface compatibility, the Enterprise Edition provides seamless application portability while delivering advanced features required by larger organisations.
The EMC Greenplum HD Community Edition is a 100% open source, certified and supported version of the Apache Hadoop stack, comprising HDFS, MapReduce, ZooKeeper, Hive and HBase, added the company.
EMC data computing division president and general manager Bill Cook said there is a time and a place for the value that relational databases add to structured data, and there is a time and a place for the value Hadoop can give to unstructured data.
"Many of our enterprise customers need both and, with the help of our partners, we’re able to provide them both, while also meeting their expectations around high availability, fault tolerance, and enterprise-class support and service," Cook said.
The EMC Greenplum HD Community Edition, EMC Greenplum HD Enterprise Edition and the EMC Greenplum HD Data Computing Appliance are expected to be available in the third quarter of calendar 2011.