Jason Stamper talks to Angel Viña, founder and CEO of data virtualisation firm Denodo, about the company’s unusual approach to data integration
Could you start by telling our readers what you mean by data virtualisation?
Data virtualisation is the new hot area around data integration, and it comes at a very important moment for enterprises. They need to deal with data in real time and manage ever-increasing volumes. This is the current reality for all large corporations around the world.
What we offer is a new approach. Instead of filling data warehouses with replicated data using conventional technologies, we focus on using the data wherever it resides: how to plug into the data sources and deliver the data in a very dynamic, very agile way, without replicating it and without filling a new data mart or warehouse. We provision the data from the current physical instances and provide the right data integrations, the right data views, to the data consumers.
But why specifically is this data virtualisation and not simply data integration?
Because you abstract, integrate and then deliver virtualised data services to a variety of users. The main idea is to create an abstraction of what you have in the physical repositories, so you basically work with your physical instances and through metadata you create the data views, the data services that a data consumer requires. That abstraction, that mediation layer, is the key element of this technology.
So you have an engine that provides connectivity and abstraction to the physical repositories, an engine that provides the capabilities to transform, clean and integrate the data in real time, on the fly, from those physical repositories without the need for any storage.
This allows you to get the most from your current infrastructure: you get integrations on the fly to feed the data consumers, and you also decouple the data consumers from the producers, which provides a lot of agility for the IT department and the business.
In complex scenarios where the data sources are multiple and diverse, and the consumers are also many and different, and with a requirement for different data formats and types, you can easily provision data and deliver it while matching the requirements of new business needs.
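To make the idea of a mediation layer concrete, here is a minimal Python sketch of a "virtual view" that joins two sources at query time without copying either into a new store. It is an illustration only and not Denodo's API: the in-memory SQLite database, the `ORDERS_CSV` string, and the functions `read_customers`, `read_orders` and `customer_totals` are all hypothetical stand-ins chosen for this example.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a relational database (in-memory SQLite stands in for it).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Hypothetical source 2: a flat file (a CSV string stands in for a file on disk).
ORDERS_CSV = "customer_id,amount\n1,100.0\n1,50.0\n2,75.5\n"


def read_customers():
    """Fetch customer rows from the database each time the view is queried."""
    for cid, name in db.execute("SELECT id, name FROM customers ORDER BY id"):
        yield {"id": cid, "name": name}


def read_orders():
    """Parse order rows from the CSV source each time the view is queried."""
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        yield {"customer_id": int(row["customer_id"]), "amount": float(row["amount"])}


def customer_totals():
    """Virtual view: joins both sources on the fly; the result is never persisted."""
    totals = {}
    for order in read_orders():
        key = order["customer_id"]
        totals[key] = totals.get(key, 0.0) + order["amount"]
    for customer in read_customers():
        yield {"name": customer["name"], "total": totals.get(customer["id"], 0.0)}


if __name__ == "__main__":
    for row in customer_totals():
        print(row)
```

Because the view reads from the sources on every call, a consumer sees changes in the underlying data immediately, and it depends only on the shape of `customer_totals`, not on where the customers or orders physically live.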
Tell us more about the data sources you can bring together because we’re not just talking about Oracle, SQL Server or MySQL here, are we?
Well, there’s a big transformation going on in terms of ‘what is my data world?’ CIOs and enterprise architects are looking at new types of data and new sources that can feed the business applications. Databases and operational systems are definitely still there and very important, while XML data sources and hierarchical silos are gaining importance. There are also other data sources that are common but underutilised and that have to come into the picture. We’re talking about file systems, content repositories, PDFs, websites and log files.
You say the Denodo Platform supports services-oriented architecture (SOA) principles. Why is that important?
In the past, the data layer was relatively ignored in SOA, inhibiting success. We are oriented towards delivering data services that are flexible, reusable and loosely coupled with other participants in SOA, such as transactional services and the message bus, to deliver the promise of SOA. Whether you call it Data as a Service or Information as a Service, in the end the focus of the technology is the delivery of data. We support real-time delivery – the exposure of data services to the data consumer in real time.
While there are many technologies that can be involved in the SOA stack, what we are able to do is deploy and implement solutions where we don’t forget what is important in SOA, so we try not to create the legacy of the future.
This means caring about security, governance and metadata management, and enforcing a unified schema across all your information assets; caring about all the concepts behind SOA. Deployment takes months or even weeks, and can easily be done in increments. That is why data virtualisation and data services are important to SOA.
Look out for the podcast with Angel Viña coming soon to www.cbronline.tv.