List: Have NoSQL databases become a viable enterprise alternative?
A couple of weeks ago Microsoft revealed that it would be embracing Linux on its SQL Server databases, with the move welcomed by several database and open source companies. There was, however, little attention given to what this means for competition in the market.
I suspect this is because of the validation it gives to the open source market, something that I’m sure other vendors will hope brings in more revenue via their own offerings. So while Microsoft may have strengthened its offering, its competitors will be thinking the market is big enough for them all to exist.
Currently the database market is estimated to be worth about $46bn globally, a total that is expected to go up to $50.1bn in 2017. Relational databases rule this market but growth in that area is slowing (5.4%) while the open source database market grew by 31%.
Of course, in a smaller market share the growth percentages can be more dramatic, but a trend has emerged that shows competition is heating up.
While much of the market uses relational databases (RDBMS) or SQL, many are starting to shift to NoSQL, or non-relational/distributed, databases.
As competition heats up, CBR identifies the main differences between SQL and NoSQL and identifies some of the major players and what they are doing on either side.
What’s the difference?
There are numerous differences between SQL and NoSQL databases. For a start, SQL databases are table based while NoSQL databases are document stores, key-value stores, graph databases or wide-column stores.
What this means is that SQL databases represent data in tables made up of rows. NoSQL databases fall into one of the categories mentioned above and do not have standard schema definitions that need to be followed.
The schema for a SQL database is predefined, while NoSQL databases have dynamic schemas so that they can deal with unstructured data.
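The schema difference can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's API: the relational side uses the sqlite3 module from Python's standard library, the "document store" is simply a list of dictionaries standing in for a NoSQL collection, and the table, column and field names are invented for the example.

```python
import sqlite3

# Relational side: the schema is declared before any data is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Alice"))

# A column that is not part of the declared schema is rejected.
try:
    conn.execute(
        "INSERT INTO customers (id, name, twitter) VALUES (?, ?, ?)",
        (2, "Bob", "@bob"),
    )
    sql_rejected = False
except sqlite3.OperationalError:
    sql_rejected = True

# Document side (hypothetical stand-in for a NoSQL collection):
# each record carries its own fields, so shapes can differ per record.
collection = []
collection.append({"id": 1, "name": "Alice"})
collection.append({"id": 2, "name": "Bob", "twitter": "@bob"})

print("SQL insert with unknown column rejected:", sql_rejected)
print("Fields per document:", [sorted(doc) for doc in collection])
```

In the relational case the extra field would require a schema change first; in the document case it is simply stored alongside the record.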
This is important because of the predicted rise in unstructured data which is growing by 56% per year, compared to 12% a year for structured data. In 2013 there was an estimated 80 exabytes of unstructured data, with that number now standing at over 300 exabytes – this is one of the reasons why some believe that NoSQL databases will be the database of the future.
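As a rough sanity check, the two figures are consistent with each other if the 56% rate is assumed to compound annually for three years from the 2013 base. The three-year window is an assumption made for this sketch, not a figure from the estimates themselves.

```python
# Assumption: 56% annual growth compounds for three years from the 2013 base.
base_exabytes = 80.0
unstructured_growth = 0.56
structured_growth = 0.12
years = 3

projected = base_exabytes * (1 + unstructured_growth) ** years
structured_multiplier = (1 + structured_growth) ** years

print(f"Unstructured data after {years} years: {projected:.1f} exabytes")
print(f"Structured data grows by a factor of {structured_multiplier:.2f} over the same period")
```

Under that assumption the unstructured total passes 300 exabytes, while structured data grows by well under half over the same period.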
Another difference is that SQL databases are vertically scalable (capacity is added to a single server) while NoSQL databases scale horizontally (more servers are added). SQL databases use Structured Query Language to define and manipulate data, whereas NoSQL queries are focused on collections of documents.
SQL databases are typically a good fit for a complex, query-intensive environment, something that NoSQL doesn’t excel in.
That’s not to say NoSQL databases are bad; they are just better suited to different workloads. For example, NoSQL works better with hierarchical data storage, which SQL databases aren’t well suited to.
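The trade-off can be illustrated with a small Python sketch. It assumes an invented order record: the nested dictionary stands in for a document in a NoSQL store, while the relational version uses the standard library's sqlite3 module and splits the same data across two tables. None of the names come from a real product.

```python
import sqlite3

# Hierarchical record kept as one nested document (hypothetical order data).
order_doc = {
    "order_id": 100,
    "customer": "Alice",
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}
doc_total_qty = sum(item["qty"] for item in order_doc["items"])

# The same data in relational form: two tables linked by a key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(order_id),
        sku TEXT,
        qty INTEGER
    );
""")
conn.execute("INSERT INTO orders VALUES (?, ?)", (100, "Alice"))
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(100, "A1", 2), (100, "B7", 1)],
)

# Reassembling the hierarchy takes a JOIN, but the query language can
# express aggregates across every order in one statement.
sql_customer, sql_total_qty = conn.execute("""
    SELECT o.customer, SUM(i.qty)
    FROM orders AS o
    JOIN order_items AS i ON i.order_id = o.order_id
    GROUP BY o.order_id
""").fetchone()

print(doc_total_qty, sql_customer, sql_total_qty)
```

The document keeps the hierarchy in one place, while the relational form pays for the JOIN but gains a general query language for asking new questions of the data.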
Leading SQL databases
Oracle
Oracle is the market leader in the database sector, with over 40% of it. The company’s latest database version is 12c, with the c standing for cloud, highlighting Big Red’s transition to the cloud.
The database took four years to develop and offers features such as database consolidation, query optimisation, performance tuning, high availability, partitioning, backup and recovery and numerous others.
One of the important features is the consolidation of databases which helps to optimise both the hardware and operational staff.
The pluggable database feature aims to reduce the risk of consolidation by allowing an existing database to be easily plugged into, or unplugged from, a container database. One of the key benefits is that there is no need to change any code in the application.
The company has continued to upgrade its leading database as it has faced increasing competition from the likes of Microsoft’s SQL Server and the NoSQL databases. Recent updates have seen it improve the data integration with cloud, big data analytics and real-time replication.
Microsoft SQL Server
Considered the second most widely deployed database, Microsoft’s SQL Server has been in the news recently for its newfound openness to Linux.
Described by one IDC analyst as an "enormously important decision," the move to make its database software available for the Linux platform has been welcomed by the open source community.
Some of the praise is down to the validation that it gives to open source; having a big-name player like Microsoft in the community could be a boost to overall sales.
The move for Microsoft comes as it hopes to increase its 21.55% share of the database market by tapping into the open source world. Considering that the open source database market grew by 31% it can be seen as a sensible move from Microsoft to try and tap into it.
Support is set to start from mid-2017 but users won’t have to wait till then to see other advancements in the software.
The company has already revealed two Release Candidates for the highly anticipated SQL Server 2016 which includes capabilities such as real-time operational analytics, rich visualisations, built-in advanced analytics, security technologies and hybrid scenarios that will let customers extend data storage to the cloud.
IBM DB2
IBM’s DB2 database software has long been one of the major players in the market but has perhaps been overshadowed by the likes of Microsoft and Oracle with their own offerings.
The relational database offers integrated support for a number of NoSQL capabilities as the company aims to offer the best of both worlds. These capabilities include XML, graph store, and JavaScript Object Notation (JSON).
The database can be used for both transactional and analytical operations in addition to offering continuous availability of data that is designed to help keep transactional workflows and analytics operating smoothly.
Three variants of the platform are available: Workstation, Midrange and Mainframe.
DB2 has long had its future questioned but it continues to hold a decent share of the market and the company has been forward looking, with improved support for Apache Spark and solid support for the IBM POWER8 server architecture.
Leading NoSQL databases
MongoDB
Data compiled by Austrian consultancy firm Solid IT recently placed MongoDB as one of the top ranked databases; this success can be partly attributed to its ease of use and popularity with developers.
The document-oriented database, which natively supports the JSON format, has proved popular because it doesn’t require a database administrator to bootstrap. This means it doesn’t need a DBA to get it up and running, as it has a self-starting process.
The software allows for flexible replication for sharding across nodes and it offers multi-version concurrency control which makes it easy to keep old versions of data available in order to maintain consistency in complex transactions.
The software is written in C++ and it maintains some of the friendly properties of a SQL database, with query and indexing functionality.
Recent customer wins for the company have seen it deployed by YouGov. The firm made MongoDB 3.0 its default non-relational database, which helps the surveying firm deal with the 2-4GB of new data it captures on average every hour.
Redis
Redis offers a disk-backed in-memory database, which makes it one of the faster offerings in the market and well suited to workloads that have rapidly changing data and a data size that can be estimated in advance.
In addition to this it is well suited to data analytics given its fast processing speeds.
The company has been busy improving its offering in the past couple of months with integration for Spark SQL, along with a new Spark-Redis connector that is designed to speed up certain big data analytics tasks by 100 times or more, the company said.
The Spark-Redis connector will provide an open source library for reading and writing data from and to a Redis cluster with Spark. The move aims to capitalise on the growing popularity of the Hadoop tool.
Enhancements have been helped by a $15 million Series B funding round which it closed last year. Bain Capital led the round with Carmel Ventures and Silicon Valley Bank. The funding means that it has now raised $28 million.
Cassandra
Cassandra was created at Facebook and has emerged as a hybrid of a column-oriented database with a key-value store. The ability to group families gives users the feeling of tables and helps with replication and consistency for linear scaling.
Data is written to Cassandra in a way that provides both full data durability and high performance; the data is first recorded in an on-disk commit log and then written to a memory-based structure that is called a memtable.
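The write path described above can be modelled with a toy Python class. This is a sketch of the general commit-log-then-memtable idea only, not Cassandra's actual implementation: the class name, the JSON log format, the sensor keys and the choice to sync the file on every write are all assumptions made to keep the example short.

```python
import json
import os
import tempfile


class ToyWritePath:
    """Toy model: append to an on-disk commit log, then update a memtable."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}  # in-memory structure holding recent writes

    def write(self, key, value):
        # 1. Durability: record the write on disk before acknowledging it.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Performance: keep the latest value in memory.
        self.memtable[key] = value

    def recover(self):
        # If the in-memory state is lost, replay the log to rebuild it.
        rebuilt = {}
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                entry = json.loads(line)
                rebuilt[entry["key"]] = entry["value"]
        return rebuilt


log_file = os.path.join(tempfile.mkdtemp(), "commit.log")
store = ToyWritePath(log_file)
store.write("sensor-1", 21.5)
store.write("sensor-1", 22.0)
store.write("sensor-2", 19.8)

recovered = store.recover()
print(recovered == store.memtable)
```

Because every write is a sequential append followed by an in-memory update, the pattern favours fast ingestion, while the log means the memory-based structure can be rebuilt if it is lost.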
Its main appeal is that it can be used for managing large scale volumes of data such as web/click analytics and measurements from the Internet of Things; this is because it is well suited to quickly written inputs.
Cassandra is written in Java and its protocol CQL3 is designed to offer a similar feel to SQL, although it offers no JOINs or aggregate functions.
The open source database became a top-level Apache project in 2010, which helps it to remain fresh with constant innovations that come from the open source community.