CTO Briefing: Why rent infrastructure when owning your own hardware is so much better? Spectra Logic’s Matt Starr pits tape against cloud for CBR.
Following the success of its 2015 Deep Storage Summit and the launch of ArcticBlue, an object-storage-based disk platform, Editor Ellie Burns sat down with Spectra Logic CTO Matt Starr to discuss the tape market – a sector vying for attention in a storage market increasingly focused on cloud.
EB: How do you think tape is currently viewed in the storage market?
MS: In the large enterprise and in data-intensive applications like HPC, M&E, genomics, and oil and gas, tape is the go-to storage technology for long-term retention at low cost.
Moving down from the larger customers, we see the role of tape transition from archive medium to backup medium, and here again the cost benefit still pays off.
Only when you move into the lower end of the SMB market do you see tape replaced by deduplication disk or other non-tape technologies. Tape's growth market is archive, but tape is still strong in the mid-sized to large enterprise for backup.
EB: How does tape compare to cloud storage?
MS: If you watch some of the cloud providers today talk about their storage infrastructure you will see that they too use tape for long term storage.
There is a good YouTube video from a gentleman at Microsoft whose statement is basically that tape always wins when it comes to long-term storage. So, cloud is ideal for the smaller companies that currently have little to no infrastructure, but as these companies grow, the cost of cloud begins to outpace the cost of bringing the hardware in-house and managing it.
Having a plan to go to cloud is fine, but customers need to be sure they have a map to get off cloud when the cost overruns start to hit.
EB: With those companies with little to no storage infrastructure, why would companies opt for tape?
MS: Cloud storage is a marketing term for saying, "I don't have any storage infrastructure here – it has been outsourced." In the end, if someone with a modest amount of data can justify cloud storage, then something is wrong with the justification.
In mid-sized to large systems, the cost of the disk or tape media itself is the largest part of the storage infrastructure. If a cloud provider can offer storage more cheaply than you can buy the same capacity yourself – even after you pay for a $50K-a-month pipe to that provider and they take their profit – then the user is doing something wrong.
EB: What is the business case for companies looking to invest in tape vs cloud?
MS: This one has a lot of inputs but we will focus on where cloud is ideal, such as in distribution of content, like training videos or documentation.
So, if my company needs easy data distribution, then cloud works. Along with this use case is remote storage: if my company is small and I must have an "online" second copy somewhere, cloud may prove ideal.
As companies grow and their storage footprint grows, cloud storage begins to become a major cost centre. The cost of a high-speed link can run $30K-$120K per month, data storage can run between 1¢ and 20¢ per gigabyte per month, and the charge to get data out can be $1,000 per 10 TB. If you start to really use cloud storage with larger data sets, the costs can escalate rapidly.
My advice to anyone looking at cloud is to do a complete TCO analysis: include labour, floor space and the entire local infrastructure. Then add in the cost of the data pipe, the in-out charges and the fine print in cloud provider contracts about how much data you can retrieve without getting hit with additional charges.
In some cases cloud may work: if I own three coffee shops, with receipts and two or three security cameras' worth of data, cloud may be perfect. If I were a genomics institute generating 20-100 TB a month, cloud storage would soon overwhelm my budget.
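To make the arithmetic behind Starr's point concrete, the sketch below combines the per-gigabyte, link and retrieval rates he quotes into a rough monthly bill. The rates are the interview's examples; the two workload sizes (a small shop, a genomics institute a year into operation) are hypothetical illustrations, not figures from the interview.

```python
def monthly_cloud_cost(stored_gb, retrieved_gb, storage_rate=0.10,
                       link_cost=30_000, egress_per_10tb=1_000):
    """Rough monthly cloud bill: storage + dedicated link + egress.

    storage_rate    -- $/GB/month (interview quotes 1c-20c; 10c used here)
    link_cost       -- $/month for a high-speed link ($30K-$120K quoted)
    egress_per_10tb -- $ per 10 TB retrieved ($1,000 quoted)
    """
    egress = (retrieved_gb / 10_000) * egress_per_10tb  # 10 TB = 10,000 GB
    return stored_gb * storage_rate + link_cost + egress

# Small shop: 10 TB stored, 1 TB retrieved, no dedicated link needed.
small = monthly_cloud_cost(10_000, 1_000, link_cost=0)

# Genomics institute adding ~50 TB/month: roughly 600 TB after a year.
large = monthly_cloud_cost(600_000, 50_000)

print(f"small shop: ${small:,.0f}/month")
print(f"large site: ${large:,.0f}/month")
```

At these example rates the small shop pays on the order of $1,100 a month, while the institute is approaching six figures a month and still growing – the escalation Starr warns about.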
EB: Apart from cost, how does tape compare to cloud when looking at capacity?
MS: Cloud is great for small applications and smaller shops: storing 10 TB in the cloud is ideal versus trying to build that infrastructure. Building 1 PB in the cloud costs 10 times more over 3-4 years than rolling your own.
Again, cloud is just someone else's data centre and infrastructure that you are renting. Small data centres do not cost that much more per terabyte to build and maintain than large data centres.
EB: How would you define deep storage?
MS: Deep storage is the set of tiers in the storage realm that provide the reliability, cost effectiveness, density and scalability to let customers contain cost while still delivering an SLA that meets the needs of the business.
90% or more of all data has a well-known lifecycle, and most of it needs to be kept for an extended period of time. The value of this data may be known or unknown, but as the need to access it recedes, the data should be moved to the deeper tiers.
EB: Why should businesses opt for it?
MS: Cost, ownership and control are the three that come to mind first. Cost is obvious once you roll out a complete TCO analysis. The second and third come down to ownership and control. Not being able to retrieve your data at a specific rate without incurring additional cost can be burdensome.
The same goes for not knowing the complete SLA, provenance or curation of a data set. In short, if a company's data is its business, why would it outsource the manufacturing, warehousing (storage) and delivery of its product?
For NASCAR, Discovery and NBC, the product is digital content. It is not plastic or metal or paper; it is a digital stream. So, they need ownership and control of this product. The same is true for genomics, oil and gas and so on.
EB: How will big data and IoT drive data growth?
MS: My view is that both of these will drive data growth, but each in a unique way. My opinion is that IoT devices themselves will not generate massive amounts of raw data; instead, the bulk of IoT data will be generated when that raw data is analysed and processed.
It is when the data is put into a visual form – where the user or consumer can see all of the data points on a visualisation wall – that IoT will generate massive amounts of data. Big Data, I believe, is somewhat the polar opposite: taking massive data sets and crushing them into a result that could be as small as yes or no.