News: IBM researchers think that they can cut the horsepower and learning times by using “resistive” processing units.
Tech giant IBM has developed a new technology that can speed up the training for deep neural networks (DNNs).
Though DNNs can be taught to perform almost any task, training them is time-consuming and complex. Training artificial intelligence (AI) systems typically requires supercomputers or data centres running for many days.
In a research paper titled 'Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices', authors Tayfun Gokmen and Yurii Vlasov said: "In recent years, DNNs have demonstrated significant business impact in large scale analysis and classification tasks such as speech recognition, visual object detection, pattern extraction, etc.
"Training of large DNNs, however, is universally considered as time consuming and computationally intensive task that demands datacenter-scale computational resources recruited for many days."
But the scientists at IBM's T.J. Watson Research Center have come up with "resistive processing units" that can sharply reduce the horsepower and learning time required.
"We proposed a concept of resistive processing unit (RPU) devices that can simultaneously store and process data locally and in parallel, thus potentially providing significant acceleration for DNN training.
"The tolerance of the training algorithm to various RPU device and system parameters as well as to technological imperfections and different sources of noise has been explored," the paper said.
The proposed RPU device minimises data flow during training by storing and updating the weight values locally. This mechanism makes it possible to fully exploit the locality and parallelism of the training algorithm.
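The local update the paper describes is, in effect, the rank-one outer-product step of stochastic gradient descent applied at every cross-point device at once. A minimal NumPy sketch of that update (the layer sizes, learning rate, and variable names here are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative sizes (not from the paper): one fully connected layer.
n_inputs, n_outputs = 4, 3
rng = np.random.default_rng(0)

# An RPU crossbar stores one weight per cross-point device.
weights = rng.standard_normal((n_outputs, n_inputs))

x = rng.standard_normal(n_inputs)       # input activations
delta = rng.standard_normal(n_outputs)  # backpropagated error signal
lr = 0.01                               # learning rate (illustrative)

# A conventional processor computes this outer product and shuttles
# weights between memory and compute units; an RPU array would apply
# the same update at every cross-point in place and in parallel,
# with no weight movement at all.
weights -= lr * np.outer(delta, x)
```

The point of the sketch is that the update touches every weight independently, which is why keeping the weights stationary in the device array removes the data-movement bottleneck.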
"For large DNNs with about 1 billion weights this massively parallel RPU architecture can achieve acceleration factors of 30,000X compared to state-of-the-art microprocessors while providing power efficiency of 84,000 GigaOps/s/W," it said.
Problems that currently require many days of training across thousands of machines could be solved in hours on a single RPU accelerator.
"Most of the recent DNN architectures are based on combination of many convolution and fully connected layers with a number of parameters of the order of a billion. Our analysis demonstrates that a single RPU accelerator chip can be used to train such large deep neural networks," the paper said.
Global tech companies, including IBM, have been competing fiercely to become the top player in the AI space. The company that emerges as a dominant player in AI is expected to lead the tech industry in the coming years.
Machine learning specialist Pedro Domingos told The New York Times: "Whoever wins this race will dominate the next stage of the information age."
The market for machine learning applications is estimated to reach $40bn by 2020. About 60% of those applications are expected to run on platform software offered by Amazon, Google, IBM and Microsoft.
Jeff Dean, the computer scientist who manages Google's AI development, said that intelligent software applications would become "commonplace", adding that machine learning will affect every sector.
Watson division general manager David Kenny said: "It’s early days, but the long-term goal is to have hundreds of millions of people use Watson as self-service AI."