IBM has created technology that recognises spoken words with accuracy ever closer to human parity.
IBM reached a new AI milestone in speech recognition, achieving an industry record of 5.5% word error rate using the Switchboard linguistic corpus.
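For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration, and it is not IBM's scoring pipeline, which the article does not describe.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: splits on whitespace and applies no text normalisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over six words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

On this definition, a 5.5% word error rate corresponds to roughly 11 word errors for every 200 words in the reference transcript.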
The company broke the industry record by extending its deep learning technologies and incorporating an acoustic model that learns from positive examples while taking advantage of negative ones.
The model gets smarter and performs better when similar speech patterns are repeated.
IBM achieved another major AI milestone in conversational speech recognition last year with a computer system that reached a word error rate of 6.9%.
To reach the latest 5.5% breakthrough, IBM combined long short-term memory (LSTM) and WaveNet language models with three strong acoustic models.
IBM’s new AI milestone adds to recent advancements it made in speech technology. Last December, diarization was added to the company’s Watson Speech to Text service, marking a step forward in differentiating individual speakers in a conversation.
IBM cognitive computing vice president Michael Karasick said: “These speech developments build on decades of research, and achieving speech recognition comparable to that of humans is a complex task. At IBM, we are dedicated to creating the technology that will one day match the complexity of how the human ear, voice and brain interact.
“This progress will have important implications for how man and machine collaborate in the future, making the interactions more natural and productive. We believe it is only a matter of time before we achieve parity on speech recognition with humans.”
IBM principal research scientist George Saon said in a blog post, “While we are energised by our progress, our work is dependent on further research—and most importantly, staying accountable to the highest standards of accuracy possible.”
Previously, human parity was considered to be a 5.9% word error rate. IBM partnered with speech and technology service provider Appen to reassess the industry benchmark and found that human parity is lower than what anyone has yet achieved: 5.1%.
As part of its research efforts, IBM connected with various industry experts to get their input on the matter.
University of Montreal’s Institute for Learning Algorithms leader Yoshua Bengio said more work should be done to reach human parity.