New software could transform voice recognition systems

Bringing natural conversation to technology

New software that interprets the spoken word is poised to revolutionise voice recognition systems, providing a more seamless interface between people and their mobile devices, televisions and cars.

The spoken dialogue technology, which is being developed by VocalIQ, a spin-out from the University of Cambridge’s Dialogue Systems Group, was designed to enhance automated voice recognition interfaces, which rely heavily on predefined commands. VocalIQ’s software, which is based on more than 10 years of research, lets users talk more naturally with their smart devices. Instead of merely recognising speech, the technology is able to understand and interpret dialogue. It can also learn online, so that when it makes a mistake it learns from it and avoids repeating it. The more the software is used, the smarter it gets.

So, rather than merely digesting the user’s order to “find a restaurant,” the software learns to understand the nuances of a more natural conversation. A typical exchange might begin with a user telling his mobile, “I don’t care where we eat but I need to find a nice restaurant for my girlfriend.” The software might then respond, “There’s a really nice place to eat, it has good reviews, and it’s a 10-minute walk from your location. Are you OK with that?”

“There are no commands for the user to learn,” said Blaise Thomson, CEO and co-founder of VocalIQ, who is an expert in machine learning and dialogue system design. “It’s about having a conversation.”

The applications for the new software are many, ranging from video gaming to wearables such as smart watches and glasses. VocalIQ is currently working on a prototype application for one of the world’s largest car manufacturers.

“There were a billion smart devices made last year,” said Thomson, noting that most of them are neither easy to use nor safe when a user is on the move. Each year in the United States, driver distraction (with calls, texting and emails among the factors) contributes to 16% of all fatal crashes, leading to around 5,000 deaths, according to the AAA Foundation.

“For all of the many devices we use, we want to find a way to get what we need, in the easiest, safest way possible,” Thomson said. “That’s where voice comes in.”

VocalIQ has received £750k in seed financing led by technology investor Amadeus Capital Partners. Cambridge Enterprise, the commercialisation arm of the University of Cambridge, is also investing.

VocalIQ’s chairman, Steve Young, is Professor of Information Engineering at the University of Cambridge. He was the original developer of the HTK speech recognition toolkit and co-founder of Entropic, an innovator in speech technologies, which was acquired by Microsoft.

“The use of speech to interact with machines has reached a tipping point,” says Young. “Without smart conversational interfaces which can adapt to suit the user, the Internet of Things cannot flourish. VocalIQ intends to be the prime supplier of these smart conversational interfaces.”
