Microsoft's speech recognition software reaches 5.1% error rate

The speech recognition system built by Microsoft's speech and dialog research group now has a 5.1 percent error rate, according to an Aug. 20 company blog post by Xuedong Huang, PhD, a technical fellow at Microsoft.

The 5.1 percent error rate sets an industry record for conversational speech recognition, according to Dr. Huang. The milestone, about 12 percent lower than last year's 5.9 percent, is what Microsoft describes as human parity: an error rate on par with that of professional human transcribers.

To reach this benchmark, the researchers used deep learning software and sought to improve their language models. These models are now able to analyze the entire history of a dialog to predict and contextualize what words are likely to come next.

"The speech research community still has many challenges to address, such as achieving human levels of recognition in noisy environments with distant microphones, in recognizing accented speech or speaking styles and languages for which only limited training data is available," Dr. Huang noted.


Copyright © 2024 Becker's Healthcare. All Rights Reserved.

 
