Speech recognition technology has been around for more than half a century, though its early uses, like voice dialing or desktop dictation, certainly don’t seem as sexy as today’s burgeoning virtual agents or smart home devices.
If you’ve been following the speech recognition technology market for any length of time, you know that a slew of significant players emerged on the scene about six years ago, including Google, Apple, Amazon and Microsoft (in a brief search, I counted 26 U.S.-based companies developing speech recognition technology).
Since that time, the biggest tech trendsetters in the world have been picking up speed and setting new benchmarks in a growing field, with Google recently providing open access to its new enterprise-level speech recognition API. While Google certainly seems to have the current edge in the market after substantial investments in machine learning systems over the past couple of years, the tech giant may yet have an Achilles’ heel that keeps it from owning an important segment of the global market: lack of access to China.
The six-year ban on Google in China is well known, and aside from the very rare lapse in censorship, the block seems firmly in place for the foreseeable future. China has the world’s largest population and more mobile users than any other country, and a majority of them use voice-to-text capabilities to initiate search queries and navigate their way through the digital landscape.
Google may be missing out on reams of Mandarin audio data, but Baidu hasn’t missed the opportunity. As China’s largest search engine, Baidu has collected thousands of hours of voice-based data in Mandarin, which it fed to its latest speech recognition engine, Deep Speech 2. Using deep learning algorithms, the system learned on its own how to translate some Mandarin to English (and vice versa).
The Baidu team that developed Deep Speech 2 was primarily based at its Sunnyvale AI Lab. Impressively, the research scientists involved were not fluent in Mandarin and knew very little of the language.
Alibaba and Tencent are two other key players in the Chinese market developing speech recognition technology. Though both use deep learning platforms, neither company has gained the level of publicity and coverage that Baidu’s Deep Speech 2 has received.