by William Meisel, Editor, VUI News, President, TMA Associates
Arguably, the Graphical User Interface (GUI) made personal computers accessible to everyone. Today, the basics of the WIMP interface (Windows, Icons, Menu, and Pointing device) are intuitive and have transitioned to many other devices, most notably tablets and smartphones.
One could also argue that the GUI is getting overburdened and less intuitive as it supports an increasing number of features. We often find ourselves searching through many overlong menus or an overabundance of icons. The difficulty is compounded on devices with smaller screens and in situations where hands and/or eyes are otherwise occupied (most notably the automobile). The learning curve for interfacing with digital systems is steeper as we deal with an increasing number of devices and services, as well as upgrades to software we thought we had mastered.
This growing problem, which one might summarize as “Digital Overload,” motivates the trend toward a new paradigm: the use of natural language as a layer over all this complexity. The basic user manual of the Language User Interface (LUI) is “say or type what you want.” (Speaking a command only requires that the device have a microphone.) The LUI can be perceptually the same irrespective of device or application, creating a uniform experience across devices. It is further unified if it is identified as your “bot,” “digital assistant,” “virtual assistant,” or “personal assistant,” and given a name. “Siri,” “Cortana,” “Alexa,” and “OK, Google” come to mind, and suggest the resources and competition that can drive this trend. “Hound” from SoundHound is a recent entry that responds to natural language voice inquiries. The apparent success of Amazon’s Echo and its personal assistant “Alexa” as a voice-only natural language interface indicates that complex natural interaction by both text and speech is viable with today’s technology.
Confirming this trend, at Microsoft’s Build 2016 developers’ conference at the end of March, Microsoft CEO Satya Nadella said that “bots are the new applications.” He also spoke of a world where “human language is the UI layer” and Microsoft is participating in “conversational canvases,” a term he applied to any app where people are conversing in natural language, including email, chat, and SMS. Announcements during the conference backed up this point of view: outside developers can now create “intelligent” bots that work inside of Skype and independently of Skype using a Microsoft cloud service. Similar announcements from Facebook on bots in its Messenger service and from Kik for its messaging service are part of a trend toward using natural language to make digital systems easier to use. The trend is supported by machine learning and by development tools that require less data, such as Microsoft’s recent Language Understanding Intelligent Service (LUIS) and Nuance’s Mix.
The LUI does not replace the GUI; it complements it. But the LUI has the potential to supersede the OS’s and web browser’s GUI as the defining interface to digital devices. We’ve seen this before, as web search unified the World Wide Web irrespective of browser. The general personal assistants using a LUI in fact include web search as well as other device- or application-specific functions. Another aspect of the LUI is that it favors an answer over a list of options (e.g., web sites) that require screen space and visual attention. The growing area of machine learning that can summarize the implications of “big data” thus further supports the effectiveness of the LUI. Technical advances are also accelerating in research on automating “dialog,” which makes interaction with machines a conversation rather than a one-off command or inquiry. A dialog system can take into account the context of previous interactions and knowledge of the individual, and it can request information the application needs that the user has not yet provided.
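To make the idea of a dialog concrete, here is a minimal sketch in Python of a system that remembers what the user has already said and asks only for what is still missing. The task (ordering a product), the slot names, and the prompts are all hypothetical and chosen for illustration; a real dialog system would extract these values from natural language rather than receive them as arguments.

```python
from dataclasses import dataclass, field

# Hypothetical pieces of information an illustrative "order" task needs.
REQUIRED_SLOTS = ("product", "quantity")

# Hypothetical follow-up questions, one per missing piece of information.
PROMPTS = {
    "product": "Which product would you like?",
    "quantity": "How many would you like?",
}


@dataclass
class DialogState:
    """Context carried from one turn of the conversation to the next."""
    slots: dict = field(default_factory=dict)

    def update(self, **provided):
        # Merge newly supplied information into what is already known,
        # ignoring anything the user left unspecified (None).
        self.slots.update({k: v for k, v in provided.items() if v is not None})

    def next_prompt(self):
        # Ask for the first required item the user has not yet provided.
        for name in REQUIRED_SLOTS:
            if name not in self.slots:
                return PROMPTS[name]
        return None  # nothing is missing; the request can be fulfilled


if __name__ == "__main__":
    state = DialogState()
    state.update(product="widget")   # user: "I'd like a widget"
    print(state.next_prompt())       # the system asks for the quantity
    state.update(quantity=3)         # user: "three, please"
    print(state.next_prompt())       # None: the order is complete
```

The point of the sketch is only that the conversation state persists between turns, so the user never has to repeat the product name when answering the follow-up question.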
The analogy to web search suggests why every company must be part of the evolution toward the LUI. Every company must have a web site to be found by the search engines. As the search engine evolves toward a personal assistant using a LUI, companies that provide a specialized company personal assistant (a “bot”) with a LUI will be favored over those that drop a user into a GUI-driven web site. The general personal assistants are rapidly allowing connection with specialized personal assistants within the general personal assistant, e.g., a command such as “Ask XYZ how much a widget costs” or a text message within the general personal assistant directed at a company-specific assistant/bot.
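As a rough illustration of how a general assistant might hand a request to a company-specific bot, the Python sketch below recognizes commands of the form “Ask XYZ ...” and forwards the rest of the sentence. The registry, the `xyz_bot` function, and the pattern are assumptions made for this example; they do not describe how any particular assistant implements the handoff.

```python
import re


def xyz_bot(question: str) -> str:
    # Stand-in for a hypothetical company-specific assistant.
    return f"[XYZ bot] received: {question}"


# Hypothetical registry mapping a spoken company name to its bot.
COMPANY_BOTS = {"xyz": xyz_bot}

# Matches "Ask <company> <question>", ignoring case and outer whitespace.
ASK_PATTERN = re.compile(r"^\s*ask\s+(\w+)\s+(.+?)\s*$", re.IGNORECASE)


def route(utterance: str) -> str:
    """Send "Ask <company> ..." to that company's bot, if one is registered."""
    match = ASK_PATTERN.match(utterance)
    if match:
        company, question = match.group(1).lower(), match.group(2)
        bot = COMPANY_BOTS.get(company)
        if bot is not None:
            return bot(question)
    # Anything else stays with the general assistant.
    return f"[general assistant] handling: {utterance}"


if __name__ == "__main__":
    print(route("Ask XYZ how much a widget costs"))
    print(route("What is the weather today?"))
```

A company without a registered bot simply falls through to the general assistant, which mirrors the article's argument: the company with a bot gets the conversation, and the one without it does not.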
Companies can benefit from the ease of use of a LUI dedicated to their business to create and keep customers, and they will eventually find having such an option a necessity. Fortunately, specialized assistants have a limited domain of language they are expected to understand. Creating them is easier than the task handled by the generalized personal assistants.
In technology, we often note the power of exponential growth, Moore’s Law being the obvious example. There is a similar potential for rapid expansion of effectiveness in natural language technology. Today’s speech recognition and natural language understanding technology is driven largely by the ability to do deep analysis quickly on today’s more powerful platforms. The technology has passed the “tipping point” of utility: an increasing number of people are using it. That usage generates more data that can be used to improve performance and coverage, and the improvement encourages more people to use the LUI, which in turn generates still more data. Because each gain feeds the next, the rate of improvement will be exponential.
The human voice can express information through language. It also has distinct characteristics that vary by individual. Using those biometric characteristics for authentication can provide an additional layer of security and privacy. In a device used by multiple users either simultaneously or at different times, individual differences can be used to distinguish speakers and maintain personalization of responses. The LUI thus has potential additional advantages enhancing its use when the interaction is by voice.
Companies that recognize the importance of the LUI trend early will have a competitive advantage. This is particularly true since the technology needs data on how customers react in order to improve. Companies can start modestly, perhaps by using natural language in their IVRs or in automated text chat on web sites. This builds a database that can be used to move toward powerful company personal assistants that become part of the company’s brand. It’s likely that such interactive assistants will eventually be a major vehicle for advertising, with support for their development potentially coming in part from the marketing budget. An “ad” where the customer can ask all the questions they want and can buy the product by simply saying “I’m sold” would have an obvious attraction.
What about the difficulty of the technology? Is it available only to companies with expertise in language technologies? Of course not. Vendors have tools and templates, as well as services, that make the LUI available to any company. As with any major trend, vendors offering LUI tools will try to make money by making it easier for their customers to be part of that trend.
The LUI is here, and its use and effectiveness will grow exponentially. Companies should make it an important part of their technology and marketing plans.
Originally published by the Software Society