By Mark Barry of Luminoso
As we all go about our day using the internet, talking to brands, and interacting online, we don’t adhere strictly to the writing guidelines of the Associated Press. We use a lot of slang, jargon, technical corporate terms, figurative language, and other communication shorthands — unstructured digital data that other humans understand easily, but that confounds Artificial Intelligence (AI).
The problem is that we tend to use very colorful, interesting, and funny language because we know that our words could reach a larger audience that way. As linguistic philosopher Paul Grice suggested, “we are concise in our language, leaving out things we expect everyone to know because we don’t want to be considered boring.” This results in data that is very unpredictable, disorganized, and constantly evolving, ultimately leaving AI in the dark because it lacks a little something called common sense.
This glue, common sense, holds language together, and because it is so often left out of datasets, it makes language uniquely difficult for AI systems to understand. To close that gap, we need to give AI a deep, intuitive understanding of the language we use, along with the nuances and expressions that come with it.
As linguist George Lakoff suggests, we understand complicated concepts by making analogies to simpler ideas. For example, in English, when we talk about data, we often borrow the vocabulary of water, as in "information flow," to convey a core concept.
For instance, if you and I were having a conversation, I would never feel the need to tell you that if I drop something, it falls down. I expect you to know that. I would also expect you to know that when you get your morning coffee, it comes in a mug. These unspoken details are what AI systems need to make the right connections and contextualize our conversations.
When businesses started trying to understand digital customer feedback, they began with sentiment analysis: the ability to tell whether a statement is positive or negative. But they hit the ceiling quickly once they realized that sentiment analysis misses the important stuff. It doesn't truly capture why something is positive or negative, how to address it, or which aspects are more positive than others, let alone whether you're even asking the right questions in the first place.
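That ceiling is easy to see in a toy example. Here is a minimal lexicon-based sentiment scorer (the word lists are hypothetical and far smaller than any real lexicon): it can label a review, but it cannot say why the review is negative or which aspect of the product drove the score.

```python
# Minimal lexicon-based sentiment sketch. The word lists are illustrative,
# not a real sentiment lexicon.

POSITIVE = {"great", "love", "comfortable", "fast"}
NEGATIVE = {"bad", "hate", "slow", "funky"}

def polarity(text: str) -> int:
    """Return +1 for net-positive wording, -1 for net-negative, 0 if tied."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)

review = "love the seats but the smell is funky"
print(polarity(review))  # 0: the positive and negative words cancel out,
                         # and the scorer cannot tell us the smell is the issue
```

A mixed review like this one scores as neutral, which is exactly the kind of signal loss the paragraph above describes: the interesting part of the feedback (the smell) disappears into a single number.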
Over time, AI became smarter thanks to various deep learning techniques, but a different problem emerged: lack of data. Case in point: when Google's DeepMind AI began learning to beat human players at the game of Go, it was crunching data on three million Go games. But that wasn't enough. Once they ran out of data, they had the system play against itself. That approach doesn't transfer, though: to teach an AI to understand what customers are saying, you can't exactly have it talk to itself to generate new training data.
It became clear that to build an AI that could understand customer feedback and add value to the conversation, we needed to provide more sources of knowledge about how the world works, beyond the limited data in customer communications.
Developing Common Sense AI
ConceptNet is a large, open-source knowledge base created at the MIT Media Lab in 2006. It represents concepts, not just words, so that relations between them can be expressed and understood as a graph.
At Luminoso, our approach to embedding common sense into our products is to take background knowledge from ConceptNet and pair it with machine learning. This allows the system to adapt to new datasets without the need for heavy resources or lots of training data.
In other words, the system aims to deliver the kind of machine learning and natural language understanding you hear everyone talking about, without the need for human intervention.
Luminoso needs only a small amount of data, the kind you already have, so you can apply it to all sorts of unstructured text and extract insights to share throughout the organization.
To demonstrate the value of common sense AI in the wild, here's a quick use case: not long ago, Luminoso was working with a Japanese car manufacturer, looking at dealership feedback to help unearth issues customers were talking about but the company hadn't yet addressed.
These customers were coming into the dealerships saying their cars smelled bad, and they used a lot of different words to describe the smell. Some described it as a wet dog even though they didn't have a dog, some thought it smelled like an attic, and others simply said it smelled funky. And all of this feedback was being categorized by customer service agents as "other," which is that bucket in your call center where unanticipated things that are truly going wrong go to haunt you.
Luminoso investigated this "other" bucket and found that these different descriptors for a bad odor were related to reports of dew or condensation inside certain models. The manufacturer looked into those models and found that the air conditioning hose was coming undone, causing water to leak into the car, leading to condensation, mold, and, lo and behold, a bad smell.
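The core move in that story, rolling many different surface phrases up to one underlying concept, can be sketched as follows. The mapping here is hand-built and hypothetical; in a real system it would come from background knowledge plus learned word associations rather than a hard-coded table.

```python
# Hypothetical sketch: grouping varied "other"-bucket descriptors under a
# shared concept. CONCEPT_MAP is illustrative, not a real model output.

CONCEPT_MAP = {
    "wet dog": "bad odor",
    "attic": "bad odor",
    "funky": "bad odor",
    "musty": "bad odor",
    "rattle": "noise",
}

def bucket(comments):
    """Count how many comments roll up to each underlying concept."""
    counts = {}
    for comment in comments:
        for phrase, concept in CONCEPT_MAP.items():
            if phrase in comment.lower():
                counts[concept] = counts.get(concept, 0) + 1
                break
    return counts

other_bucket = [
    "Smells like a wet dog in here",
    "Kind of an attic smell",
    "The interior smells funky",
    "There is a rattle from the dash",
]
print(bucket(other_bucket))  # {'bad odor': 3, 'noise': 1}
```

Once three superficially unrelated complaints collapse into one "bad odor" count, the pattern becomes visible enough to investigate, which is how the condensation problem surfaced.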
Being able to tell dealerships how to address customers who come in with this issue was a huge win for overall customer satisfaction, and it positioned the manufacturer to unearth similar insights that might otherwise have eluded them.
Ultimately, with the right common sense foundation in natural language understanding, these “lost in translation” moments become easy to reveal along with other “unknown unknowns” that companies spend years trying to get to the bottom of. So whether your goal is to improve customer or employee satisfaction, you’ll need a little common sense to really hear what they’re saying.
For more information, go to Luminoso.