By Dr. Lance Eliot, the AI Trends Insider
With the advent of ride sharing, passengers are increasingly interacting with the hired or rented driver of the car.
In the past, we were used to getting into a taxi and, other than mumbling where we wanted to be taken, the actual conversation was usually either nonexistent or quite stilted. Today’s Uber and Lyft drivers are typically bent on engaging their passengers in lively conversation. How’s your day coming along, or gosh, isn’t the weather pleasant, goes the conversation. Some of these drivers do this to try to keep their ratings high, since otherwise a passenger might dock them points and they would end up in the doldrums of the ratings. Other drivers instigate dialogue because they like to talk and have taken a ride sharing driving role to meet new people, in addition to getting some extra dough beyond whatever other job they have.
Many riders find this cheery dialogue a refreshing change from past experiences with taxi and shuttle drivers and will chat things up with the driver throughout the driving journey. On the other hand, some riders prefer golden silence and the moment of reflection that can occur while getting a lift, and savvy drivers are usually quick to allow a quiet atmosphere in the car in such cases. Less savvy drivers try to prod the passenger into a conversation and, even worse, at times create tension in the car as they try to ferret out why the passenger won’t talk. Drivers who ignore the preference of their rider are at times shocked to get a low rating, since it seemed like all was well, in spite of the fact that the rider gave only short answers or opted not to talk at all.
If you closely examine the dialogue between the driver and the passenger, you can readily find overall patterns in the nature of the conversations. At the start of the trip, the driver and rider do a quick verbal exchange to clarify the specifics of the trip, which have normally been electronically transmitted to the driver beforehand. The driver confirms that the passenger is the indicated rider who reserved the ride, and the passenger confirms that the driver is the assigned driver for the ride. There is usually also a confirmation about the destination. Once these preliminaries have been undertaken, the driver often provides some indication about the upcoming journey, such as how long it will take, or offers insights about the traffic situation. At this point, no particular conversation is presumably required, and the rest of the ride can pass pretty much in silence.
Once the destination gets close, the driver and rider exchange some more verbal indications of what’s about to happen. The driver might offer that they will be letting the rider off at the curb up ahead, and the rider might ask instead to have the driver pull into a nearby driveway. These bits of conversation, taking place at the start and end of the journey, constitute the minimum amount of conversation that normally takes place. Of course, in theory, the ride could occur in stony silence from start to finish, since the destination was electronically provided, but by and large there is conversation between driver and rider for at least the start and finish of the ride.
During the ride, there might be other task-oriented aspects that the driver or rider brings up. For example, perhaps the driver turns onto a street where a road crew is laying asphalt and the road is torn up. The driver might mention to the rider that a U-turn is in order or might say that the street work could delay getting to the destination by a few minutes. There might also be conversation initiated by the rider midway through the trip, wanting to know if the driver thinks the destination will be reached on time.
All of these snippets of conversation concern the driving and the ride itself, and are tightly focused on the immediate task at hand. The conversational space that exists between the start and the finish of the ride can be filled with other non-task elements that are off the topic of the driving and ride aspects per se. For example, a rider might ask about the locale, such as what kinds of things to do in the area, or whether the area is safe for walking around. The driver might bring up suggested local places to visit or restaurants to eat at. Veering further from the mainstay of the trip, sometimes a driver will ask what kind of work the rider does, or the rider might ask the driver about how they came to be a ride sharing driver.
Throughout all of this conversational dialogue, there is a complex series of socio-behavioral efforts taking place. Whether we realize it or not, this multi-party exchange involves subtleties of language, including the comprehension of the words spoken, along with gauging the social, behavioral, and cultural context in which it is taking place. Though it might seem at first glance to be effortless, there is actually a great deal of embedded cognitive activity occurring. There is an ongoing shifting of control during the discourse, and the dialogue involves moments of agreement, disagreement, involvement and retreat from involvement, along with many other social roles and states.
The fluidity of the conversation is shaped by the capability of the parties involved in the conversation. A driver who is not adept at making conversation will likely end up with a very abrupt-feeling dialogue with the rider. A rider who doesn’t want conversation can be upset when forced into one by a less-than-savvy driver. In this duet of a conversational dance, the driver and rider are linked together for the duration of the driving journey, and the nature, depth, scope, focus, and joy or displeasure of the ride is bound to be shaped by the conversation that occurs (or doesn’t occur).
When you look at the driving journey in this light, it certainly seems like a lot for a minimum wage driver to have to shoulder responsibility for. No wonder it has historically been easier to just go along with silence.
What does all of this have to do with AI self-driving cars?
At the Cybernetic Self-Driving Car Institute, we are developing AI software that can interact with the occupants of a self-driving car.
The need for Natural Language Processing (NLP) in AI self-driving cars is rather apparent and a predicted “must” for the future of self-driving cars. (See my past column on In-Car Voice Commands NLP for Self-Driving Cars.)
Advances in socio-behavioral computing are gradually getting us closer to having the kind of dialogue and conversations that a human driver has with a human rider. In this multi-party arrangement, we are substituting the AI for the role of the human driver. Thus, as an autonomous agent, the AI driving the car interacts with the human passenger, doing so for purposes of (at least) being able to appropriately ensure that the driving journey and driving task occur.
There is nothing that prevents us from having the AI branch out to other aspects of conversation, but as a minimum we should expect that the AI will be able to cover the topic of the driving and rider aspects that pertain to the driving journey. It might be nice to have the AI also be able to discuss the weather or the best restaurants in town, but if it cannot do the needed job of ferreting out aspects related to the driving task, we’d be remiss, since the core job itself would not be getting done.
Imagine the frustration of the rider if the AI did a great job of discussing the works of Shakespeare, but in the meantime the AI arrived at the wrong destination or took a turn down a blind alley and the self-driving car got into an untoward situation. Clearly, the AI for the self-driving car must first be able to handle whatever is required for the self-driving car and the riders within, and only then can it also cover other non-task related aspects.
Recent research such as the DSARMD (Detecting Social Actions and Roles in Multiparty Dialogue) system is an insightful sign of what can be done in analyzing conversations between humans and between humans and artificial autonomous agents. By collecting on-line chats, the researchers have been able to analyze observable linguistic features and automatically extract meaning from the dialogue. The researchers explored various communicative links, dialogue acts, localized topics, and topic reference polarity aspects. They also make use of meso-topics, which are topics that persist throughout various twists and turns in a conversation.
We can apply these same techniques towards the AI of self-driving cars. By collecting and studying dialogues between drivers and riders, it is possible to establish the nature of the discourse and how the AI should engage in appropriate dialogue during the driving journey. We can then develop AI systems based on those analyses.
Let’s take a look at some examples.
We have here a rider that we’ll refer to with the initials of JS, interacting with a prototype AI driver interaction system called Kitt (a bit of humor, as the name is a play on the once popular TV series about a talking car).
01 – Kitt: Is your destination the Albertson’s on Beach Boulevard?
02 – JS: Yes, it’s the one that is near the In-and-Out Burger.
03 – Kitt: Okay, that Albertson’s is about a block from the In-and-Out Burger. Let me know when you are ready to proceed to the Albertson’s.
04 – JS: Let’s go.
Notice that the AI system indicated the destination by stating the name of the place and the street that it was on. The AI could have instead merely recited the actual street address, but it used a more colloquial way to refer to the destination, giving it less of a robotic feel. This choice, though, relied on the realization that there wasn’t another Albertson’s on Beach Boulevard, since if there had been another one then the indication of the destination would have been ambiguous.
When JS responded, he provided what at first glance seems like an easy answer, using the word “yes” to respond to the question. But he then added some additional information, and the AI cannot just ignore the rest of the response. In this case, JS has apparently tried to clarify which Albertson’s by saying that it is near another location, namely an In-and-Out Burger. I’ve just pointed out that there isn’t another Albertson’s on that street, and so you might wonder why JS would believe he has to clarify where the Albertson’s is. It could be that he (falsely) thinks there’s another one on Beach Boulevard, or that he just wanted to make sure that the AI wasn’t confused about the location.
The AI echoes back to the rider that the Albertson’s is indeed near to the In-and-Out Burger, which provides reassurance that both the driver and the rider are aiming for the same destination. If the AI had discovered that the In-and-Out Burger wasn’t anywhere near the Albertson’s, the dialogue would have gone in a different direction. Suppose there is a Ralph’s on Beach Boulevard and it is near to the In-and-Out Burger, in which case, the AI would have asked JS whether he meant to say Ralph’s rather than the Albertson’s.
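To make the landmark check a bit more concrete, here is a minimal sketch of how the AI might compare the rider's landmark against the stated destination. Everything in it is an assumption for illustration: the `Place` record, the made-up block coordinates, the `NEAR_BLOCKS` cutoff, and the `check_destination` function are hypothetical and are not taken from our prototype.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Place:
    name: str
    kind: str   # e.g. "grocery" or "restaurant"
    x: float    # position in city blocks on a made-up grid
    y: float


# Hypothetical map data, invented for this illustration.
PLACES = [
    Place("Albertson's", "grocery", 0.0, 0.0),
    Place("In-and-Out Burger", "restaurant", 0.0, 1.0),
    Place("Ralph's", "grocery", 0.0, 9.0),
]

NEAR_BLOCKS = 2.0  # assumed cutoff for what counts as "near"


def blocks_between(a, b):
    """Straight-line distance between two places, in blocks."""
    return hypot(a.x - b.x, a.y - b.y)


def check_destination(dest_name, landmark_name, places=PLACES):
    """Return (status, reply) after comparing the destination with the landmark."""
    dest = next((p for p in places if p.name == dest_name), None)
    landmark = next((p for p in places if p.name == landmark_name), None)
    if dest is None or landmark is None:
        return "unknown", "I could not find that place. What is the address?"

    gap = blocks_between(dest, landmark)
    if gap <= NEAR_BLOCKS:
        # The landmark agrees with the destination, so echo it back.
        reply = (f"Okay, that {dest.name} is about {gap:.0f} block(s) "
                 f"from the {landmark.name}.")
        return "confirmed", reply

    # The landmark disagrees: look for a similar kind of place near it.
    for other in places:
        if (other is not dest and other.kind == dest.kind
                and blocks_between(other, landmark) <= NEAR_BLOCKS):
            reply = (f"The {landmark.name} is near {other.name}, "
                     f"not {dest.name}. Did you mean {other.name}?")
            return "mismatch", reply

    reply = (f"The {dest.name} is not near the {landmark.name}. "
             f"Shall I still head to {dest.name}?")
    return "mismatch", reply
```

With the sample data, `check_destination("Albertson's", "In-and-Out Burger")` comes back with a "confirmed" status. If the coordinates were swapped so that the Ralph's sat next to the burger place, the same call would instead come back as a "mismatch" and ask whether the rider meant Ralph's. A real system would of course use proper map data and driving distances rather than a toy grid.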
In any case, the AI next wanted to know whether JS was ready to proceed. This is a simple question but an important one. Just because JS has gotten into the car and closed the door, and confirmed the destination, does not mean that the AI can start ahead with the driving journey. Suppose JS is still getting settled into the passenger seat or otherwise is not ready to go. A simple question and a simple answer can ascertain whether to start or not.
Here’s another example:
01 – JS: I need to get my wallet, which I left at home.
02 – Kitt: Are you saying that you want to first go home, before going to the Albertson’s?
03 – JS: Yes, I need to go home first. I can’t believe that I left my wallet. It seems like I keep forgetting to take it with me, probably because I bought this new pair of pants and the wallet doesn’t fit well.
04 – Kitt: Confirming that you want to go home before going to Albertson’s. Changing direction now.
05 – JS: Thanks.
Notice that the rider has opted to alter a significant aspect of the driving journey. He has provided an indication that rather than directly going to Albertson’s, he needs to be driven home first. There is some extraneous conversation involved, in that he says he needs to get his wallet. Getting his wallet is not a clear-cut aspect of the driving journey and would be difficult to use as a means to determine that the destination needs to be changed. His indication that he left it at home implies that he needs to go home. But does he need to go home after visiting Albertson’s or before visiting Albertson’s? The likely guess is before, but the AI needs to get clarification from JS.
JS confirms that he wants to go home first. He then goes off-topic in terms of the destination and provides an indication about his wallet. This snippet cannot be ignored by the AI, since there might be some other hidden instructions in it. After parsing his small soliloquy, the AI did not find anything pertinent to the driving task, and so opts to confirm that it will go ahead and head to his home first.
In the future, a more sophisticated social-behavioral computing capability would perhaps entertain the discussion about his wallet and his pants. A human driver would likely carry on such a discussion, perhaps lamenting how easy it is to forget a wallet or sympathizing with the difficulty of carrying the wallet in the new pants.
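The handling of this exchange can be roughed out in code. The sketch below is a deliberately crude keyword-based stand-in for real natural language processing, and it is an assumption on my part rather than a description of the prototype: the word lists (`KNOWN_PLACES`, `ORDER_WORDS`, `DRIVING_TERMS`) and the functions `interpret` and `pertinent_sentences` are all hypothetical. It shows two steps: spotting a possible extra stop whose ordering is unclear (which triggers the clarifying question), and scanning a rambling reply for any sentence that bears on the driving task.

```python
import re

# Hypothetical vocabularies, chosen only to fit the sample dialogue.
KNOWN_PLACES = {"home", "albertson's"}
ORDER_WORDS = {"first": "before", "before": "before",
               "after": "after", "afterwards": "after", "later": "after"}
DRIVING_TERMS = {"go", "stop", "turn", "drive", "pull", "wait",
                 "destination", "route"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower().replace("’", "'"))


def interpret(utterance, current_destination):
    """Look for a place other than the current destination and an ordering cue."""
    tokens = tokenize(utterance)
    waypoint = next((t for t in tokens
                     if t in KNOWN_PLACES and t != current_destination), None)
    order = next((ORDER_WORDS[t] for t in tokens if t in ORDER_WORDS), None)
    return {
        "waypoint": waypoint,
        "order": order,
        # A place was mentioned, but not whether it comes before or after.
        "needs_clarification": waypoint is not None and order is None,
    }


def pertinent_sentences(utterance):
    """Keep only the sentences that mention a driving term or a known place."""
    sentences = re.split(r"(?<=[.!?])\s+", utterance.strip())
    keep = []
    for sentence in sentences:
        words = set(tokenize(sentence))
        if words & DRIVING_TERMS or words & KNOWN_PLACES:
            keep.append(sentence)
    return keep


first = interpret("I need to get my wallet, which I left at home.",
                  "albertson's")
reply = ("Yes, I need to go home first. I can't believe that I left my "
         "wallet. It seems like I keep forgetting to take it with me, "
         "probably because I bought this new pair of pants and the wallet "
         "doesn't fit well.")
second = interpret(reply, "albertson's")
relevant = pertinent_sentences(reply)
```

For the first utterance, the sketch finds the place "home" but no ordering cue, so it flags that clarification is needed. For the reply, it picks up "first" as meaning before, and only the opening sentence survives the pertinence filter. Note that a word such as "left" is intentionally kept out of the driving terms here, since "I left my wallet" and "turn left" would otherwise be confused, which is exactly the kind of ambiguity a genuine NLP capability has to resolve.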
All of this conversation requires attention to the lexical, semantic, and syntactic aspects of human interaction. In the first of the two conversations, the AI took a leadership role in terms of shaping the direction of the conversation (doing so to affirm the desired destination). The human rider took over leadership in the second conversation, wanting to change the direction of the driving journey. There are predictable patterns of social action signatures (SAS) that exist in the driving journey task and topics arena, thus allowing the AI to provide a rather focused and on-target conversational assessment.
How far beyond the driving task should we have the AI go? Some would say that the AI should not venture far beyond the driving task. If it does so, it might mislead the rider into believing that the AI can do more than it can actually do. Most people who use Alexa and Siri today are aware that these are not fully capable forms of natural language processing (NLP), in the sense that they do not comprehend in the same way that humans do. This is rather obvious after even just a few minutes of “conversing” with those NLP systems.
Critics concerned about the AI in self-driving cars are quite worried that the AI might appear to be more capable than it is, and that the human occupants will falsely believe that the AI can counsel them or otherwise do things it cannot do. As such, they want the AI to be good enough to carry on a limited conversation about the driving task, but not do so in a manner that suggests it has true human-like comprehension.
We can already predict that humans in AI self-driving cars are likely to anthropomorphize the AI system in the self-driving car. Given that humans spend a lot of time in their cars, it will be easy enough to start becoming comfortable with your autonomous driving chauffeur. Until the time at which the AI has progressed into Artificial General Intelligence (AGI), which is considered broad intelligence encompassing common sense and overall world knowledge, we should perhaps be cautious about how the AI converses about the driving journey and how it interacts with the human occupants. We want clarity of conversation without creating false impressions. Hopefully, someday, it might be a truly wide-open socio-behavioral conversation.