Here’s Why Developing Natural Language Interface Is Hard

In this article I would like to show why a natural language interface (NLI) is often so hard to build. To illustrate, I’ll use a semi-trivial example and show that even for this simple use case a natural language interface is a formidable problem to solve.

Any Chance Of Rain?

For our example, let’s imagine we want to build (yet another) weather bot. Our bot will answer weather-related questions for a given city and date range. It will also support past, present and future (forecast) weather requests. Our goal is a natural language interface that is as close to human cognition as possible, i.e. users should feel as if they are talking to a real human being. In other words, we are trying to achieve that elusive free-form natural language comprehension.

Basic Intents

Let’s start with simple and obvious examples of the requests we need to support:

What’s the current weather in New York?
Show me San Jose forecast for the next 5 days

These requests seem rather trivial to encode using common intent-based matching. You basically have to detect three main entities:

  • A weather request indicator
  • A city name
  • A date range

Once you’ve built the model (in whatever tool you prefer) to detect all three types of entities, you can fairly quickly build an action (i.e. an intent callback) that returns weather information for a given city and date range.
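As a rough illustration of this basic level, here is a minimal Python sketch of detecting the three entities with keyword and regex matching. The vocabularies, the `WeatherRequest` type and the `parse` function are hypothetical names made up for this example; a real toolkit would use a trained or grammar-based model instead.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, tiny vocabularies for illustration only.
WEATHER_WORDS = ("weather", "forecast", "rain", "snow")
CITIES = ("new york", "san jose", "chicago", "moscow")
DATE_PATTERN = re.compile(r"\b(today|tomorrow|current|next \d+ days)\b")


@dataclass
class WeatherRequest:
    indicator: str             # the word that marks this as a weather request
    city: Optional[str]        # None when no known city is mentioned
    date_range: Optional[str]  # None when no date expression is found


def parse(utterance: str) -> Optional[WeatherRequest]:
    """Return the detected entities, or None if this is not a weather request."""
    text = utterance.lower()
    indicator = next((w for w in WEATHER_WORDS if w in text), None)
    if indicator is None:
        return None
    city = next((c for c in CITIES if c in text), None)
    match = DATE_PATTERN.search(text)
    return WeatherRequest(indicator, city, match.group(1) if match else None)
```

Calling `parse("Show me San Jose forecast for the next 5 days")` yields the indicator `forecast`, the city `san jose` and the date range `next 5 days`. Substring matching like this is deliberately naive, which is exactly the level at which most tutorials stop.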

Most of the tutorials and examples stop right here. However, this is far from being even remotely equal to how real humans converse about weather…

Optional

A pretty obvious first improvement is to treat the city and date range elements as optional. Indeed, if the city isn’t present the user is likely asking about her current location, and if the date isn’t present she’s asking about the current date. These seem to be reasonable assumptions:

What’s my current weather? => result for current date and current location
What’s Chicago’s weather? => result for current date and city of Chicago
Any chance of snow this Friday? => result for city of Chicago and this Friday

However, these defaults cannot be applied blindly: conversation management has to handle them in a special way.
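In isolation, the defaulting rule itself is tiny. The sketch below assumes the bot already knows the user’s location and today’s date; `apply_defaults` and its parameters are hypothetical names used only for illustration.

```python
from datetime import date
from typing import Optional, Tuple


def apply_defaults(city: Optional[str], day: Optional[date],
                   current_location: str, today: date) -> Tuple[str, date]:
    """Fill in a missing city or date with the user's location and today's date."""
    return (city or current_location, day or today)
```

The difficulty is not this function but deciding when it may be applied, because a missing city can also mean "the city we were just talking about".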

Conversation Resolution

Another thing you’ll notice right away is that you need to support conversational context. Frequently, when people inquire about the weather they don’t ask just a single question but have follow-ups. For example:

What’s the current Moscow weather? => result for Moscow today
Hm, what about tomorrow? => result for Moscow tomorrow
Any chance of rain? => result for precipitation in Moscow tomorrow

While in everyday life these seem rather trivial, the programmatic logic for supporting this type of conversation management is far from trivial. For example:

  • When does the conversation switch context, so that the previous context should be “forgotten”?
  • When does conversation time out?
  • Which parts can or cannot be taken from previous sentences?

Depending on the framework you use this can be a significant project on its own.
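To make these questions concrete, here is a minimal sketch of one possible policy: missing elements are inherited from the previous turn, and the whole context is dropped after a fixed timeout. The `Turn` and `Conversation` classes and the 5-minute timeout are assumptions for illustration, not how any particular framework works.

```python
from dataclasses import dataclass
from typing import Optional

CONTEXT_TTL_SECONDS = 300.0  # assumed timeout; tune per application


@dataclass
class Turn:
    city: Optional[str] = None
    day: Optional[str] = None
    condition: Optional[str] = None


class Conversation:
    def __init__(self) -> None:
        self.last: Optional[Turn] = None
        self.last_seen: float = 0.0

    def resolve(self, turn: Turn, now: float) -> Turn:
        """Fill gaps in `turn` from the previous turn unless the context expired."""
        expired = self.last is None or now - self.last_seen > CONTEXT_TTL_SECONDS
        prev = Turn() if expired else self.last
        merged = Turn(
            city=turn.city or prev.city,
            day=turn.day or prev.day,
            condition=turn.condition or prev.condition,
        )
        # Remember the merged turn so later follow-ups can build on it.
        self.last, self.last_seen = merged, now
        return merged
```

With this policy, "What’s the current Moscow weather?" followed by "Hm, what about tomorrow?" resolves to Moscow tomorrow. Note that the sketch inherits every element unconditionally, which is precisely the naive behavior that breaks down once ambiguity enters the picture.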

Geographical Ambiguity

Yet another problem you’ll discover pretty quickly as you let users play with your bot is that your current model doesn’t distinguish between these two sentences:

What’s the local Moscow weather?
What’s the local weather?

In the first example the user is clearly asking about the current Moscow weather, while in the second she’s likely asking about her current location. But this conflicts with the conversation support we discussed above: the city element “Moscow” is optional and can be picked up from the conversation context, which would make the second example equal to the first! We have a contradiction…

That’s where things get complicated and naive conversation management doesn’t cut it anymore. One rule you can come up with to bypass this dilemma is this: if the sentence contains the word “local” (or one of its semantic siblings) and no city, then the user is asking about the weather at her current location; otherwise, fall back to default conversation management.
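The rule above can be sketched in a few lines of Python. The set of “local” synonyms and the `resolve_city` function are hypothetical and only show the order in which the checks would run.

```python
from typing import Optional

LOCAL_WORDS = {"local", "here", "nearby"}  # assumed semantic siblings of "local"


def resolve_city(utterance: str, city_in_sentence: Optional[str],
                 context_city: Optional[str], user_location: str) -> str:
    """Pick the city for a request, giving "local" priority over context."""
    if city_in_sentence:
        return city_in_sentence      # an explicit city always wins
    words = {w.strip("?.,!").lower() for w in utterance.split()}
    if words & LOCAL_WORDS:
        return user_location         # "local" with no city: current location
    # Default conversation management: reuse context, else current location.
    return context_city or user_location
```

For “What’s the local weather?” with Moscow in the conversation context and a user located in Chicago, this returns Chicago, while “Hm, what about tomorrow?” still resolves to Moscow.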

As a side note, your NLP toolkit should clearly disambiguate between New York (state) and New York (city), Moscow (Russia) and Moscow (Idaho, USA), etc. It should also support common slang and abbreviations like LA (for Los Angeles and not for the State of Louisiana), Big Apple, NYC, SF, etc.
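The slang and abbreviation part can be approximated with an alias table, as in the hypothetical sketch below. `CITY_ALIASES` and `canonical_city` are illustrative names, and the table hard-codes the reading of “LA” as Los Angeles. Telling Moscow, Russia from Moscow, Idaho needs additional signals (user location, population, context) that this sketch does not attempt.

```python
import re
from typing import Optional

# Hypothetical alias table; a real toolkit would ship a much larger gazetteer.
CITY_ALIASES = {
    "la": "Los Angeles, CA",
    "big apple": "New York, NY",
    "nyc": "New York, NY",
    "sf": "San Francisco, CA",
}


def canonical_city(utterance: str) -> Optional[str]:
    """Return the canonical city for the first alias found as a whole word."""
    text = utterance.lower()
    for alias, city in CITY_ALIASES.items():
        # Word boundaries keep "la" from matching inside words such as "late".
        if re.search(rf"\b{re.escape(alias)}\b", text):
            return city
    return None
```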

Temporal Ambiguity

Another, more subtle, problem arises when we try to deal with date ranges. Look at these examples:

What’s my current forecast? => result for the default 5-day forecast from today
What’s the precipitation forecast for Sep 25 - Sep 30th? => result for given date range
What was the ice storm forecast last week? => result for the last week

All three examples contain the word “forecast”, which implies the future by default. However, the second example also specifies an explicit date range, and the third example contains the word “forecast” but asks about a past date range. The situation gets even more confusing when we account for conversational context.

You can probably come up with some basic set of rules:

  • The word “forecast” assumes the standard 5-day forward weather forecast
  • The word “past” assumes some date range looking back, say the past 3 days
  • The default forecast/past date ranges are overridden if there’s an explicit date range in the user request
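These three rules translate almost directly into code. The sketch below assumes an upstream step has already extracted any explicit date range; `resolve_range` is a hypothetical helper, and treating ranges as inclusive (5 days starting today, 3 days ending yesterday) is one possible interpretation of the defaults.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

DateRange = Tuple[date, date]  # inclusive (start, end)


def resolve_range(utterance: str, explicit: Optional[DateRange],
                  today: date) -> DateRange:
    """Apply the default date-range rules; an explicit range always wins."""
    if explicit is not None:
        return explicit
    text = utterance.lower()
    if "past" in text:
        # Assumed default: the 3 days before today.
        return (today - timedelta(days=3), today - timedelta(days=1))
    if "forecast" in text:
        # Assumed default: 5 days starting today.
        return (today, today + timedelta(days=4))
    return (today, today)
```

Because the explicit range is checked first, “What was the ice storm forecast last week?” resolves to last week even though it contains the word “forecast”.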

Meteorological Ambiguity

Another complication concerns the weather request indicator we mentioned at the very beginning.

Essentially, a weather request is some form of question about a meteorological condition. In my own model for this type of bot I have almost 10,000 different ways to express that… This makes it almost impossible to just “train” the model in a supervised fashion. You need a formalized way to encode this model that allows for proper versioning, testing, future extension, etc.

Make sure that whatever tool you select to build this bot, you are not asked to list all these 10,000 utterances manually!
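One way to see why a formal encoding helps: a handful of word lists combined by a simple template already expands into many utterances. The lists and the `expand` function below are invented for illustration and are unrelated to the author’s actual model.

```python
from itertools import product
from typing import Iterator

# Hypothetical building blocks; 12 entries in total.
OPENERS = ["what's the", "show me the", "any chance of", "tell me the"]
QUALIFIERS = ["", "local", "current"]
CONDITIONS = ["weather", "forecast", "rain", "snow", "temperature"]


def expand() -> Iterator[str]:
    """Yield every opener/qualifier/condition combination as one utterance."""
    for opener, qualifier, condition in product(OPENERS, QUALIFIERS, CONDITIONS):
        # Skip the empty qualifier so no double spaces appear.
        yield " ".join(part for part in (opener, qualifier, condition) if part)
```

Here 12 vocabulary entries produce 4 × 3 × 5 = 60 utterances, and each additional list multiplies the total. Some combinations are not natural English, so a real grammar would need constraints, but the compact definition is what you version and test, not the expanded list.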

Conclusion

If you are somewhat confused by now, that’s absolutely fine. You should be. The point is that even for this trivialized example, a free-form natural language interface is a rather non-trivial task. A lot of people jump head first into creating NLI/NLU apps and chatbots, only to realize that users hate the interaction experience because it misses human cognition by a l-o-n-g mile… Technology in this space is still developing.

Here’s an example implementation of such a weather app with a free-form natural language interface using DataLingvo. It doesn’t fully resolve each and every problem discussed above, but it is a good starting point and it does a pretty decent job most of the time. And it’s only a couple of hundred lines of code: