AMADEUS

Projects: UDA — Ubiquitous Digital Agents

Partners

University of York: Suresh Manandhar, 01904 432746
Lexicle: Patrick Olivier, 01904 435266

Overview

Current natural language understanding systems typically require large amounts of processing, which is unrealistic on low-power devices. In most cases, however, it is sufficient for each device to offer only a limited natural language processing ability: a fridge or an email reader, for example, may only need to respond to specific questions. On the other hand, the range of linguistic variation that people use to ask the same question is very large. For instance, a simple question such as "what is the building temperature?" can be asked in many different ways, e.g. "how hot/cold is it?". Current implementations of language understanding technology on low-power devices lack the ability to handle language that is natural to the user, as opposed to language that the device can understand.

UDA aims to build lightweight, device-specific natural language question-answering systems that are generated automatically from training data using machine learning techniques. Although such systems can be constructed manually, doing so raises two problems. Firstly, hand-crafted systems are costly to build and evaluate. Secondly, the vocabulary and linguistic coverage of existing systems is very low. The UDA project will address both issues by employing large QA (question-answer) datasets generated using a broad-coverage, industrial-strength system such as Lexicle's natural language understanding system. Lexicle's system will be modified to generate variations of essentially the same question automatically, and will also be used to automate the building of device-specific question-answer pairs. Mature machine learning techniques will then be employed to learn, from these datasets, device-specific (probabilistic) automata that can be deployed on low-power devices.

The use of large training datasets addresses the issue of linguistic coverage. The use of machine learning techniques means that device-specific language understanding systems can be built at low cost and with minimal programming effort.
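To make the idea of learning a device-specific probabilistic automaton from question variants more concrete, the following is a minimal sketch and not the UDA implementation. It assumes one bigram automaton per question type (each state is the previously seen word), add-one smoothing, and a tiny hand-written training set standing in for the generated QA datasets. The names BigramAutomaton, train_device_model, classify and TRAINING are all hypothetical.

```python
import math
from collections import defaultdict

START, END = "<s>", "</s>"

class BigramAutomaton:
    """Probabilistic automaton whose state is the previously seen word."""

    def __init__(self):
        # counts[prev][cur] = number of times `cur` followed `prev` in training
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = {END}

    def train(self, questions):
        for question in questions:
            tokens = [START] + question.lower().split() + [END]
            for prev, cur in zip(tokens, tokens[1:]):
                self.counts[prev][cur] += 1
                self.vocab.add(cur)

    def log_prob(self, question, vocab_size):
        """Log-probability of the question, with add-one smoothing."""
        tokens = [START] + question.lower().split() + [END]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            row = self.counts.get(prev, {})
            seen = row.get(cur, 0)
            total += math.log((seen + 1) / (sum(row.values()) + vocab_size))
        return total

def train_device_model(labelled_questions):
    """Learn one automaton per question type from {type: [variants]}."""
    models = {}
    for question_type, variants in labelled_questions.items():
        models[question_type] = BigramAutomaton()
        models[question_type].train(variants)
    return models

def classify(models, question):
    """Return the question type whose automaton scores the question highest."""
    vocab = set().union(*(m.vocab for m in models.values()))
    vocab_size = len(vocab) + 1  # one extra slot for unseen words
    return max(models, key=lambda t: models[t].log_prob(question, vocab_size))

# Hypothetical training data; in UDA this would be generated automatically.
TRAINING = {
    "temperature": [
        "what is the building temperature",
        "how hot is it",
        "how cold is it",
        "what is the temperature in the building",
    ],
    "door": [
        "is the door open",
        "is the door closed",
        "did someone leave the door open",
    ],
}

models = train_device_model(TRAINING)
print(classify(models, "how hot is the building"))
```

The learned model is just a table of word-pair counts per question type, which is the kind of small, fixed structure that could plausibly fit on a low-power device. A question that never appeared in training, such as "how hot is the building", can still be matched because it shares transitions with the training variants.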

Coupled to the QA system will be a lightweight generic dialogue management system that can guide the user, understand responses from the QA system in context and resolve ambiguities. Each node will be able to pass questions on to other nodes in its neighbourhood if it is unable to understand them itself. This will allow a group of nodes to co-operate in the question-answering task.
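The forwarding behaviour between nodes could be sketched as follows. This is an illustration under stated assumptions, not the project's design: each node is assumed to hold a simple lookup table of questions it understands, neighbours are tried in order, and a set of visited nodes prevents a question from circulating indefinitely. The Node class, its ask method and the example devices are hypothetical.

```python
class Node:
    """A device that answers what it can and forwards the rest to neighbours."""

    def __init__(self, name, answers):
        self.name = name
        self.answers = answers      # normalised question -> answer
        self.neighbours = []        # other Node objects within reach

    def ask(self, question, visited=None):
        """Return (answering node name, answer), or None if no node understands."""
        if visited is None:
            visited = set()
        visited.add(self.name)

        key = question.lower().strip(" ?")
        if key in self.answers:
            return self.name, self.answers[key]

        # This node cannot understand the question: pass it on.
        for neighbour in self.neighbours:
            if neighbour.name not in visited:
                result = neighbour.ask(question, visited)
                if result is not None:
                    return result
        return None

# Hypothetical devices and answers, for illustration only.
fridge = Node("fridge", {"is there any milk": "yes"})
email = Node("email reader", {"do i have new mail": "2 new messages"})
thermostat = Node("thermostat", {"how hot is it": "21 C"})

fridge.neighbours = [email, thermostat]
email.neighbours = [fridge]
thermostat.neighbours = [fridge]

print(fridge.ask("How hot is it?"))
```

Here the fridge cannot answer a temperature question itself, so it tries the email reader and then the thermostat. A fuller system would replace the exact-match lookup with the learned question-answering model and let the dialogue manager resolve ambiguous cases.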