The application of CMMs has been an ongoing activity within the group.
The application areas are summarised below in three main areas:
The applications in computer vision have developed over the years. Initial work applied ADAM, which consists of two CMMs connected back to back, only to simple images. The work also investigated rotation and scale invariance in CMM based systems. From this work it was clear that an array of CMMs would serve image analysis problems better, allowing many objects within a complex multi-object scene to be recognised. For this purpose the C-NNAP architecture was developed and investigated in great depth by Christos Orovas in his thesis [t10]. He examined how many CMMs connected in an array could be trained on new images. The architecture used by Orovas was similar to a Cellular Automaton (CA), and this relationship has been investigated by Aaron Turner [m5], showing how CA can be implemented with CMM based systems. The Spiking Cellular Associative Neural Network, developed by Grant Brewer [thesis], extended these ideas, locating the position of stored objects in a scene and using a spike based representation of uncertainty to associate a measure of confidence with each object identified.
The problem of recognising FAX data was brought to us and explored in a thesis by Simon O'Keefe [t7]. His system was similar to that used by Orovas. The constraints imposed by a rigid 2D array (as in all the work to this point) were relaxed through work on chemical structure matching, which focused on interpreting 2D and 3D image data as graphs. This work developed a robust relaxation by elimination method that allowed communicating CMMs to filter good and bad matches from the image data. The methods were exploited by Sujeewa Alwis [t12] in a PhD thesis on trade mark databases. He added the ability to fuse the information from many graph matching engines working on different data extracted from the same image. He also explored the use of gestalt methods to pick out important features in the image. These methods are used in a textile database problem in collaboration with Ned Graphics and David Evans (printers).
More details of these applications can be found by selecting one of the
following links:
The use of CMMs in access to data in large databases has been a more recent
and very successful application of the AURA architecture. The initial work
was undertaken in text databases for address matching. We are now looking
at 3D molecular databases and web based text datasets. The general approach
is overviewed in [123]. Details of specific
applications are given via the following links.
This was one of the first applications of the AURA method
(thus the name Advanced Uncertain Reasoning Architecture). It builds on
the theoretical research in this area given in the theory
section.
The main work was developed in the EPSRC project under the
Advanced Intelligent Knowledge Manipulation Systems (AIKMS) programme.
The AURA I project (as it is now known) aimed to develop reasoning methods
based on CMMs. The work aimed to develop the theory, underlying hardware
and applications for the use of CMMs in reasoning systems. This work paved
the way for the development of the hardware and other application areas.
The main thrust of the work in reasoning is to develop systems that can
use very large numbers of rules in a fast inferencing system. The work
also attempted to answer the question of how neural networks could be
used to represent rules.
The initial views on how to represent rules are given in [Austin (1994), Austin, Lees & Kennedy (1994), Austin (1995), Austin, Kennedy & Lees (1995)].
Early work in PhD theses by Tom Jackson [t4]
and by Russell Beale [t2] suggested that
complex reasoning in these systems was possible. The use
of CMMs, and how simple rule structures can be presented to a CMM so that
the preconditions of a rule are matched and its postconditions fired, is
explained in [Filer & Austin (1996)]. The use of distributed methods has
been central to the development of rule based reasoning in the AURA systems.
This, along with the general area of Statistical
Parallelism, is covered in the theory
section. There remain many challenges in this work. Recent work has
demonstrated fully distributed reasoning in CMM based architectures
(forthcoming paper).
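As an illustration only, the following minimal sketch shows how a binary CMM might store rules and fire a postcondition when a precondition is presented. The function names, the fixed-weight binary codes and the thresholding rule (fire an output bit only when every set input bit is matched) are assumptions made for this example, not details taken from the AURA work.

```python
# Hypothetical sketch: a binary correlation matrix memory (CMM) as a rule store.
# The codes, sizes and thresholding rule are illustrative assumptions.

def train(matrix, pre, post):
    """Store one rule by OR-ing the outer product of post and pre into the matrix."""
    for i, p in enumerate(post):
        if p:
            for j, q in enumerate(pre):
                if q:
                    matrix[i][j] = 1

def recall(matrix, pre):
    """Sum matching bits per output row, then threshold at the input weight."""
    weight = sum(pre)
    sums = [sum(m & q for m, q in zip(row, pre)) for row in matrix]
    return [1 if s >= weight else 0 for s in sums]

# Two toy rules with fixed-weight binary codes (assumed encoding).
A = [1, 1, 0, 0, 0, 0]   # precondition "A"
B = [0, 0, 1, 1, 0, 0]   # precondition "B"
X = [1, 0, 0, 1]         # postcondition "X"
Y = [0, 1, 1, 0]         # postcondition "Y"

cmm = [[0] * len(A) for _ in range(len(X))]
train(cmm, A, X)
train(cmm, B, Y)

fired = recall(cmm, A)   # presenting "A" recalls the code stored for "X"
```

Because both rules share one matrix, a single recall operation tests the input against every stored precondition at once, which is the property that makes large rule sets attractive in this setting.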
Initial ideas on the implementation of reasoning in hardware
are given in [Kennedy & Austin (1995), Kennedy, Austin & Cass (1995), Austin, Kennedy & Lees (1995)]. Later implementation details are
given in the implementation section of this web
site.
Access to Large Databases
This project has investigated the problem of accessing large
address databases using ill-specified data. Typical examples would be
"Curnerr Hoose Firm, Youk, Heslonton" for "Corner House
Farm, Heslington, York, YO10 4DR". The words are misspelled, the
fields are not labelled with 'house', 'road' or 'town', and the post code
is missing. The problem was brought to us by the Post Office Research
Centre, who needed a system to validate addresses as they are read from
the sorting machines. The work achieved a processing rate of 11 addresses
per second on a 32 node parallel processor against 25 million postal addresses.
The AURA methods were used to achieve this. Initial research on
the problem is described in David Lomas's MSc thesis [M1]
and in a subsequent thesis (forthcoming).
This project has investigated the problem of building more flexible and
powerful full text search engines. The main innovations have been the
addition of a spell checking front end, a system for synonym matching
and a back end AURA based match engine [Hodge & Austin (1999)].
In addition a new method for building synonyms automatically has been
developed [Hodge & Austin (2000)].
This project has applied AURA based systems to chemical databases. More
details can be found here.
Financial time series prediction has typically been undertaken by building
a model of the data using statistical methods and using this model to make
predictions. The method developed here, based on AURA, uses k-NN
based methods to search a large database of past information, find
the events most similar to the most recent one, and then build a model from
this small amount of data. Initial work on this was undertaken by Dan Kustrin and
reported in a PhD thesis [t8] and a paper [Kustrin, Sanders & Austin (1997)].
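The idea of predicting from the nearest past events can be sketched as follows. This is a minimal illustration, not the published method: a plain linear search stands in for the AURA-based search, the local model is simply the mean of the values that followed each neighbour, and the function name and parameters are hypothetical.

```python
# Hypothetical sketch of k-NN prediction on a time series.
# A plain linear search stands in for the AURA-based search described above.

def knn_forecast(series, window, k):
    """Predict the next value from the k past windows closest to the latest one."""
    query = series[-window:]
    candidates = []
    # Only consider past windows whose successor value is already known.
    for start in range(len(series) - window):
        past = series[start:start + window]
        dist = sum((a - b) ** 2 for a, b in zip(past, query))
        candidates.append((dist, series[start + window]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    # Simplest possible local model: average the successors of the neighbours.
    return sum(nxt for _, nxt in nearest) / len(nearest)

history = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
prediction = knn_forecast(history, window=2, k=2)
```

In this toy series the latest window `[1, 2]` has always been followed by `3`, so the neighbours agree. With real data the choice of `window`, `k` and the local model would all need to be tuned.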
Computer Vision
ADAM and Occlusion Analysis
The first work developed the associative memory ADAM (Advanced Distributed Associative Memory) to analyse occluded
objects in 2D binary images [Austin (1986), Austin (1987), Austin & Stonham (1987)]. The basic ADAM memory was introduced, and named,
in [Austin (1987)]. More details
of ADAM can be found in the theory pages.
An example application for determining the codes on products can be found
in [Austin (1994)].
Rotation and Scale Invariant Recognition
Most simple approaches to computer vision fail when
the image is rotated or changed in scale, and neural networks often fail on
these problems too. Some early work was undertaken to overcome this problem
using distributed methods [9, 11].
It turned out that the distributed methods used in this work have since been
exploited in Statistical Parallelism.
Map Matching to Airborne Imagery
Some of the first application work for ADAM and CMMs was undertaken in an
IED project (Vision by Associative Reasoning). This was a large multi-site project involving
York, Surrey, Rutherford Appleton Labs and BAe, and was aimed at developing
a system to match maps to images taken from a flying aircraft. The approach
was to use arrays of ADAM memories to store the map and match these to images
in which the roads had been identified [Austin & Turner (1991), Smith & Austin (1992)].
To do this, some work on texture analysis using CMMs was explored [Smith & Austin (1992)].
The overall approach was summarised in [Buckle & Austin (1995), Austin, Kelly, Buckle & Brown (1993), Buckle & Austin (1994)].
Similar problems were investigated in [Finch & Austin (1994), Austin (1995)]. A review of the work appeared in [Austin (1997)].
The C-NNAP Architecture
The use of an array of associative memories was explored in
Orovas [t10],
where the problem was to recognise a simple scene made up of a number of
overlapping 2D images. The system was to assign every point in the image
a label showing which object it belonged to. The work developed a novel training
method to learn the rules in the system [Orovas & Austin (1997), Orovas & Austin (1997)]. The general problem involved
learning a large set of rules relating different parts of the image to each
other and then parsing the image using these rules, as explained in [Orovas & Austin (1998), Austin (1998), Orovas & Austin (1998)].
The system has been targeted at the C-NNAP hardware
and the Cortex-1 parallel processor.
FAX Logo Recognition
The problems of binary image recognition were explored in
Simon O'Keefe's thesis [t7], where logos contained in fax images
needed to be recognised. Given the high performance of the CMM based
approaches and the size of the images to be analysed, the approach seemed
well suited to the task. An overview of the system appeared in the Image Processing magazine [O'Keefe & Austin (1995)]. The method [O'Keefe & Austin (1994),
O'Keefe & Austin (1995),
O'Keefe & Austin (1995),
O'Keefe & Austin (1995),
O'Keefe & Austin (1996),
O'Keefe & Austin (1997),
O'Keefe & Austin (1999),
O'Keefe & Austin (1999)] used an array of memories and
an accumulator approach based on the Hough transform. The performance of
the system was good, limited only by large rotations of the logos in the
image.
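A rough sketch of Hough-style accumulator voting is given below. It is illustrative only: a dictionary lookup stands in for the array of memories, the feature names and offsets are made up, and the accumulator covers translation only.

```python
# Hypothetical sketch of Hough-style accumulator voting for locating a logo.
# A dictionary lookup stands in for the array of memories; all data is made up.
from collections import Counter

# For each local feature, the offset (dx, dy) from the feature to the logo origin.
feature_offsets = {
    "corner": (-2, -1),
    "bar": (0, -3),
    "dot": (1, 1),
}

def locate(detections):
    """Let every detected feature vote for a logo origin; return the peak."""
    votes = Counter()
    for x, y, name in detections:
        if name in feature_offsets:
            dx, dy = feature_offsets[name]
            votes[(x + dx, y + dy)] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]   # ((origin_x, origin_y), vote_count)

# Three features consistent with an origin at (10, 10), plus one spurious match.
detections = [(12, 11, "corner"), (10, 13, "bar"), (9, 9, "dot"), (4, 4, "bar")]
peak = locate(detections)
```

Since the stored offsets are fixed, a strongly rotated logo would scatter its votes across the accumulator rather than concentrate them, which is one way to picture the rotation limit noted above.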
The potential for CMMs in database tasks is large, particularly where the
data is complex and unstructured. A project with Glaxo-Wellcome, funded
under the EPSRC Neural Networks initiative, investigated the use of
CMMs for chemical structure databases (final report can be found here).
In these systems the user must present an image of a molecule and pull back
the most similar from a database of more than 100,000 examples. The task
is made difficult by the need to partially match the molecule as well
as to cope with rotations and flexibility [Turner & Austin (1996)].
The work initially investigated the basic storage capacity of CMMs [Turner & Austin (1997)]. It then went on to study a new method for matching molecules based on graph
representations of the structures. The method, called relaxation by elimination
(RBE), developed out of standard relaxation methods, but was suited to CMM
implementation and was less sensitive to initial conditions [Austin & Turner (1997)].
The use of CMMs for the task has been explained in [Austin, Turner & Lees (1999)]
and evaluations have been given in [Austin, Turner & Lees (2000)].
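The general idea of matching graphs by eliminating unsupported candidates can be sketched as below. This is generic discrete relaxation written for illustration, not the published RBE algorithm and not a CMM implementation. The function name, the data layout (each node mapped to its set of neighbours) and the toy molecule are all assumptions.

```python
# Hypothetical sketch of elimination-style relaxation for graph matching.
# This is generic discrete relaxation, not the published RBE algorithm.

def relax_by_elimination(query, model, labels_q, labels_m):
    """Return, for each query node, the model nodes that survive elimination.

    query and model map each node to the set of its neighbours.
    """
    # Start with every label-compatible model node as a candidate.
    candidates = {
        q: {m for m in model if labels_m[m] == labels_q[q]} for q in query
    }
    changed = True
    while changed:
        changed = False
        for q in query:
            for m in list(candidates[q]):
                # Keep m only if every neighbour of q still has a candidate
                # that is adjacent to m in the model graph.
                supported = all(candidates[nq] & model[m] for nq in query[q])
                if not supported:
                    candidates[q].discard(m)
                    changed = True
    return candidates

# Query: a C-O fragment. Model: a chain C1-C2-O3 (toy data).
query = {"a": {"b"}, "b": {"a"}}
labels_q = {"a": "C", "b": "O"}
model = {1: {2}, 2: {1, 3}, 3: {2}}
labels_m = {1: "C", 2: "C", 3: "O"}

matches = relax_by_elimination(query, model, labels_q, labels_m)
```

Here the query is only a fragment of the stored structure, so the sketch also shows partial matching: carbon 1 is eliminated because it has no oxygen neighbour, leaving carbon 2 as the only candidate for node "a".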
The CMM based RBE methods have been exploited in the Trade Mark image
database problem and the textile
database system.
This project investigated whether CMMs could be used in a system that exploited
Gestalt psychology for trade mark databases. The problem of trade mark matching
is complex, as it involves a user selecting the trade mark that 'looks like'
the stored example. This is a fairly subjective decision. To overcome this
we aimed at using gestalt psychology methods of image grouping on a number
of image features. This was followed by matching using the RBE based matcher
developed in the Chemical Structure Matching
project. The work is published in a thesis by
Sujeewa Alwis [t12].
The papers [Alwis & Austin (1998), Alwis & Austin (1998), Alwis & Austin (1999),
Alwis & Austin (1999)]
explain many of the features behind the system.
Textile Print Databases
This project, in collaboration with Edwin Hancock at York and Graham Finlayson
at the University of East Anglia, has looked at the construction of image database
systems. The project incorporated the image matching system developed
in the trade mark project. The work was undertaken
in collaboration with David Evans Ltd who print on silk and Ned Graphics
Ltd who supply image processing software. The aim was to automate the search
for textile designs, which currently number over 100,000 at David Evans.
This was a European project to investigate the use of neural networks
in remote sensing images [Austin, Giancinto, Kanellopoulos, Lees, Roli, Vernazza & Wilkinson (1997)] [115:1997].
The work was published as a book [Austin (1997)].
Images are a typical form of complex unstructured data. Along with specific
work looking at image databases for Trade Marks
and Textiles, early work in a PhD by Mertinez
[t3] investigated the possibility of
using ADAM systems for 3D databases.