February 13th

General Similarity Metrics
Derek Bridge
University of York

Abstract

In [1] we introduced a formal framework for constructing ordinal similarity measures, and suggested how this might also be applied to cardinal measures. In this talk we will place this approach in a more general framework, called similarity metrics. In this framework ordinal similarity metrics (where comparison returns a boolean value) can be combined with cardinal metrics (returning a numeric value) and, indeed, with metrics returning values of other types, to produce new metrics.

[1] Hugh Osborne and Derek Bridge. A Case Base Similarity Framework. In: Proceedings of EWCBR 96, Lecture Notes in Artificial Intelligence 1168.


February 26th (Departmental Seminar)

An Experiment into Stereochemistry-based Drug Design using Inductive Logic Programming
C. David Page
PRG, University of Oxford

Abstract

Previous applications of Inductive Logic Programming (ILP) to drug design have not addressed stereochemistry, or the three-dimensional aspects of molecules. While some success is possible without consideration of stereochemistry (particularly when dealing with issues of toxicity/carcinogenicity), researchers within the pharmaceutical industry consider stereochemistry to be central to most drug design problems. This talk reports on an experimental application of the ILP system Progol to stereochemistry-based drug design.

The experiment tests whether Progol can identify the structure responsible for the activity of ACE (angiotensin-converting enzyme) inhibitors from 28 positive examples, that is, from descriptions of 28 molecules that exhibit the activity of ACE inhibition. ACE inhibitors are a widely-used form of medication for the treatment of hypertension. The experiment consists of two parts. The first part is a "blind test", proposed by a researcher within the pharmaceutical industry to see if Progol can re-discover a proposed structural explanation for ACE-inhibition given a single conformation (3D arrangement) for each active molecule. The second is a test to see whether Progol can handle multiple conformations for each molecule. In the first part of the experiment, Progol successfully identifies the proposed structural explanation for ACE-inhibition and notes that there are no other candidate explanations using the given conformations. In the second part of the experiment, Progol identifies the original structure and also two new potential explanations for ACE-inhibition. These additional explanations are currently being evaluated by a computational chemist.

(Joint work with Stephen Muggleton and Ashwin Srinivasan)


February 27th

Feature-Based Grammars as Constraint Grammars
Alan Frisch
University of York

Abstract

This talk clarifies the foundations of feature-based grammars by making three contributions:

Finally, the talk will discuss how constraint grammars could be formulated in a general way so as to subsume both feature-based grammars and definite clause grammars.


March 6th

Inductive Logic Programming (with Applications in Biology)
Stephen Muggleton
Machine Learning Group
Oxford University Computing Laboratory


March 20th

A Postscript Document Management System
John Brown
University of York

Abstract

The seminar is based on a paper I presented at ISCIS XI, Nov. 1996. PostScript is an ideal, as-yet academically unexplored storage format for document management: its advantages are listed in comparison with raster files, and with text marked up with SGML or HTML. A prototype system is described, written in 20,000 lines of C and interfacing to the portable Display PostScript interpreter and the X graphical user interface. Screen snapshots show the document catalogue, the boolean query facility on author, title and document text, the document viewer, and the automatically generated contents list and alphabetic index. An interactive reading path addresses the "lost in hyperspace" problem, and a browsable semantic network supports users unsure of the exact query terms they seek. Indexing performance data is presented for a large document. Analysis methods for indexing PostScript are discussed, and a general method, with acceptable indexing time, is proposed for PostScript coming from any source. The wider applicability of these techniques to computer-aided learning/teaching, semi-automatic multimedia preparation, and enterprise modelling is briefly discussed.


Tuesday 13th May - 1:15 in XD007

Knowledge Management from an AI Perspective
Andy Dearden
University of York

Abstract

This is a dry run for a talk I shall be giving to a meeting on Knowledge Management, organised by the Northern Organisational Analysis Research Network.

My viewpoint will emphasise the nature of knowledge based systems as communication media between authors and users of knowledge. From this viewpoint, important questions are:


Wednesday 21st May - 2:15 in XD007 (Departmental Seminar)

Planning and Cognitive Robotics
Murray Shanahan
Department of Computer Science
Queen Mary and Westfield College

Abstract

In 1969 Cordell Green presented his seminal description of planning as theorem proving with the situation calculus. In this talk I will endeavour to reinstate the ideal of planning via theorem proving in a more modern guise. In particular, I will show that if we adopt the event calculus as our logical formalism and employ abductive logic programming as our theorem proving technique, then the computation performed mirrors closely that of a hand-coded partial order planning algorithm. Furthermore, if we extend the event calculus in a natural way to accommodate compound actions, then using exactly the same abductive theorem prover we obtain a hierarchical planner.

The planner I will describe has been developed as part of a Cognitive Robotics project, the aim of which is to build robots whose architectures are based on the manipulation of sentences of logic. In addition to describing the planner itself, I will outline the other components of the system, which will hopefully be deployed on our small fleet of miniature robots.


Wednesday 28th May - 2:15 in XD007 (Departmental Seminar)

Genetic Algorithms for Timetabling: A Good Idea or Not?
Peter Ross
Department of Artificial Intelligence
University of Edinburgh

Abstract

Genetic algorithms can be used to search very large spaces, and it would seem natural to use them for tackling the nastier kinds of timetabling problem. We completed an EPSRC-funded project on this last year, and distribute a free package that handles a good range of lecture and exam timetabling problems, including some whole-university-sized ones; constraints it caters for include room capacities, ordering constraints, spread-'em-out constraints and inter-site travel times. The Department of AI has used this approach for years to handle the timetabling of MSc exams, whose pattern varies considerably from year to year. You might be inclined to ask yourself either "why should it work?" or "why shouldn't it work?". This talk might answer both questions for you. For problems with only binary constraints, graph-colouring methods are often much more effective; for others a GA may be the only game in town. But even in the realm of graph-colouring problems, there are some surprises. In studying a parameterised space of guaranteed-solvable timetabling-like problems, our GA can solve all highly-constrained problems quickly but may fail to solve some very lightly constrained ones. It turns out that some well-regarded non-evolutionary algorithms exhibit the same weaknesses. Some close study hints at why.
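The flavour of a penalty-based GA for exam timetabling can be seen in a toy sketch: treat each exam as a gene whose value is a timeslot, and count clashes (adjacent exams in the clash graph sharing a slot) as the penalty to minimise. This is an illustrative sketch only, not the distributed package; all function names and parameters below are invented for the example.

```python
import random

def conflicts(colouring, edges):
    """Penalty: number of clashing pairs (adjacent exams in the same slot)."""
    return sum(colouring[u] == colouring[v] for u, v in edges)

def ga_timetable(n_events, edges, n_slots, pop=30, gens=300, seed=0):
    """Toy GA: each chromosome assigns every exam a timeslot; selection
    keeps the half with fewest clashes, refilled by crossover + mutation."""
    rng = random.Random(seed)
    popn = [[rng.randrange(n_slots) for _ in range(n_events)]
            for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=lambda c: conflicts(c, edges))
        if conflicts(popn[0], edges) == 0:
            break  # clash-free timetable found
        survivors = popn[:pop // 2]
        children = []
        while len(survivors) + len(children) < pop:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(n_events)
            child = a[:cut] + b[cut:]                     # one-point crossover
            child[rng.randrange(n_events)] = rng.randrange(n_slots)  # mutate
            children.append(child)
        popn = survivors + children
    return min(popn, key=lambda c: conflicts(c, edges))
```

On a small 3-colourable clash graph this finds a clash-free 3-slot timetable quickly; on problems with only such binary clash constraints, as the abstract notes, a dedicated graph-colouring method would usually do better.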


Thursday 3rd July - 1:15 in XD007

The Problems of Large Scale Mechanical Verification
Dave Stringer-Calvert
University of York

Abstract

Formal verification was once hailed as the solution to all software correctness issues. Experience has shown that existing verification environments are tedious to use and require high levels of expertise; industry has therefore adopted half-way house 'formalised' methods where necessary, such as Z specifications and hand proof sketches. This seminar addresses these failures of formal verification from two angles - the sensible choice of application domains, and the necessary mechanical support.

Dave Stringer-Calvert is a Research Student in the High Integrity Systems Engineering Group at the University of York. He is funded by Siemens Plessey Systems, and also holds the position of International Fellow at the Computer Science Laboratory, SRI International. He has performed numerous verification examples using SRI's verification system 'PVS', and is the author of an advanced tutorial on theorem proving in PVS.
His current work involves the verification of compilers for small imperative languages, described in a paper to be presented at FME in September.


Thursday 31st July - 1:15 in ???

Modelling Evaluative Judgements
Jane Moorhouse
University of Reading

Abstract

Many of the existing decision aids used in marketing and management make use of statistical techniques based on an extrapolation of historical data or are designed within the confines of economic rationality. The models tend to assume a selection from an entire set of available options, focusing on the alternatives being chosen rather than on the decision maker. The models afford no understanding of the characteristics of individuals with different values and cognitive behaviours, and usually assume a fixed method for comparing alternatives. In order to provide more effective decision aid and product management tools, a model is needed which incorporates the contingent decision behaviour of individual consumers and the consumer's value perceptions of products at both the surface level and the deeper level of the product's underlying intrinsic features with respect to the purpose of the decision. The conceptual model that we present in this paper is a computational extension of a combination of theoretical models of decision making, in which we have integrated theories from the disciplines of Decision Science, Consumer Behaviour and AI. Our framework transforms the utility function into a more flexible and robust purpose-oriented hierarchical structure. Bounded rationality is dealt with by introducing decision strategies from the consumer behaviour discipline.


Thursday, October 23rd, 1:15pm - Room CS202J

An Application of Machine Learning to Analytical Chemistry
Chris Bryant
University of York

Abstract

For a pharmaceutical company, the process of getting a new drug to the marketing stage is long and expensive. The work involved may take 20 years and cost more than 150 million pounds by 1990 estimates. Anything which can help to reduce this cost is important and relevant to the pharmaceutical industry.

If two molecules are non-superimposable mirror images of each other then together they form an enantiomer pair. The two members of an enantiomer pair may have different pharmacological properties. For example, in the USA dextromethorphan is a drug available without prescription, whereas levomethorphan, its enantiomeric partner, is a controlled narcotic. Less dramatic examples abound; 12 of the 20 most prescribed drugs in the USA and 114 of the top 200 have one or more asymmetric centres in the drug molecule. Consequently the pharmaceutical industry is becoming more and more interested in methods to separate pairs of enantiomers in order that individual enantiomers may be subjected to pharmacological testing. Reducing the cost of these separations is therefore of increasing importance. A computer system which is capable of recommending appropriate apparatus for the separation of a given pair of enantiomers would save time and money. This seminar describes a project which aims to take a first step towards such a system.

Two Machine Learning programs have been applied to data from a database of published enantioseparations performed on commercially available Chiral Stationary Phases (CSP) with the aim of inducing generalisations that recommend particular CSP chiral selectors based on the structural features of an enantiomer pair. The two programs, Golem and Progol, are from the field of Inductive Logic Programming and are available in the public domain. A preliminary assessment of the usefulness of the resulting generalisations is made using their accuracy, size, ease of interpretation and chemical justification.


Thursday, November 6th, 1:15pm - Room CS202J

Computing Inverse Entailment
Stephen Muggleton
University of York

Abstract

Inductive Logic Programming involves construction and testing of logical hypotheses from data. A key approach to the construction of hypotheses involves inversion of the logical entailment relationship. This approach is used within the Progol system. Recently Yamamoto has shown that the mechanisms used in Progol are incomplete for inverting entailment, though complete for inverting relative subsumption. In this talk it will be shown how Progol's mechanisms for inverting entailment can be extended. The extended mechanism is proved complete for function-free logic programs.


The Fourth Annual Knowledge Representation and Reasoning Distinguished Lecturer

Thursday, 13th November, *4:15pm* - Room *CS103* (Departmental Seminar)

Genefinding and Protein Structure Prediction with Hidden Markov Models
David Haussler
University of California at Santa Cruz

Abstract

With the Human Genome Project and other model organism genome sequencing projects now in full swing, databases of DNA, RNA and protein sequences are growing at an explosive rate, and the need for statistical/computational methods for biosequence analysis has become acute.

Right now we need effective methods for locating genes in DNA sequences, along with their splice sites and regulatory binding sites, and for classifying new proteins to detect weak homologies to known proteins and predict their possible functions. Tools available for this analysis range from simple and general text search methods to detailed protein folding models of the type used in protein threading and ab initio protein structure prediction. Hidden Markov Models (HMMs) lie somewhere in the middle of this spectrum. They are computationally efficient enough for use with large databases, yet flexible enough to be used in constructing specific, detailed statistical models of the sequence variation within a particular protein family, or within a family of related DNA binding sites. We will describe what HMMs are and how they are used in biosequence analysis. Then we will report how they performed in comparison to other methods in the CASP2 international test of protein structure prediction methods.
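A minimal sketch of the kind of HMM computation used in biosequence analysis is the Viterbi algorithm, which recovers the most likely hidden-state path for an observed sequence. The two-state "coding vs noncoding" model below, and all names and probabilities in it, are invented for illustration and are not taken from the talk.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence,
    computed by dynamic programming in log space to avoid underflow."""
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
          for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] + math.log(trans_p[p][s])
                 + math.log(emit_p[s][obs[t]]), p)
                for p in states)
            V[t][s] = prob
            back[t][s] = prev
    # Trace the best path backwards from the most likely final state.
    state = max(states, key=lambda s: V[-1][s])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]

# A toy gene-finding model: 'coding' regions are GC-rich, 'noncoding' AT-rich.
STATES = ('coding', 'noncoding')
START = {'coding': 0.5, 'noncoding': 0.5}
TRANS = {'coding': {'coding': 0.9, 'noncoding': 0.1},
         'noncoding': {'coding': 0.1, 'noncoding': 0.9}}
EMIT = {'coding': {'A': 0.15, 'C': 0.35, 'G': 0.35, 'T': 0.15},
        'noncoding': {'A': 0.35, 'C': 0.15, 'G': 0.15, 'T': 0.35}}
```

Real genefinding and protein-family HMMs have many more states and are trained from data, but they rest on exactly this dynamic-programming core, which is what makes them efficient enough for large databases.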

Friday, 14th November, *3:00pm* - *Room 7:30 School of Computer Studies, University of Leeds*

Will Share Prices Rise? Worst-case Prediction of Individual Sequences
David Haussler
University of California at Santa Cruz

Abstract

Suppose you are given a set of functions, here called a "comparison set", in which each function is a different method for predicting whether the stock market will go up or down on a given day, given the values of certain economic indicators for the period preceding that day. Suppose further that over the next N days, you must predict the direction of the stock market each day, and your goal is to predict as well as that function in the comparison set that does best in predicting the market for these coming N days, no matter what the market does. Thus it would be as if you had somehow known the best prediction method to use for these particular N days ahead of time. Of course, this goal is unattainable, so you settle for a prediction method that is guaranteed to predict "almost" as well as the function that does best over the coming N days. Such a prediction method is called a worst-case (or "universal") predictor for individual sequences, and can be applied to any sequence data: stock market, weather, etc. We discuss what kinds of universal predictors exist, and how they may be used in machine learning applications.
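The flavour of such a universal predictor can be seen in the classic weighted-majority scheme: predict by a weighted vote of the comparison set, and multiplicatively penalise every function that errs. This is a standard textbook sketch, not necessarily the construction discussed in the talk; the function name and the penalty parameter eta are illustrative.

```python
def weighted_majority(experts, outcomes, eta=0.5):
    """Multiplicative-weights predictor for binary sequences.

    experts:  list of prediction sequences (each a list of 0/1),
              one per function in the comparison set.
    outcomes: the actual 0/1 sequence to be predicted.
    Returns (our mistake count, mistake count of the best single expert).
    """
    n = len(experts)
    w = [1.0] * n
    our_mistakes = 0
    for t, y in enumerate(outcomes):
        # Predict by weighted vote of the comparison set.
        up = sum(w[i] for i in range(n) if experts[i][t] == 1)
        down = sum(w[i] for i in range(n) if experts[i][t] == 0)
        guess = 1 if up >= down else 0
        if guess != y:
            our_mistakes += 1
        # Penalise every expert that was wrong on this round.
        for i in range(n):
            if experts[i][t] != y:
                w[i] *= (1.0 - eta)
    best = min(sum(1 for t, y in enumerate(outcomes) if e[t] != y)
               for e in experts)
    return our_mistakes, best
```

The guarantee is worst-case in exactly the sense of the abstract: whatever the sequence does, the predictor's mistake count is within a constant factor plus a log-of-comparison-set-size term of the best expert's, with no statistical assumptions on the data.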


Thursday, 20th November, 1:15pm - CS202J

Phase Transitions, Constrainedness and Search
Toby Walsh
University of Strathclyde

Abstract

There has been much interest in threshold phenomena, so called "phase transitions", in a variety of different NP-complete problems. Whilst random problems are often very easy to solve, problem instances from around the phase boundary tend to be hard to solve. I will present a unifying framework for studying such phase transition phenomena. I will introduce a parameter that measures the "constrainedness" of search problems. This constrainedness parameter can be used to locate phase transitions in many different NP-complete problem classes. It can also be used in a heuristic to guide search. This heuristic makes the most constrained choice by minimizing the constrainedness of the subproblem into which we branch. Many previously proposed heuristics can be seen as minimizing constrainedness or proxies for it. (This is joint work with Ian Gent, Ewan MacIntyre and Pat Prosser)
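Assuming the constrainedness parameter takes its usual form kappa = 1 - log2(<Sol>) / log2(N), where <Sol> is the expected number of solutions and N the size of the state space, it has a closed form for random k-SAT; the sketch below (function name invented for illustration) shows how it locates the under/over-constrained regimes.

```python
import math

def kappa_ksat(n_vars, n_clauses, k=3):
    """Constrainedness of random k-SAT under the definition
    kappa = 1 - log2(<Sol>) / log2(N), with N = 2^n_vars.

    A random assignment satisfies a random k-clause with probability
    1 - 2^-k, so the expected number of solutions is
    2^n_vars * (1 - 2^-k)^n_clauses.
    """
    log2_sol = n_vars + n_clauses * math.log2(1 - 2 ** -k)
    return 1 - log2_sol / n_vars
```

kappa < 1 marks the under-constrained (soluble, easy) region, kappa > 1 the over-constrained one, and kappa ~ 1 predicts the phase boundary where the hard instances cluster; for random 3-SAT this puts the transition at a clause-to-variable ratio of about 5.2.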


Friday, 28th November, 1:15pm - CS202J

Saso Dzeroski
Jozef Stefan Institute


Thursday 11th December, 1:15pm - Room CS202J

Learning Natural Language Syntax
Stephen Watkinson
University of York

Abstract

This will be a review of literature from three areas of Natural Language acquisition research: theoretical linguistic work, symbolic machine learning work and statistical learning methods.

The aim will be to provide a critical appraisal of the work in these areas and to show where useful overlap exists.


Friday, December 12th, 2:15pm - Room CS103

Learning in rational agents
Stuart Russell
Computer Science Division
University of California, Berkeley

Abstract

The long-term goal of artificial intelligence (AI) research is the creation and understanding of intelligence. For most of its history, AI has relied on a formal conception of intelligence as *rationality*, the ability to select optimal actions. I will outline several recent developments in methods for learning to behave rationally. Then I will argue that this line of research is ultimately doomed: agents cannot act rationally because of the complexity of selecting optimal actions. I will propose an alternative formal definition of intelligence that takes into account resource limitations, and I will sketch one possible research agenda arising from this definition. In particular, I will examine whether, and in what form, our understanding of learning survives in the new regime.


Wednesday, December 17th, 1:15pm - Room CS202J

Is there a better way to extract information from protein databases?
Miklos Cserzo
University of Birmingham

Abstract

The determination of the 3D structure of a protein from its amino-acid sequence is a long-standing biochemical problem. So far only limited-precision predictions for general cases, or more accurate ones for special cases, have been achieved. The predictive power of early methods was based on single amino-acid parameters and these were significantly improved by the use of multiple alignment and neural-net-based algorithms, with a limit of around 70%-75% overall efficiency. There is no obvious way to improve these methods and alternative algorithms for structure prediction need to be explored.

Traditionally, the protein structure database is analyzed extensively, searching for clues to the sequence - structure relationship. Unfortunately, the sample size of that database is small due to the experimental difficulties involved in determining protein structures. Consequently the efficiency of the prediction methods is limited too. On the other hand, calculations detected strong clustering effects of amino acids in the protein sequences. These effects are observed for all protein sequences in the database and in a fairly uniform manner. Further, these general rules were found to be related to the hydrophobic properties of the amino acids, which is the major driving force of protein folding. Such a correlation is unlikely by chance, and systematic analysis of the raw data could provide further insight into protein sequence - structure relationships. Such an analysis would provide an alternative - information theory based - approach to the protein folding problem.

Last updated on 10 March 2011