Databases of chemical structures are an important tool for the pharmaceutical
industry because they allow virtual screening of molecules for particular
biological effects. The molecules in these databases are typically represented
by atom-bond graphs. The aim of this work is to apply graph matching methods
to retrieval, activity prediction and generation of new molecules.
Both classic graph matching techniques and spectral methods can be applied
to this problem.
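As a rough illustration of the atom-bond graph representation mentioned above, the sketch below stores a molecule as a labelled undirected graph. The function name build_molecular_graph and the (i, j, order) bond format are assumptions made for this example, not a format used by any particular database.

```python
def build_molecular_graph(atoms, bonds):
    """Build an adjacency map for an atom-bond graph.

    atoms: list of element symbols, indexed by atom id.
    bonds: list of (i, j, order) tuples describing undirected bonds.
    Returns {atom id: {neighbour id: bond order}}.
    """
    # Start with every atom present so isolated atoms are not lost.
    adjacency = {index: {} for index in range(len(atoms))}
    for i, j, order in bonds:
        adjacency[i][j] = order
        adjacency[j][i] = order
    return adjacency


# Heavy atoms of ethanol (hydrogens omitted): C-C-O with single bonds.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]
adjacency = build_molecular_graph(atoms, bonds)

# Number of bonded neighbours of each atom.
degrees = {index: len(neighbours) for index, neighbours in adjacency.items()}
```

Graph matching methods then operate on structures of this kind, comparing node labels (elements) and edge labels (bond orders) between molecules.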
Current research
Clustering groups similar molecules together and is an important operation
on large databases. Previous work has clustered graph structures using
distance measures between graphs; the same techniques could be applied to
chemical structure databases.
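To make distance-based clustering concrete, here is a minimal sketch that groups items whose pairwise distance falls below a threshold, which amounts to cutting a single-linkage hierarchy at that threshold. It assumes a precomputed symmetric distance matrix between graphs; the function name threshold_clusters and the example distances are hypothetical, and this is not necessarily the clustering method used in the previous work.

```python
def threshold_clusters(distances, threshold):
    """Group items linked by chains of distances <= threshold.

    distances: symmetric n x n list of lists of pairwise graph distances.
    Returns a sorted list of clusters, each a list of item indices.
    """
    n = len(distances)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge every pair that is close enough.
    for i in range(n):
        for j in range(i + 1, n):
            if distances[i][j] <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical distances between four molecular graphs.
example_distances = [
    [0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 5.0, 6.0],
    [5.0, 5.0, 0.0, 1.5],
    [6.0, 6.0, 1.5, 0.0],
]
groups = threshold_clusters(example_distances, threshold=2.0)
```

Any graph distance can supply the matrix, for example an edit distance or a distance between embedded feature vectors; the clustering step itself does not depend on how the distances were obtained.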
Predicting the activity of compounds in a database enables virtual screening,
which indicates which molecules may have potential for a particular
application. Many powerful methods exist for this task on vectorial data.
These methods could be extended to molecular graphs using either graph
embedding methods or spectral features.
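One way to see how spectral features turn a graph into vectorial data is sketched below. It uses spectral moments, trace(A^k), which equal the sum of the k-th powers of the eigenvalues of the adjacency matrix A, so each molecule maps to a fixed-length vector regardless of its size. The choice of feature, the nearest-neighbour predictor, and the activity labels are all assumptions made for illustration; they are toy values, not results.

```python
import math


def adjacency_matrix(n_atoms, bonds):
    """Unweighted adjacency matrix from a list of (i, j) bonds."""
    a = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1
    return a


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(a, max_power=4):
    """Return [trace(A^2), ..., trace(A^max_power)]."""
    moments = []
    power = a
    for _ in range(2, max_power + 1):
        power = matmul(power, a)
        moments.append(sum(power[i][i] for i in range(len(a))))
    return moments


def nearest_neighbour_label(query, training):
    """Predict the label of the closest training vector (1-NN)."""
    return min(training, key=lambda item: math.dist(query, item[0]))[1]


# Toy graphs: a 3-atom chain, a 3-ring, and a 4-ring.
chain = spectral_moments(adjacency_matrix(3, [(0, 1), (1, 2)]))
ring3 = spectral_moments(adjacency_matrix(3, [(0, 1), (1, 2), (0, 2)]))
ring4 = spectral_moments(adjacency_matrix(4, [(0, 1), (1, 2), (2, 3), (0, 3)]))

# Hypothetical activity labels, for illustration only.
training = [(chain, "inactive"), (ring3, "active")]
prediction = nearest_neighbour_label(ring4, training)
```

Because the feature vectors have the same length for every molecule, any standard vector-based classifier or regressor could replace the nearest-neighbour step.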
Generating new compounds that are potentially active against a particular
target is an important goal in computational chemistry. The aim here is to
use the information in a database of compounds and their activities to
construct a generative model that produces new compounds which may have
similar activities.
This work is closely linked to our research on generative models.