These are the annual awards for the best papers produced by group members.
There is one award for the best paper with a student as first author and one award for the best paper with an RA as first author.
Reinforcement learning suffers from scalability problems due to state-space explosion and the temporal credit assignment problem. Knowledge-based approaches have received significant attention in this area. Reward shaping is one approach to incorporating domain knowledge into reinforcement learning. The theoretical and empirical analysis in this paper reveals important properties of this principle, in particular the influence on performance of the reward type, the MDP discount factor, and the way the potential function is evaluated.
This paper addresses an important question, speeding up the convergence of reinforcement learning (RL), and does so with extensive theoretical and empirical analysis. The writing is also concise and clear.
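To make the reward-shaping principle concrete, here is a minimal sketch of potential-based shaping on a toy ten-state chain. The shaping term F(s, s') = γΦ(s') − Φ(s) is added to the environment reward, a form known to leave the optimal policy unchanged; the chain layout and the potential function `phi` are illustrative assumptions, not taken from the paper.

```python
GOAL = 9      # terminal state of a 10-state chain (illustrative)
GAMMA = 0.99  # MDP discount factor

def phi(state):
    """Heuristic potential encoding domain knowledge: closeness to the goal."""
    return -abs(GOAL - state)

def shaped_reward(state, next_state, env_reward):
    """Environment reward plus the potential-based shaping term."""
    return env_reward + GAMMA * phi(next_state) - phi(state)
```

A step towards the goal receives a positive shaping bonus and a step away a penalty, which is exactly the extra guidance the paper analyses the effects of.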
Static analysis can be used to determine safe estimates of Worst-Case Execution Time (WCET). However, overestimation of the number of loop iterations, particularly in nested loops, can result in substantial pessimism in the overall estimate. This paper presents a method of determining exact parametric values for the number of loop iterations in a particular class of arbitrarily deeply nested loops. It is proven that these values are guaranteed to be correct using information obtainable from a finite and quantifiable number of program traces. Using the results of this proof, a tool is constructed and its scalability assessed.
The paper presents an application of machine learning to a problem in the field of Real-Time Systems. More specifically, it utilises inductive logic programming to help calculate the maximum execution time of programs. It is notable for its cross-disciplinary nature.
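As a toy illustration of what an exact parametric loop bound looks like, consider a triangular nest whose inner body executes exactly n(n−1)/2 times, a closed form in the parameter n rather than a pessimistic constant. The example and its formula are standard textbook material used here for illustration only, not the paper's actual analysis.

```python
def inner_iterations(n):
    """Count inner-body executions of a triangular loop nest directly."""
    count = 0
    for i in range(n):
        for j in range(i):  # inner bound depends on the outer index
            count += 1
    return count

def parametric_bound(n):
    """Exact closed-form count n*(n-1)/2: no overestimation for any n."""
    return n * (n - 1) // 2
```

Deriving such closed forms automatically, and proving them correct from finitely many traces, is what distinguishes the paper's approach from conservative constant bounds.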
Reinforcement learning, while a highly popular learning technique for agents and multi-agent systems, has so far encountered difficulties in more complex domains due to scaling-up problems. This paper focuses on the use of domain knowledge to improve the convergence speed and optimality of various RL techniques. Specifically, we propose the use of high-level STRIPS operator knowledge in reward shaping to focus the search for the optimal policy. Empirical results show that the plan-based reward shaping approach outperforms other RL techniques, including alternative manual and MDP-based reward shaping, when it is used in its basic form. We show that MDP-based reward shaping may fail, and successful experiments with STRIPS-based shaping suggest modifications that can overcome the problems encountered. The STRIPS-based method we propose allows the same domain knowledge to be expressed in a different way, so the domain expert can choose whether to define an MDP or a STRIPS planning task. We also evaluate the robustness of the proposed STRIPS-based technique to errors in the plan knowledge.
The paper improves reinforcement learning by incorporating domain knowledge through a technique known as 'reward shaping', with high-level knowledge expressed as STRIPS operators. The empirical section compares the proposed approach with many others, and the proposed approach comes out on top. More importantly, there is considerable analysis of why the empirical results are the way they are. The paper is notable for its mix of theory and experiment.
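The core idea of plan-based shaping can be sketched as follows: a high-level plan defines the potential of a state as progress along the plan, so transitions that advance the plan earn a shaping bonus. The plan below is a hand-written list of abstract states standing in for the effects of STRIPS operators; the state names and plan are illustrative assumptions, not the paper's actual domain.

```python
GAMMA = 0.99
# Abstract plan, e.g. extracted from a STRIPS planner's output (illustrative).
PLAN = ["start", "has_key", "door_open", "goal"]

def phi(abstract_state):
    """Potential = how far along the plan the agent has progressed."""
    return PLAN.index(abstract_state) if abstract_state in PLAN else 0

def shaping_reward(s, s_next):
    """Potential-based shaping term added to the environment reward."""
    return GAMMA * phi(s_next) - phi(s)
```

Progress along the plan yields a positive shaping signal while regressing is penalised, focusing exploration on plan-consistent behaviour without hard-coding the low-level policy.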