
Date: Wednesday 23rd May 2012, 11:15, CSE102 Computer Science, Univ. of York
Speaker: Kyriakos Efthymiadis, Department of Computer Science, Univ. of York
Topic: Overcoming Incorrect Knowledge in Plan-Based Reward Shaping (practice talk for ALA Conference)
Reward shaping has been shown to significantly improve an agent's performance in reinforcement learning. Plan-based reward shaping is a successful approach in which a STRIPS plan is used to guide the agent to the optimal behaviour. However, if the provided knowledge is wrong, it has been shown that the agent will take longer to learn the optimal policy. Previously, in some cases, it was better to ignore all prior knowledge even though it was only partially incorrect. This paper introduces a novel approach to overcoming incorrect domain knowledge provided to an agent receiving plan-based reward shaping, through the use of knowledge revision. Empirical results show that an agent using this method can outperform the previous agent receiving plan-based reward shaping without knowledge revision.
Speakers: Daniel Kudenko & Sam Devlin, Department of Computer Science, Univ. of York
Topic: Unidentified Research Object: Multi-Agent Knowledge-Based Reinforcement Learning
In this talk we will discuss approaches to modelling multi-agent specific knowledge and how they may be used to guide reinforcement learning. We will also discuss how the learning agents could then revise how we originally perceived the system and why having a human in the loop of knowledge revision/guidance may be beneficial.
Date: Wednesday 30th May 2012, 11:15, CSE102 Computer Science, Univ. of York
Speaker: Sam Devlin, Department of Computer Science, Univ. of York
Topic: Dynamic Potential-Based Reward Shaping (practice talk for AAMAS Conference)
Potential-based reward shaping can significantly reduce the time needed to learn an optimal policy and, in multi-agent systems, improve the performance of the final joint policy. It has been proven not to alter the optimal policy of an agent learning alone or the Nash equilibria of multiple agents learning together. However, a limitation of existing proofs is the assumption that the potential of a state does not change dynamically during learning. This assumption is often broken, especially if the reward-shaping function is generated automatically. In this talk, I will prove and demonstrate a method of extending potential-based reward shaping to allow dynamic shaping while maintaining the guarantees of policy invariance in the single-agent case and consistent Nash equilibria in the multi-agent case.
Speaker: Kleanthis Malialis, Department of Computer Science, Univ. of York
Topic: Reinforcement Learning of Throttling for DDoS Attack Response (practice talk for ALA Conference)
Distributed denial of service (DDoS) attacks constitute a serious and evolving threat in the current Internet. The most common type of these attacks is the flooding DDoS attack, which is designed to exhaust computer or network resources. Router throttling is a popular approach in the battle against these attacks; it views the flooding DDoS problem as a resource management or congestion problem. In this paper, we introduce a learning throttling approach which provides a highly adaptive response to such attacks. We compare our proposed approach against two other throttling approaches from the literature. It is shown that our approach effectively mitigates the impact of flooding DDoS attacks, and that it overcomes potential stability and convergence problems from which the two other throttling approaches suffer.
Date: Wednesday 6th June 2012, 11:15, CSE102 Computer Science, Univ. of York
(no seminar)
Date: Wednesday 13th June 2012, 11:15, CSE102 Computer Science, Univ. of York
Speaker: Waleed Alsanie, Department of Computer Science, Univ. of York
Topic:
  
Date: Wednesday 20th June 2012, 11:15, CSE102 Computer Science, Univ. of York
Speaker: Jon Timmis, Department of Computer Science, Univ. of York
Topic: