Structured and Continuous Reinforcement Learning (FWF project P 26219-N15)
FWF project P 26219-N15 (April 2014 - May 2016)






Project leader: Ronald Ortner

Department für Mathematik und Informationstechnologie
Lehrstuhl für Informationstechnologie
Montanuniversität Leoben
Franz-Josef-Straße 18
A-8700 Leoben

Tel.: +43-3842-402-1503
Fax: +43-3842-402-1502
E-mail: ronald.ortner@unileoben.ac.at





 

About the project
    In the precursor project (see below), we defined very general similarity structures for reinforcement learning problems in finite domains and obtained improved theoretical regret bounds for the case where the underlying similarity structure is known. The techniques and algorithms developed there also led to the first theoretical regret bounds for reinforcement learning in continuous domains (see the NIPS 2012 paper below). In the current project, we want to take the research on continuous reinforcement learning, a setting of particular importance for applications, a step further, not only by improving on the known bounds but also by developing efficient algorithms. Moreover, we want to investigate more general settings in which the learner does not have direct access to the domain information, but only to a set of possible models. For this setting, too, the precursor project produced first theoretical results, assuming finite domains and a model set that contains the correct model (see the ICML 2013 and AISTATS 2013 papers below). In the current project, we aim to generalize these results to infinite domains and to relax the assumption on the model set, which need not contain the correct model but only a good approximation of it.

  • Abstract
  • Project proposal
  • Final Report
  • Final Review



Precursor project (Erwin Schrödinger scholarship, FWF project J3259-N13)


Jobs
    Currently, no jobs are available.


Publications (also of the precursor project)





Created on October 14th, 2013, last modified on July 21st, 2016.