Non-parametric approximate linear programming for MDPs

dc.contributor.advisor

Parr, Ronald

dc.contributor.advisor

Conitzer, Vincent

dc.contributor.advisor

Maggioni, Mauro

dc.contributor.advisor

Ferrari, Silvia

dc.contributor.author

Pazis, Jason

dc.date.accessioned

2013-01-16T20:47:45Z

dc.date.issued

2012

dc.department

Computer Science

dc.description.abstract

One of the most difficult tasks in value-function-based methods for learning in Markov Decision Processes is finding an approximation architecture that is expressive enough to capture the important structure in the value function without overfitting the training samples. This thesis presents a novel Non-Parametric approach to Approximate Linear Programming (NP-ALP), which requires nothing more than a smoothness assumption on the value function. NP-ALP can make use of real-world, noisy sampled transitions rather than requiring samples from the full Bellman equation, while providing the first known max-norm, finite-sample performance guarantees for ALP under mild assumptions. Additionally, NP-ALP is amenable to problems with large (multidimensional) or even infinite (continuous) action spaces, and does not require a model to select actions using the resulting approximate solution.

dc.identifier.uri

https://hdl.handle.net/10161/6189

dc.title

Non-parametric approximate linear programming for MDPs

dc.type

Master's thesis

pubs.organisational-group

Duke

pubs.organisational-group

Trinity College of Arts & Sciences

pubs.organisational-group

Computer Science

pubs.publication-status

Published

Files

Original bundle

Name: Pazis_duke_0066N_11628.pdf
Size: 411.47 KB
Format: Adobe Portable Document Format
