Using Unsupervised Learning Methods to Extract Structure in Value Functions.
David Foster   Peter Dayan
Machine Learning, in press.
Abstract
Solving in an efficient manner many different optimal control tasks
within the same underlying environment requires decomposing the
environment into its computationally elemental fragments. We suggest
how to find fragmentations using unsupervised, mixture model,
learning methods on data derived from optimal value functions for
multiple tasks, and show that these fragmentations are in accord
with observable structure in the environments. Further, we present
evidence that such fragments can be of use in a practical
reinforcement learning context, by facilitating online,
actor-critic learning of multiple goals MDPs.
compressed postscript   pdf