Feudal Reinforcement Learning
Peter Dayan   Geoff Hinton
One way to speed up reinforcement learning is to enable learning to
happen simultaneously at multiple resolutions in space and time. This
paper shows how to create a Q-learning managerial hierarchy in which
high level managers learn how to set tasks to their sub-managers who,
in turn, learn how to satisfy them. Sub-managers need not initially
understand their managers' commands. They simply learn to maximise
their reinforcement in the context of the current command.
In Advances in Neural Information Processing Systems (NIPS) 5, pp. 271-278.
We illustrate the system using a simple maze task. As the system
learns how to get around, satisfying commands at the multiple levels,
it explores more efficiently than standard, flat, Q-learning and
builds a more comprehensive map.
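The setup the abstract describes can be sketched in code. The following is a minimal illustration, not the paper's exact formulation: a manager Q-table over a coarsened view of a small gridworld issues directional commands, and a sub-manager Q-table, conditioned on the current command, learns primitive moves. The grid size, reward values, and learning constants are all assumptions made for the sketch.

```python
import random

random.seed(0)

# Tiny 4x4 gridworld with the goal in the far corner (an assumed
# stand-in for the paper's maze).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
COMMANDS = range(4)                           # one directional command each
GOAL = (3, 3)
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

q_sub = {}  # (command, state, action) -> value: the sub-manager's table
q_man = {}  # (coarse_state, command) -> value: the manager's table

def coarse(s):
    # The manager sees the maze at half resolution: it need not know
    # exactly where the agent is, only which region it is in.
    return (s[0] // 2, s[1] // 2)

def argmax(values):
    return values.index(max(values))

def step(s, a):
    dr, dc = ACTIONS[a]
    return (min(max(s[0] + dr, 0), 3), min(max(s[1] + dc, 0), 3))

for episode in range(500):
    s = (0, 0)
    for t in range(200):
        cs = coarse(s)
        # Manager picks a command for the current region (with exploration).
        if random.random() < 0.5:
            cmd = random.choice(list(COMMANDS))
        else:
            cmd = argmax([q_man.get((cs, c), 0.0) for c in COMMANDS])
        # Sub-manager picks a primitive move given that command.
        if random.random() < EPS:
            a = random.randrange(4)
        else:
            a = argmax([q_sub.get((cmd, s, i), 0.0) for i in range(4)])
        s2 = step(s, a)
        # The sub-manager is paid only for satisfying the current command,
        # not for reaching the true goal.
        r_sub = 1.0 if (a == cmd and s2 != s) else -0.1
        best = max(q_sub.get((cmd, s2, i), 0.0) for i in range(4))
        k = (cmd, s, a)
        q_sub[k] = q_sub.get(k, 0.0) + ALPHA * (r_sub + GAMMA * best - q_sub.get(k, 0.0))
        # The manager is rewarded only when the real goal is reached.
        r_man = 1.0 if s2 == GOAL else 0.0
        best_m = max(q_man.get((coarse(s2), c), 0.0) for c in COMMANDS)
        km = (cs, cmd)
        q_man[km] = q_man.get(km, 0.0) + ALPHA * (r_man + GAMMA * best_m - q_man.get(km, 0.0))
        s = s2
        if s == GOAL:
            break
```

Note how the sketch reflects the abstract's key point: the sub-manager never sees the true goal reward, only reinforcement in the context of the current command, so it can learn to satisfy commands before (or without) understanding what they are for.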
See also Dayan (1998).