Nici Schraudolph, Peter Dayan, Terry Sejnowski
  
In LC Jain & N Baba, editors, Soft Computing Techniques in Game Playing. Berlin, Germany: Springer-Verlag.
 Abstract 
The game of Go has a high branching factor that defeats the tree
search approach used in computer chess, and long-range spatiotemporal
interactions that make position evaluation extremely
difficult. Development of conventional Go programs is hampered by
their knowledge-intensive nature. We demonstrate a viable alternative
by training neural networks to evaluate Go positions via temporal
difference (TD) learning.
Our approach is based on neural network architectures that
reflect the spatial organization of both input and reinforcement
signals on the Go board, and training protocols that provide
exposure to competent (though unlabelled) play. These techniques
yield far better performance than undifferentiated networks
trained by self-play alone. A network with fewer than 500 weights
learned, within 3000 games of 9x9 Go, a position evaluation
function superior to that of a commercial Go program.
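
To illustrate the general idea of learning a position evaluator by temporal difference, here is a minimal sketch of TD(0) in Python. It is not the architecture or training protocol described above: it assumes a plain linear evaluator with a sigmoid output, a single scalar game outcome in [0, 1], and positions already encoded as feature vectors. The names `evaluate`, `td0_update`, and `train_on_game` are hypothetical and chosen only for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def evaluate(weights, features):
    """Predicted outcome in (0, 1) for a position given as a feature vector."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)))

def td0_update(weights, features, target, lr=0.1):
    """Take one gradient step moving the prediction for `features` toward `target`."""
    v = evaluate(weights, features)
    delta = target - v            # temporal-difference error
    slope = v * (1.0 - v)         # derivative of the sigmoid at the current output
    return [w + lr * delta * slope * f for w, f in zip(weights, features)]

def train_on_game(weights, positions, outcome, lr=0.1):
    """Run TD(0) over one game.

    positions: feature vectors in move order (assumed encoding).
    outcome:   final result in [0, 1] (assumed scalar reinforcement).
    """
    for t, feats in enumerate(positions):
        if t + 1 < len(positions):
            # Non-terminal position: bootstrap from the successor's evaluation.
            target = evaluate(weights, positions[t + 1])
        else:
            # Terminal position: use the actual game result.
            target = outcome
        weights = td0_update(weights, feats, target, lr)
    return weights
```

In this sketch, repeatedly calling `train_on_game` on games that end in a win gradually propagates the final result backwards, so that evaluations of earlier positions rise above their initial value of 0.5.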