<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="6.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">M. Pfeiffer</style></author><author><style face="normal" font="default" size="100%">B. Nessler</style></author><author><style face="normal" font="default" size="100%">R. Douglas</style></author><author><style face="normal" font="default" size="100%">W. Maass</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Reward-modulated Hebbian Learning of Decision Making</style></title><secondary-title><style face="normal" font="default" size="100%">Neural Computation</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">In Press</style></year></dates><abstract><style face="normal" font="default" size="100%">&lt;p&gt;We introduce a framework for decision making in which the learning of decision making is reduced to its simplest and biologically most plausible form: Hebbian learning on a linear neuron. We cast our Bayesian-Hebb learning rule as reinforcement learning in which certain decisions are rewarded, and prove that each synaptic weight will on average converge exponentially fast to the log-odds of receiving a reward when its pre- and postsynaptic neurons are active. In our simple architecture, a particular action is selected from the set of candidate actions by a winner-take-all operation. The global reward assigned to this action then modulates the update of each synapse. Apart from this global reward signal, our reward-modulated Bayesian Hebb rule is a pure Hebb update that depends only on the co-activation of the pre- and postsynaptic neurons, and not on the weighted sum of all presynaptic inputs to the postsynaptic neuron as in the perceptron learning rule or the Rescorla-Wagner rule.
This simple approach to action-selection learning requires that information about sensory inputs be presented to the Bayesian decision stage in a suitably pre-processed form resulting from other adaptive processes (acting on a larger time scale) that detect salient dependencies among input features. Hence our proposed framework for fast learning of decisions also provides interesting new hypotheses regarding neural nodes and computational goals of cortical areas that provide input to the final decision stage.&lt;/p&gt;</style></abstract></record></records></xml>