Abstract:
When subjects must choose repeatedly between two or more alternatives, each of
which dispenses reward on a probabilistic basis (two-armed bandit), their behavior
is guided by the two possible outcomes, reward and nonreward. The simplest
stochastic choice rule is that the probability of choosing an alternative
increases following a reward and decreases following a nonreward (reward
following). We show experimentally and theoretically that animal subjects behave
as if the absolute magnitudes of the changes in choice probability caused by
reward and nonreward do not depend on the response that produced the reward or
nonreward (source independence), and that the effects of reward and nonreward are
in constant ratio under fixed conditions (effect-ratio invariance), properties
that fit the definition of satisficing. Our experimental results are either not
predicted by, or are inconsistent with, other theories of free-operant choice such
as Bush-Mosteller, molar maximization, momentary maximizing, and melioration
(matching).
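
The reward-following rule described above can be illustrated with a minimal sketch. This is not the authors' fitted model. It assumes an additive update on a single choice probability, clipped away from 0 and 1, and every name and number in it (update, simulate, reward_step, nonreward_step, floor, the step sizes) is hypothetical. Source independence is represented by using the same step magnitudes for both alternatives, and effect-ratio invariance by keeping reward_step / nonreward_step fixed throughout a run.

```python
import random


def update(p_left, arm, rewarded, reward_step=0.02, nonreward_step=0.01, floor=0.01):
    """Return the new probability of choosing the left arm (arm 0).

    Assumed additive rule: a reward raises the probability of the arm just
    chosen by reward_step; a nonreward lowers it by nonreward_step. The
    magnitudes do not depend on which arm was chosen (source independence).
    """
    delta = reward_step if rewarded else -nonreward_step
    # A change in the right arm's probability is the opposite change for the left arm.
    signed = delta if arm == 0 else -delta
    # Clip so that neither arm is ever abandoned completely (an assumption).
    return min(1.0 - floor, max(floor, p_left + signed))


def simulate(p_reward, trials=1000, seed=0, **steps):
    """Simulate a two-armed bandit; p_reward is (P(reward | left), P(reward | right))."""
    rng = random.Random(seed)
    p_left = 0.5
    choices = []
    for _ in range(trials):
        arm = 0 if rng.random() < p_left else 1
        rewarded = rng.random() < p_reward[arm]
        p_left = update(p_left, arm, rewarded, **steps)
        choices.append(arm)
    return p_left, choices


if __name__ == "__main__":
    final_p, choices = simulate((0.8, 0.2), trials=2000, seed=1)
    print("final P(left):", final_p)
    print("proportion of left choices:", choices.count(0) / len(choices))
```

Changing reward_step and nonreward_step while holding their ratio constant alters how quickly the choice probability moves, which is one way to explore what effect-ratio invariance implies in this toy setting.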