What Suboptimal Choice Tells Us About the Control of Behavior. Thomas R. Zentall. Comparative Cognition & Behavior Reviews, Volume 14: pp. 1–18. http://comparative-cognition-and-behavior-reviews.org/2019/volume-14-pp-1-18/
Abstract: When animals make decisions that are suboptimal, it helps us to identify the processes that have evolved to produce this behavior. In an earlier article, I discussed three examples of suboptimal choice or bias (Zentall, 2016): (a) sunk cost, the tendency to continue on a losing project because of the amount already invested; (b) unskilled gambling, in which the loss is greater than the return; and (c) justification of effort, the bias to prefer conditioned stimuli that in training required more effort to obtain. Here I discuss three additional examples of suboptimal choice that we have studied in animals: (a) when less is better, in which animals prefer one piece of food (one preferred item) over two pieces of food (one preferred item plus one less preferred item); (b) suboptimal choice on the ephemeral choice task, in which animals prefer one piece of food now over two pieces of the same food, one now but the second briefly delayed; and (c) suboptimal choice in the midsession reversal task, errors of anticipation and perseveration. Each of these examples may help to identify the relative limits on behavioral flexibility found when animals are exposed to conditions that may be different from those that they would normally encounter in their natural environment. They also may help us to understand the origins of similar behavior when it occurs in humans.
Keywords: suboptimal choice, less is better, ephemeral reward, midsession reversal
---
Those of us who study the behavior of animals assume that
they have evolved to maximize their success (e.g., at finding food), and
much of learning theory (Skinner, 1938; Thorndike, 1911) is based on
this premise. Animals select those responses that lead to the increased
probability of reinforcement over those that do not. When animals’
behavior is consistent with this theory, it strengthens our belief in
the validity of the theory. However, when animals show a preference for
alternatives that result in less food over those that result in more
food, it is important to try to understand why they do.
Kacelnik (2006) suggested that rationality in decision
making can be defined in different ways. When defined by philosophers
and psychologists, it has been judged in terms of the reasoning or
thought processes that accompany the decisions. When defined by
economists, it does not require thought processes but refers to behavior
that is internally consistent and is compatible with expected utility
maximization. When defined by biologists, it is broader and goes beyond
the organism to allow for inclusive fitness (including benefit to one’s
kin).
Sometimes, what appears to be an irrational choice may
reflect a change in state. An animal’s preference for one kind of food
over another may reverse if it has been sated on the preferred food, or
an animal that has a choice between eating and being with conspecifics
may choose the latter because being close to others may enhance feeding
rate or may offer safety from predation (Kacelnik, 2006). Alternatively,
the condition that the animal is in may cause it to choose less food
over more food. For example, an animal may choose a low-probability but
possibly larger amount of food over a more frequent but smaller amount
that would not allow it to survive through the night
(Stephens, 1981; see also Houston, McNamara, & Steer, 2007).
When animals prefer an alternative that provides them with
less food over one that provides them with more (i.e., they choose
suboptimally), it may cause us to question the processes
that underlie that behavior. In an earlier article in this journal
(Zentall, 2016), I described a task in which pigeons showed a strong
preference for one alternative that on 20% of the trials provided them
with a signal for reinforcement and on 80% of the trials provided them
with a signal for the absence of reinforcement, over a second
alternative that always provided them with a signal for 50%
reinforcement (Stagner & Zentall, 2010). With this procedure, not
only do pigeons quickly show a preference for the 20% signaled over the
50% unsignaled reinforcement but they show no evidence that they learn
to correct that preference with extensive training. Furthermore, that
preference is not simply controlled by the uncertainty of the
reinforcement associated with the higher probability of reinforcement
alternative, because even when the alternatives are between 50% signaled
reinforcement and 100% reinforcement, pigeons do not show a preference
for the optimal alternative (McDevitt, Dunn, Spetch, & Ludvig, 2016;
Smith & Zentall, 2016). In addition, a similar pattern of
suboptimal choice can be shown when reinforcement magnitude is
manipulated. For example, pigeons prefer a 20% chance of obtaining a
signal for 10 pellets of food over a 100% chance of obtaining a signal
for three pellets of food (Zentall & Stagner, 2011).
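The sense in which these preferences are suboptimal can be checked with simple expected-value arithmetic. The sketch below is only an illustration using the percentages and pellet counts given above; the function name `expected_pellets` is hypothetical, and the code assumes that a signal for reinforcement is always followed by food.

```python
# Expected food per choice = P(reinforcement) x pellets per reinforcer.
# Assumption (not stated in code form in the article): the signal for
# reinforcement is always followed by food.

def expected_pellets(p_reinforced: float, pellets: float = 1.0) -> float:
    """Long-run amount of food earned per choice of an alternative."""
    return p_reinforced * pellets

# Probability version of the task (one reinforcer counted as one unit).
probability_task = {
    "suboptimal (20% signaled)": expected_pellets(0.20),
    "optimal (50% unsignaled)": expected_pellets(0.50),
}

# Magnitude version of the task.
magnitude_task = {
    "suboptimal (20% chance of 10 pellets)": expected_pellets(0.20, 10),
    "optimal (100% chance of 3 pellets)": expected_pellets(1.00, 3),
}

for task in (probability_task, magnitude_task):
    for label, value in task.items():
        print(f"{label}: {value:.2f} per choice")
```

In both versions the alternative that the pigeons preferred yields less food in the long run: 0.2 versus 0.5 reinforcers per choice, and 2 versus 3 pellets per choice.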
When animals choose suboptimally, it may tell us something
about the natural environment in which the animals have evolved (Fortes,
Pinto, Machado, & Vasconcelos, 2018). Several mechanisms may be
responsible for this suboptimal choice. First, in nature, when an animal
approaches a stimulus that signals the presence of food, it is likely
that the probability of reinforcement will increase. Not so in this
choice task in which choice frequency has no effect on the probability
of reinforcement. Second, in nature, when an animal encounters a signal
for the absence of food, that signal can generally be ignored, because
the animal will simply reject it and look elsewhere for food (Fortes et
al., 2018; Vasconcelos, Machado, & Pandeirada, 2018). That is, in
nature there is no need to remain in its presence, so it does not
acquire inhibitory value, whereas the animal must remain in its presence
in the laboratory choice experiment.
Although the predictive value of the conditioned reinforcer
that follows choice of each alternative, independent of its probability
of occurrence, appears to predict choice (Smith & Zentall, 2016),
evidence suggests that there may be a third factor (Case & Zentall,
2018; McDevitt et al., 2016). Case and Zentall (2018) found that when
pigeons are given a choice between 50% signaled reinforcement and 100%
reinforcement, they initially show indifference between the two
alternatives; however, with continued training they show a significant
preference for the suboptimal alternative (see also Kendall, 1974). Case
and Zentall suggested that the preference for the suboptimal
alternative may result from positive contrast between the expected value
of reinforcement following choice of the suboptimal alternative and the
value of the conditioned reinforcer that follows on half of the trials.
Positive contrast would not be expected between choice of the optimal
alternative and the conditioned reinforcer that follows, because the
expected value of reinforcement is consistent with the value of
reinforcement that follows. A similar mechanism was suggested by
McDevitt et al. (2016), who proposed that the conditioned reinforcement
that followed choice of the suboptimal alternative represented “good
news,” whereas the conditioned reinforcement that followed choice of the
optimal alternative was not newsworthy. Although identifying the
predispositions responsible for suboptimal choice with this procedure
will likely require further research, the inability of the pigeons to
learn to choose optimally suggests that there are conditions under which
pigeons do not appear to have the flexibility to overcome these
predispositions.
In the earlier article (Zentall, 2016), I identified two
other cases in which pigeons fail to choose optimally. The first was
research on the sunk cost effect, in which
pigeons prefer to complete pecking on one reinforcement schedule over
changing to another reinforcement schedule, even though changing to the
other schedule would have reduced the time and effort (amount of
pecking) to reinforcement. For example, pigeons first learned to peck
30 times for food when the color was green and 10 times for food when
the color was red. They then learned that after pecking green a variable
number of times, they would be given a choice between completing the
pecks to green and switching to peck the red 10 times. Surprisingly, the
pigeons preferred to return to pecking green, even when returning to
green required as many as 25 more pecks (Pattison, Zentall, &
Watanabe, 2012; see also Magalhães & White, 2014; Navarro &
Fantino, 2005).
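The least-effort choice in this procedure follows from comparing the pecks remaining on green with the 10 pecks required on red. The following is a minimal sketch of that comparison; the names are hypothetical, and it assumes that effort is measured only in pecks, ignoring any time or cost of switching.

```python
# Peck requirements taken from the example above.
GREEN_REQUIREMENT = 30  # pecks to food on green
RED_REQUIREMENT = 10    # pecks to food on red

def least_effort_option(green_pecks_done: int) -> str:
    """Return the option that needs fewer additional pecks to reach food.

    Assumption: only the number of remaining pecks matters.
    """
    remaining_green = GREEN_REQUIREMENT - green_pecks_done
    if remaining_green < RED_REQUIREMENT:
        return "stay on green"
    if remaining_green > RED_REQUIREMENT:
        return "switch to red"
    return "indifferent"

# With 5 pecks already made, 25 remain on green versus 10 on red.
print(least_effort_option(5))
```

By this count, a pigeon that still needs 25 pecks on green would save 15 pecks by switching to red, yet the pigeons returned to green.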
The second additional line of research described in the
Zentall (2016) article actually involved a bias rather than a
suboptimality. Pigeons were trained to peck a light to receive a choice
between two colors. On some trials, a single peck was required and the
choice was between, for example, red and yellow and choice of red was
reinforced. On other trials, 20 pecks were required and the choice was
between, for example, green and blue and choice of green was reinforced.
On probe trials, pigeons were given a choice between red and green, the
two colors both associated with reinforcement. Surprisingly, the
pigeons showed a preference for green, the color that during training
they had to work harder to obtain. When a similar effect has been found
in humans (e.g., Aronson & Mills, 1959), it has been referred to as
the justification of effort effect; however, we prefer to interpret this preference as a contrast
effect. That is, the positive contrast between 20 pecks and green was
greater than the positive contrast between one peck and red.
In the present article I examine three additional
phenomena, each of which demonstrates a behavior that is suboptimal. The
first is commonly referred to as the less is better effect; the second
is the failure to learn to choose optimally on a task in which choice of
one alternative provides two reinforcements, whereas the other provides
only one (the ephemeral reward task); and the third is the failure to
choose optimally on the midsession reversal task.
The Less Is Better Effect
Economists have traditionally held that when humans are
given sufficient information, they generally make rational choices
(Persky, 1995). This is the basis of rational choice theory (Becker,
1976). However, Tversky and Kahneman (1974) challenged this notion by
showing that humans tend to use various affective heuristics in making
decisions and those heuristics can be shown to lead to suboptimal
decisions. Such an example is the less is better effect (sometimes
referred to as the less is more effect), demonstrated in several
experiments by Hsee (1998). In one example, Hsee asked subjects to
estimate the value of a set of 24 dishes, all in good condition, or to
estimate the value of a set of 40 dishes, but only 31 were in good
condition. Surprisingly, the set of 24 dishes was valued higher than the
set of 40 dishes. Apparently, the nine dishes of poor quality
depreciated the value of the 31 good-quality dishes. The average quality
of the set, as a whole, apparently overshadowed the objective judgment
of the value of the set. But this effect may be unique to humans, who
may be sensitive to the aesthetics of the two sets of dishes.
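One way to picture the dish example is to contrast a judgment based on the total value of a set with one based on its average quality. The sketch below is purely illustrative: the per-dish values (1.0 for a good dish, 0.2 for a poor one) are made-up assumptions, not data from Hsee (1998), and the averaging rule is only one possible reading of the effect.

```python
# Hypothetical per-dish values; these numbers are assumptions chosen
# for illustration and do not come from Hsee (1998).
GOOD, POOR = 1.0, 0.2

small_set = [GOOD] * 24               # 24 dishes, all in good condition
large_set = [GOOD] * 31 + [POOR] * 9  # 40 dishes, 31 in good condition

def total_value(dishes):
    """Summing rule: every additional dish adds value."""
    return sum(dishes)

def average_quality(dishes):
    """Averaging rule: poor dishes pull the judgment down."""
    return sum(dishes) / len(dishes)

for name, dishes in (("24-dish set", small_set), ("40-dish set", large_set)):
    print(name, round(total_value(dishes), 2), round(average_quality(dishes), 2))
```

Under the summing rule the larger set is worth more, whereas under the averaging rule the smaller set is rated higher, which matches the direction of the reported valuations.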
In another study, subjects were asked to imagine that a
friend had given them a $55 wool coat from a store where coats cost
between $50 and $500, or alternatively a $45 wool scarf from a store
where scarves cost between $5 and $50 (Hsee, 1998). The subjects said
that they would be happier with the scarf than with the coat because the
purchase of the scarf would reflect greater generosity than the
purchase of the coat. The scarf was at the high end of the range,
whereas the coat was at the low end of the range. This finding suggests
that if gift givers want their gift recipients to perceive them as
generous, it would be better for them to give a high-value item from a
low-value product category (e.g., a $45 scarf) than a low-value item
from a high-value product category (e.g., a $55 coat).
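The contrast between the two gifts can be made concrete by expressing each price as a position within its store's price range. This normalization is an illustrative assumption rather than a measure reported by Hsee (1998), and the function name is hypothetical.

```python
def position_in_range(price: float, low: float, high: float) -> float:
    """Position of a price within its category range (0 = cheapest, 1 = dearest)."""
    return (price - low) / (high - low)

# Prices and ranges taken from the example above.
scarf = position_in_range(45, low=5, high=50)    # $45 scarf, scarves $5 to $50
coat = position_in_range(55, low=50, high=500)   # $55 coat, coats $50 to $500

print(f"scarf: {scarf:.2f}, coat: {coat:.2f}")
```

Although the coat costs $10 more in absolute terms, it sits near the bottom of its range (about 0.01), whereas the scarf sits near the top of its range (about 0.89).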
Would animals show the same bias if food of different
quality was used rather than dishes or clothing? According to optimal
foraging theory (Stephens & Krebs, 1986), other factors being equal
(e.g., the possibility of predation), nature should select against any
tendency to prefer an alternative that provides less food. Kralik, Xu,
Knight, Khan, and Levine (2012) tested this hypothesis. They found that
monkeys would readily eat grapes and sliced cucumbers, but when offered a
choice between them, they preferred the grapes. When the monkeys were
offered a choice between a grape by itself or a grape and a slice of
cucumber, however, they generally showed a strong preference for the
grape alone.
A similar effect was found by Beran, Ratliff, and Evans
(2009) for two of four chimpanzees when given a choice between a slice
of banana and a similar slice of banana plus a slice of apple.
Similarly, chimpanzees were indifferent between a preferred pellet and a
similar pellet plus either a less preferred piece of carrot or a less
preferred piece of apple (Sanchez-Amaro, Pereto, & Call, 2016). And
when Beran, Evans, and Ratliff (2009) manipulated the quantity rather
than the quality of the combined option, four chimpanzees preferred a
20 g slice of banana over the same 20 g slice of banana plus an
additional 5 g slice of banana.
Dogs, too, have been found to show a less is better effect
(Pattison & Zentall, 2014). Several dogs would eat either a slice
of carrot or a slice of cheese, but when given a choice, they preferred
the cheese. However, when given a choice between the cheese and a
combination of the cheese and the carrot, these dogs preferred the
cheese alone (see Figure 1).
[full paper and figures in the link above]