When and Why Do Third Parties Punish Outside of the Lab? A Cross-Cultural Recall Study. Eric J. Pedersen et al. Social Psychological and Personality Science, December 16, 2019. https://doi.org/10.1177/1948550619884565
Abstract: Punishment can reform uncooperative behavior and hence could have contributed to humans’ ability to live in large-scale societies. Punishment by unaffected third parties has received extensive scientific scrutiny because third parties punish transgressors in laboratory experiments on behalf of strangers that they will never interact with again. Often overlooked in this research are interactions involving people who are not strangers, which constitute many interactions beyond the laboratory. Across three samples in two countries (United States and Japan; N = 1,294), we found that third parties’ anger at transgressors, and their intervention and punishment on behalf of victims, varied in real-life conflicts as a function of how much third parties valued the welfare of the disputants. Punishment was rare (1–2%) when third parties did not value the welfare of the victim, suggesting that previous economic game results have overestimated third parties’ willingness to punish transgressors on behalf of strangers.
Keywords: third-party punishment, anger, cooperation, bystander intervention
WTR = welfare trade-off ratio
Discussion
Here, we proposed that a major function of third-party punishment
is to deter aggressors from harming individuals with
whom the punisher shares a fitness interest and that the psychological
mechanisms that regulate punishment take into account
the punisher’s perceived welfare interdependence with the disputants
in a conflict (Pedersen et al., 2018). To test these
hypotheses, we asked U.S. students, U.S. Mechanical Turk
workers, and Japanese students to recall how they responded
the last time they observed a conflict. The recall study method
ensures a wide sampling of situations and thus high generalizability
to real-life conflicts. We found that third parties’ WTRs
for the victim in a conflict indeed predicted anger, intervention,
and punishment on behalf of the victim. We also found that
third parties’ WTRs for the transgressor were negatively associated
with anger toward the transgressor but not with intervention
or punishment as we had predicted. Besides the possibility
that WTR for the transgressor truly does not predict intervention
and punishment, another explanation for the lack of these associations is that third parties who intervene or punish may
temporarily hold a negative WTR for the transgressor—that
is, they are willing to incur costs to inflict costs. Because our
WTR Scale only went down to 0, any negative WTRs would
have manifested as zeros and thus the variability of the scale
could have been restricted (see Figure S2, which suggests this
may have been the case), which would limit our power to detect
an effect.
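To see why flooring the scale at 0 could mask an effect, here is a minimal simulation sketch (hypothetical numbers, not the study's data) in which latent WTRs for the transgressor can be negative, the recorded values are clipped at zero as on the scale described above, and the censoring both shrinks the variance and attenuates the association with punishment:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical latent WTRs toward the transgressor: some third parties who
# punish hold a temporarily negative WTR (willing to pay to inflict costs).
latent_wtr = rng.normal(loc=0.1, scale=0.4, size=n)

# A hypothetical punishment propensity that declines with latent WTR.
punish_propensity = -2.0 - 1.5 * latent_wtr + rng.logistic(size=n)
punished = (punish_propensity > 0).astype(float)

# The scale bottoms out at 0, so negative latent values are recorded as 0.
observed_wtr = np.clip(latent_wtr, 0.0, None)

print("variance of latent WTR:   ", latent_wtr.var().round(3))
print("variance of observed WTR: ", observed_wtr.var().round(3))
print("corr(latent, punished):   ", np.corrcoef(latent_wtr, punished)[0, 1].round(3))
print("corr(observed, punished): ", np.corrcoef(observed_wtr, punished)[0, 1].round(3))
```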
These findings were generally consistent across our three
samples and never differed in kind, only in magnitude. For intervention
and punishment, the effect of third parties’ WTR for
the victim was constant across all samples, though Japanese
students intervened and punished less often than either U.S.
sample. For anger, there were minor differences among the
samples in the magnitude of the effects of third parties’ WTRs
for the victim and the transgressor, but they remained in the
same, predicted directions in all samples. Thus, we have initial
evidence that our findings are at least somewhat generalizable
beyond a U.S. student population, both to a more general U.S.
population and to Japanese students.
The low model-predicted probabilities of punishment (.01–.02) we found when WTR for the victim was 0 suggest that
the frequency of third-party punishment has likely been overstated
in the literature that has focused on results from
laboratory-based experimental economics games (for similarly
low rates of punishment in naturalistic settings, see Balafoutas
et al., 2014, 2016). Thus, in addition to providing support for
our hypotheses that third-party anger, intervention, and punishment
vary as a function of the prospective punisher’s WTRs
toward disputants in a conflict, the present study adds to a
growing body of evidence suggesting that direct third-party
punishment on behalf of strangers is not a common feature of
human cooperation (Guala, 2012; Krasnow et al., 2012,
2016; Kriss, Weber, & Xiao, 2016; Pedersen et al., 2013,
2018; Phillips & Cooney, 2005).
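As a concrete illustration of what a model-predicted probability at WTR = 0 means, the sketch below fits a logistic regression of a binary punishment outcome on WTR for the victim and evaluates the fitted curve at zero. The data and coefficients are invented for illustration only; they are chosen so the predicted probability at zero comes out near .02, the order of magnitude reported above.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000

# Hypothetical data: WTR for the victim on a 0-1 scale and a binary
# punishment outcome that becomes more likely as WTR increases.
wtr_victim = rng.uniform(0, 1, size=n)
p_punish = 1 / (1 + np.exp(-(-4.0 + 3.0 * wtr_victim)))
punished = rng.binomial(1, p_punish)

# Logistic regression of punishment on WTR for the victim.
X = sm.add_constant(wtr_victim)
fit = sm.Logit(punished, X).fit(disp=False)

# Model-predicted probability of punishment when WTR for the victim is 0,
# i.e., the kind of quantity reported above as roughly .02.
print(fit.predict([[1.0, 0.0]]))
```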
We recognize that some might view our design choices here
as restrictive because we limited our scope to conflicts where
there was a direct harm to a victim and only considered intervention
and punishment that occurred in the moment. These
were intentional choices to mimic the types of interactions that
are created in the third-party punishment game (Fehr & Fischbacher,
2004), which typically shows that a majority of people
anonymously engage in immediate, uncoordinated, costly punishment
on behalf of victims. These findings have been generalized
to draw conclusions about humans’ willingness to
directly punish transgressions and what this implies for the evolution
of cooperation in humans (Fehr & Fischbacher, 2003;
Henrich et al., 2010, 2006; for review, see Pedersen et al.,
2018). Our results suggest that people are much less likely to
engage in this type of punishment than a direct generalization
of previous laboratory experiments would imply, though perhaps
future studies will show higher rates of after-the-fact punishment
with low-WTR parties than we found here. Thus, it is
important to note that our data cannot speak directly to a
broader range of social norm violations, some of which could
be likely to evoke punishment. Additionally, we did not focus
on indirect types of retaliation, such as gossip, or other mechanisms
that are likely important to maintaining cooperation and
social norms, such as partner choice.
Additionally, the higher rate of intervention than punishment
we observed here comports well with evidence suggesting
that people prefer alternatives to punishment (e.g.,
helping the victim) when they are available (Balafoutas et al.,
2014, 2016; Chavez & Bicchieri, 2013). It also suggests that
shifting focus beyond punishment could be a fruitful approach
to more fully understanding how third parties respond to conflicts
in the real world. We do note that the amount of reported intervention could have been inflated by our asking
subjects to report whether they had “helped” either person
involved in the conflict, though this was asked after subjects
had already chosen a particular conflict to recall and thus
probably did not bias the choice of event in the first place.
It is also possible that our prompt elicited different recollections
between the U.S. and Japanese samples, which could
explain the difference in intervention and punishment rates
between the countries.
This study had some limitations. First, memory limitations
may have prevented people from accurately recalling the
details of past events. For example, subjects’ WTRs for the victims
and transgressors were retrospective; consequently, they
might have been disproportionately reflective of their current
WTRs for the victims and transgressors. Indeed, it is possible
that choosing to intervene or punish increased subjects’ commitment
toward victims and thus could have increased their
WTRs. Although we cannot rule this possibility out given the
nature of our data, we do note that recalled WTRs varied
in the expected way as a function of the relationship between subjects and the victims (see Supplemental Material), which suggests that
reported WTRs did at least moderately correspond to the existing
relationships.
Second, subjects’ reports might have been distorted by
socially desirable responding. The low levels of punishment
speak against this concern, but it might have played a role in
intervention responses. The possibility of socially desirable
responding, in combination with our a priori exclusion of cases in which the costs of intervening were very steep (e.g., conflicts involving guns or multiple transgressors),
leads us to believe that the current study did not underestimate
intervention and punishment frequency. Finally, we did not
code for consolation—attempting to make the victim feel better
after the conflict had ended—and instead treated it as the
same as doing nothing because it had no material effect on the
conflict as it was occurring. Although consolation is certainly
a much less costly helping behavior, it nevertheless may help
the victim and is an important area for future research (De
Waal, 2008).
To conclude, the present investigation moved beyond the
question, “do people punish on behalf of strangers,” to ask,
“when and why do people intervene on behalf of others?” Our
method sampled intervention and punishment decisions
across a wide range of situations and multiple populations,
complementing studies that have examined punishment (and
the desire to punish) in specific real-life situations (Balafoutas
et al., 2014, 2016; Hofmann et al., 2018). Our results converged
with results from these other studies, suggesting that
intervention is much more common than punishment in everyday
life. Perceived welfare interdependence with the victim
emerged as the strongest predictor of intervention and punishment,
signaling its promise as an explanation of involvement in others’ affairs.
Personality traits of the most intelligent: They have higher internal consistency estimates, greater scale variances, and slightly larger scale ranges
A test of the differentiation of personality by intelligence hypothesis using the Big Five personality factors. Julie Aitken Schermer, Denis Bratko, Jelena Matić Bojić. Personality and Individual Differences, Volume 156, April 1 2020, 109764. https://doi.org/10.1016/j.paid.2019.109764
Abstract: The hypothesis that personality is more differentiated, or variable, for individuals higher in intelligence was tested in a large sample (N = 1,050) of young Croatian adults. Participants completed a measure of the Big Five personality factors in self-report format. Also administered was a verbal ability test as an estimate of intelligence. As the verbal ability scores had a normal distribution, tertile splits were created and the lower and higher ability groups were compared on the scale means, standard deviations, scale ranges, and coefficient alpha for each scale. The higher ability tertile had higher internal consistency estimates, greater scale variances, and slightly larger scale ranges. The results therefore provide some support for the differentiation of personality by intelligence hypothesis and do suggest that personality scale responses may differ depending on the intelligence level of the sample.
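The tertile comparison described in the abstract can be sketched as follows; the data frame and column names (a verbal ability score plus Big Five item columns) are hypothetical, and coefficient alpha is computed with the standard formula:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Coefficient alpha for a set of item columns."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# df is assumed to hold one row per participant with a verbal ability score
# and Big Five item columns, e.g. "extraversion_1" ... "extraversion_8".
def compare_tertiles(df: pd.DataFrame, ability_col: str, item_cols: list[str]) -> pd.DataFrame:
    tertile = pd.qcut(df[ability_col], q=3, labels=["low", "mid", "high"])
    rows = []
    for label, group in df.groupby(tertile, observed=True):
        scale = group[item_cols].sum(axis=1)
        rows.append({
            "tertile": label,
            "alpha": cronbach_alpha(group[item_cols]),
            "scale_variance": scale.var(ddof=1),
            "scale_range": scale.max() - scale.min(),
        })
    return pd.DataFrame(rows)
```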
The traditional perspective on the ideology-prejudice relationship suggests that conservatism and associated traits (e.g. low cognitive ability, low openness) are associated with prejudice
Ideological (A)symmetries in prejudice and intergroup bias, Jarret T Crawford, Mark J Brandt. Current Opinion in Behavioral Sciences, Volume 34, August 2020, Pages 40-45. https://doi.org/10.1016/j.cobeha.2019.11.007
Highlights
• The traditional perspective on the ideology-prejudice relationship suggests that conservatism and associated traits (e.g. low cognitive ability, low openness) are associated with prejudice.
• The worldview conflict perspective challenges the traditional perspective by testing prejudice against a more heterogeneous array of target groups.
• Research from the worldview conflict perspective shows that both liberals and conservatives (as well as those low and high in several associated traits) are prejudiced against dissimilar groups.
• There are obvious ways, in which these perspectives differ, but also some common ground (e.g. presumption of some underlying psychological differences between liberals and conservatives).
• This still leaves open questions related to the robustness of underlying assumptions, differences between elites and the public, precise causal processes, and worldview conflict reduction.
Abstract: The traditional perspective on the political ideology and prejudice relationship holds that political conservatism is associated with prejudice, and that the types of dispositional characteristics associated with conservatism (e.g. low cognitive ability, low Openness) explain this relationship. This conclusion is limited by the limited number and types of groups studied. When researchers use a more heterogeneous array of targets, people across the political spectrum express prejudice against groups with dissimilar values and beliefs. Evidence for this worldview conflict perspective emerges in both politics and religion, as well as individual differences such as Openness, disgust sensitivity and cognitive ability. Although these two perspectives differ substantially, there is some identifiable common ground between them, particularly the assumption of some psychological differences between liberals and conservatives. We discuss some remaining open questions related to worldview conflict reduction, causal processes, the robustness of the assumptions of the traditional perspective, and differences between political elites and the public.
‘Good genes’ is the default explanation for the evolution of elaborate ornaments, despite abundant evidence that the most attractive mates are seldom those that produce the most viable offspring
It’s Not about Him: Mismeasuring ‘Good Genes’ in Sexual Selection. Angela M. Achorn, Gil G. Rosenthal. Trends in Ecology & Evolution, December 16 2019, https://doi.org/10.1016/j.tree.2019.11.007
Highlights
. ‘Good genes’ remains the default explanation for the evolution of elaborate ornaments, despite abundant evidence that the most attractive mates are seldom those that produce the most viable offspring.
. ‘Good genes’, in which preferred traits predict offspring viability, is often conflated with other indirect benefits, including genetic compatibility, heterozygosity, and offspring attractiveness.
. Few studies in fact test the key predictions of ‘good genes’ models and, as predicted by theory, they show scant evidence for additive effects of mating decisions on offspring viability.
. Direct tests of indirect genetic benefits should measure the attractiveness and viability of offspring from a large number of matings, distinguish between additive and nonadditive benefits, and control for differential investment in offspring.
Abstract: What explains preferences for elaborate ornamentation in animals? The default answer remains that the prettiest males have the best genes. If mating signals predict good genes, mating preferences evolve because attractive mates yield additive genetic benefits through offspring viability, thereby maximizing chooser fitness. Across disciplines, studies claim ‘good genes’ without measuring mating preferences, measuring offspring viability, distinguishing between additive and nonadditive benefits, or controlling for manipulation of chooser investment. Crucially, studies continue to assert benefits to choosers purely based on signal costs to signalers. A focus on fitness outcomes for choosers suggests that ‘good genes’ are insufficient to explain the evolution of mate choice or of sexual ornamentation.
Keywords: genetic quality, mate choice, indirect benefits, genetic benefits
Reevaluating the Evidence: ‘Good Genes’ in Context
The question of whether sexual selection is good or bad for choosers and populations is a central one in evolutionary biology, animal communication, and conservation biology. By focusing on signals and signal costs, studies often fail to test the basic premise that ornaments are the target of mate choice [46], let alone that they confer any benefits on choosers.
When studies do test for ‘good genes’, evidence suggests that this process accounts for a modest fraction of variance in sexual fitness [2,6,13]. A recent meta-analysis [6] showed that attractiveness was highly heritable, consistent with FLK models, but good genes received mixed support. Attractiveness did not correlate with traits directly associated with fitness (life-history traits). However, attractiveness did positively correlate with physiological traits, such as immunocompetence and condition.
Similarly, recent studies provide at best mixed support for the intuition that the prettiest males have the best genes, although perhaps the most tenacious males do. The clearest evidence for good genes (Table 1) comes from traits for which choosers have limited agency to make mating decisions. Persistent courtship or mating is likely to increase courter success no matter how choosers behave before mating [47], and frequently imposes direct costs on females [48]. The one study that controlled for differential allocation [38] yielded equivocal results. There is widespread evidence that choosers invest more in the offspring of attractive males, but more-ornamented courters may manipulate choosers into investing in a mating beyond the lifetime fitness optima of the choosers.
For conspicuous display traits, weak signals of good genes should be the rule. A seeming paradox of good genes models is that preferences for good genes are most likely to be maintained if genetic effects on viability are weak, since this slows the depletion of genetic variation by selection [49]. ‘Good genes’ are likely to be less important to preference evolution than self-reinforcing coevolution channeled by mating biases [13] and direct selection on mating decisions [13,14] (see Outstanding Questions).
More rigorous measures of ‘good genes’ speak to another central question, namely whether sexual selection is more likely to confer a positive or negative effect on population mean fitness [50]. On the one hand, populations can benefit from accelerated purifying selection through sexual selection on the courting sex, meaning that sexual selection can increase population fitness if there is a positive correlation between preference and fitness. On the other hand, sexual selection can decrease population fitness through reduced viability as a consequence of sexual conflict.
The answer likely depends on the nature of selection experienced by populations. When competing males are parasitized, sexually successful male fruit flies (Drosophila melanogaster) sire more parasite-resistant offspring, while the opposite holds true for winners of contests between unparasitized males [24] (Table 1). Along these lines, a recent meta-analysis [50] used 459 effect sizes from 65 experimental studies in which researchers manipulated the presence or strength of sexual selection, encompassing both intrasexual selection and mate choice, and then measured some aspect of fitness. The results indicate that sexual selection tends to increase population fitness, particularly when populations are exposed to novel environmental conditions.
However, in contrast to Prokop and colleagues’ meta-analysis [6], fitness traits related to immunocompetence were an exception: sexual selection covaried with weaker immunity.
Concluding Remarks
Few studies evaluate the critical predictions of benefit models of mate choice. Those that do suggest that good genes play an important role in adapting to novel environments; the glimpses we have suggest that good genes provide an important, if circumscribed, contribution to chooser fitness.
We can begin to tease apart the selective forces shaping mating preferences, but tests of good genes hypotheses must assess meaningful measures of offspring viability. Assessing offspring viability is conceptually straightforward, if not always easy in wild populations. This can be done by using molecular markers to reconstruct pedigrees and correlating preferred trait expression with survivorship to maturity [36], or by examining proxy measures of viability, such as juvenile growth rate or size [39]. This approach does not rule out the possibility of differential allocation. If choosers invest more in the offspring of attractive partners even at the expense of their own lifetime reproductive success, then mates may not be providing a net increase in average offspring viability [51,52]. Artificial insemination or, in externally fertilizing species, in vitro assays [8] can control for differential allocation.
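A minimal sketch of the kind of test described above, correlating a sire's ornament expression with his offspring's survivorship to maturity across many matings; the data frame and column names are hypothetical, and the analysis is deliberately simplified (no control for differential allocation or nonadditive effects):

```python
import pandas as pd
from scipy import stats

# offspring is assumed to hold one row per offspring, with the sire's
# ornament score and whether the offspring survived to maturity (0/1).
def good_genes_test(offspring: pd.DataFrame) -> tuple[float, float]:
    # Aggregate to sire level: mean ornament score and offspring survival rate,
    # so each sire contributes one observation.
    per_sire = offspring.groupby("sire_id").agg(
        ornament=("sire_ornament", "mean"),
        survival_rate=("survived_to_maturity", "mean"),
    )
    # The key 'good genes' prediction: preferred trait expression should
    # positively predict offspring viability.
    r, p = stats.pearsonr(per_sire["ornament"], per_sire["survival_rate"])
    return r, p
```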
A counterintuitive point about ‘good genes’ and ‘genetic quality’ is that, by definition, they are difficult, if not impossible, to infer from courter genotypic data alone. Notably, a gene favored by selection (e.g., a gene that buffers oxidative stress and helps produce attractive offspring) can carry a higher genetic load with respect to other components of viability [53,54]. An allele that is ‘good’ with respect to courter function may be in linkage disequilibrium with alleles that reduce (or increase) offspring viability. Again, direct measures of fitness are required to measure ‘good genes’. A major challenge to testing for good genes is that these effects are predicted to be weak [6,55] and, therefore, require large sample sizes. For large, long-lived animals with small populations (e.g., nonhuman primates), longitudinal samples across generations provide a feasible, if slow, approach to detecting viability consequences of mate choice. Simply measuring correlations between ornament elaboration and other courter phenotypes does not distinguish among models of signal evolution through mate choice.
‘Good genes’ is appealing because it assigns utilitarian explanations to seemingly extravagant traits and the desires that shape them. There is allure to the idea that mating preferences can increase population mean fitness and local adaptation. A loose construction of ‘good genes’ and ‘genetic quality’ remains the default explanation for mating preferences and sexual dimorphisms outside the immediate field of sexual selection and in the popular literature. We suggest that the persistence of this default view comes from a sloppy conception of these terms that leads to insufficient empirical tests of adaptive hypotheses.
Unfortunately, ‘good genes’ especially lends itself to what Bateson [56], writing about the term ‘mate selection,’ termed ‘unconscious punning’. The term conjures up so much more than ‘breeding value for viability.’ There is a precise technical term, coined by Galton in 1883 [57], which means ‘good genes’ or ‘true genes’ in Greek. Eugenics is tainted forever by the policies it incited, but we remain entranced by the intuition that Beauty marches in lockstep with Truth. Thus, evidence for ‘good genes’ and ‘genetic quality’ in the vernacular sense, is conflated with support for precise evolutionary models. We would hesitate to study the foraging ecology of koalas exclusively by grinding up eucalyptus leaves, but this is all too often the logic we invoke to study mate choice [46]. A perspective centered on choosers, rather than on the signatures that their choices leave on courters, is essential for understanding mate choice and its consequences.
The harsher grading policies in STEM courses disproportionately affect women; restrictions on grading policies that equalize average grades across classes helps to close the STEM gender gap as well as increasing overall enrollment
Equilibrium Grade Inflation with Implications for Female Interest in STEM Majors. Thomas Ahn, Peter Arcidiacono, Amy Hopson, James R. Thomas. NBER Working Paper No. 26556. December 2019. https://www.nber.org/papers/w26556
Abstract: Substantial earnings differences exist across majors, with the majors that pay well also having lower grades and higher workloads. We show that the harsher grading policies in STEM courses disproportionately affect women. To show this, we estimate a model of student demand for courses and optimal effort choices of students conditional on the chosen courses. Instructor grading policies are treated as equilibrium objects that in part depend on student demand for courses. Restrictions on grading policies that equalize average grades across classes help to close the STEM gender gap as well as increase overall enrollment in STEM classes.
5.3 Grade estimates
The estimated αs, the department-specific ability weights, are given in Table 6. These are calculated by taking the reduced-form θs, undoing the normalization on the γs, and subtracting off the part of the reduced-form θs that reflects study time (taken from ψ). The departments are sorted such that those with the lowest female estimate are listed first. Note that in all departments the female estimate is negative. This occurs because females study substantially more than males yet receive only slightly higher grades. Given that sorting into universities takes place on both cognitive and non-cognitive skills and that women have a comparative advantage in non-cognitive skills, males at UK have higher cognitive skills than their female counterparts even though in the population cognitive skills are similar between men and women. Negative estimates are also found for Hispanics. While Hispanics have higher grades than African Americans, our estimates of the study costs suggested that they also studied substantially more. Given the very high estimate of Hispanic study time, we would have expected Hispanics to perform even better in the classroom than they actually did if their baseline abilities were similar to African Americans.

With the estimates of the grading equation, we can report expected grades for an average student. We do this for freshmen, separately by gender, both unconditionally and conditional on taking courses in that department in the semester we study. Results are presented in Table 7. Three patterns stand out. First, there is positive selection into STEM courses: generally, those who take STEM classes are expected to perform better than the average student. This is not the case for many departments; indeed, the second pattern is that negative selection is more likely to occur in departments with higher grades. Finally, women are disproportionately represented in departments that give higher grades for the average student. Of the seven departments that give the highest grades for the average student (female or male), all have a larger fraction female than the overall population. In contrast, of the five departments that give the lowest grades (STEM and Economics), females are under-represented relative to the overall population in all but one (Biology).
Wealth Taxation in the United States: The effect of the Swiss tax and Warren tax on wealth inequality is miniscule, lowering the Gini coefficient by at most 0.0005 Gini points
Wealth Taxation in the United States. Edward N. Wolff. NBER Working Paper No. 26544. December 2019. https://www.nber.org/papers/w26544
Abstract: The paper analyzes the fiscal effects of a Swiss-type tax on household wealth, with a $120,000 exemption and marginal tax rates running from 0.05 to 0.3 percent on $2,400,000 or more of wealth. It also considers a wealth tax proposed by Senator Elizabeth Warren with a $50,000,000 exemption, a two percent tax on wealth above that and a one percent surcharge on wealth above $1,000,000,000. Based on the 2016 Survey of Consumer Finances, the Swiss tax would yield $189.3 billion and the Warren tax $303.4 billion. Only 0.07 percent of households would pay the Warren tax, compared to 44.3 percent for the Swiss tax. The Swiss tax would have a very small effect on income inequality, lowering the post-tax Gini coefficient by 0.004 Gini points. The effect of the Swiss tax and Warren tax on wealth inequality is miniscule, lowering the Gini coefficient by at most 0.0005 Gini points.
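The Warren-style schedule in the abstract is fully specified, so the liability for a single household can be sketched directly (the Swiss-type tax is omitted because the intermediate bracket boundaries between the $120,000 exemption and the $2,400,000 top bracket are not given here):

```python
def warren_tax(wealth: float) -> float:
    """Warren-style wealth tax as described in the abstract:
    2% on wealth above a $50 million exemption, plus a 1% surcharge
    on wealth above $1 billion."""
    tax = 0.02 * max(0.0, wealth - 50_000_000)
    tax += 0.01 * max(0.0, wealth - 1_000_000_000)
    return tax

# Examples: a $100 million household and a $2 billion household.
print(warren_tax(100_000_000))    # 1,000,000
print(warren_tax(2_000_000_000))  # 49,000,000
```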
The (In)accuracy of Forecast Revisions in a Football Score Prediction Game: Better go with your gut instinct
Going with your Gut: The (In)accuracy of Forecast Revisions in a Football Score Prediction Game. Carl Singleton, James Reade, Alasdair Brown. Journal of Behavioral and Experimental Economics, December 16 2019, 101502. https://doi.org/10.1016/j.socec.2019.101502
Highlights
• Judgement revisions led to worse performance in a football score prediction game
• This is robust to the average forecasting ability of individuals playing the game
• Revisions to the forecast number of goals scored in matches are generally excessive
Abstract: This paper studies 150 individuals who each chose to forecast the outcome of 380 fixed events, namely all football matches during the 2017/18 season of the English Premier League. The focus is on whether revisions to these forecasts before the matches began improved the likelihood of predicting correct scorelines and results. Against what theory might expect, we show how these revisions tended towards significantly worse forecasting performance, suggesting that individuals should have stuck with their initial judgements, or their ‘gut instincts’. This result is robust to both differences in the average forecasting ability of individuals and the predictability of matches. We find evidence this is because revisions to the forecast number of goals scored in football matches are generally excessive, especially when these forecasts were increased rather than decreased.
6 Summary and further discussion
In this paper, we have analysed the forecasting performance of individuals who each applied
their judgement to predict the outcomes of many fixed events. The context of this analysis was
the scoreline outcomes of professional football matches. We found that when individuals made revisions, their likelihood of predicting a correct scoreline, which was around 9% when they never made a revision, significantly decreased. The same applied to forecast
revisions to the result outcomes of matches. Not only were these findings robust to unobserved
individual forecasting ability and the predictability of events, but also there is evidence that
performance would have improved had initial judgements been followed.
As already mentioned, these results have some similarities with those found previously in
the behavioural forecasting literature. One explanation could be that game players anchor their
beliefs, expectations and, consequently, their forecasts on past or initial values. However, this
behaviour would not be consistent with our finding that on average forecasters made revisions
which not only improved on their goals scored forecast errors but which were also excessive.
There are several areas for further research, which could be explored with extensions of
the dataset used here. First, it appears to be a relatively open question as to how sources of
bias among sports forecasters interact with how they make revisions, such as the well-known
favourite-longshot bias. Second, players of the forecasting game studied here do reveal which EPL
team they have the greatest affinity for, though we are yet to observe this information ourselves. It
is an interesting question as to whether any wishful-thinking by the players manifests itself more
greatly before or after they revise their forecasts. Third, an aspect which could be studied from
these current data is whether players improve their forecasts over time, and if they learn how to
play more to the rules of the game itself, which should lead them to favour more conservative goals
forecasts. Fourth, these results concern a selective random sample of players who “completed”
the game. These are likely to be individuals who extract significant utility from making forecasts
of football match scorelines, who are thus more likely to return to their initial forecasts and make
revisions. It would be interesting to know whether more casual forecasters are better at sticking with their gut instincts, or better off for doing so. Finally, our results suggested an innovation to the game
which could improve the crowd’s forecasting accuracy and which could be easily tested: before
making forecasts, some of the game players could be informed that sticking with their initial
judgement, or gut instinct, is likely to improve their chances of picking a correct score.
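The headline comparison, the share of exactly correct scorelines for revised versus unrevised forecasts, could be computed along these lines; the data frame and column names are hypothetical:

```python
import pandas as pd

# forecasts is assumed to hold one row per player-match forecast with columns:
#   revised (bool), predicted_home, predicted_away, actual_home, actual_away.
def correct_score_rate_by_revision(forecasts: pd.DataFrame) -> pd.Series:
    correct = (
        (forecasts["predicted_home"] == forecasts["actual_home"])
        & (forecasts["predicted_away"] == forecasts["actual_away"])
    )
    # Share of exactly correct scorelines for revised vs. unrevised forecasts;
    # the paper reports roughly 9% when forecasts were never revised.
    return correct.groupby(forecasts["revised"]).mean()
```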
Replication of the "Asch Effect" in Bosnia and Herzegovina: Evidence for the Moderating Role of Group Similarity in Conformity
Replication of the "Asch Effect" in Bosnia and Herzegovina: Evidence for the Moderating Role of Group Similarity in Conformity. Muamer Ušto, Saša Drače, Nina Hadžiahmetović. Psychological Topics , Vol 28, No 3 (2019). www.pt.ffri.hr/index.php/pt/article/view/507
Abstract: In the present study, we tried to replicate the classic Asch effect in the cultural context of Bosnia-Herzegovina and to explore the potential impact of group similarity on conformity. To answer these questions, Bosniak (Muslim) students (N = 95) performed the classic Asch line judgment task in the presence of five confederates (the majority) who were ostensibly either of a similar ethnic origin (in-group), different ethnic origin (out-group), or no salient ethnic origin. The task involved choosing one of three comparison lines that was equal in length to a test line. Each participant went through 18 test trials, including 12 critical trials in which confederates provided an obviously wrong answer. In line with past research, the results revealed a clear-cut and powerful "Asch effect" wherein participants followed the majority in 35.4% of critical trials. More importantly, this effect was moderated by group similarity. Thus, in comparison to the no salient group identity condition, conformity was maximized in the in-group majority condition and minimized in the out-group majority condition. Taken together, our results support the universal finding of the "Asch effect" and provide clear evidence that similarity with the majority plays an important role in the conformity phenomenon.
Keywords: conformity; Asch effect; self-categorization theory; group similarity
Discussion
In line with prior findings (e.g., Nicholson et al., 1985) we replicated the Asch
conformity effect. More than sixty years after Asch originally showed that American
students' judgments in an objective perception task were affected by the erroneous
estimates given by unanimous majority group, Bosnian students were similarly
influenced under the same experimental circumstances. Interestingly, the conformity
in our sample even exceeded the usual level found in other replications of the Asch
experiment (20-30%, cf. Nicholson et al., 1985; Ross, Bierbrauer, & Hoffman, 1976;
Walker & Andrade, 1996). As we can see in Table 1, participants generally followed
the majority in 4 out of 12 critical trials (33.3%). However, when we look only at the
standard condition, where ethnic identity was not salient, we can see that the number of errors was even larger, approaching the conformity level obtained by Asch in a similar condition. One reason for these findings could lie in cross-cultural differences
on the dimension of individualism-collectivism (Bond & Smith, 1996). In general,
individualistic cultures tend to prioritize independence and uniqueness as cultural
values. Collectivistic cultures, on the other hand, tend to see people as connected
with others and embedded in a broader social context. As such, they tend to
emphasize interdependence, family relationships, and social conformity. Given that
Bosnia and Herzegovina is closer to collectivistic values (probably due to
communism residues) than North America and Western European countries, this
could explain higher levels of conformity in our sample.
Despite this converging evidence in favour of conformity phenomenon, some
authors (e.g., Friend, Rafferty, & Bramel, 1990) pointed out that most people are not
conformists, but that only some individuals tend to conform due to individual
differences in personality. Therefore, it is possible that those conformist personalities
tend to maximize conformity rate, which may also explain the results in our study. If
this hypothesis is true, then the Asch effect should occur only for participants having
conformity disposition, but not for the rest of them: a hypothesis which was
disconfirmed by our results. Indeed, the follow-up analysis conducted without
participants who conformed on every stimulus revealed an overall level of conformity of 36.29%. Moreover, the fact that 59.2% of subjects conformed on at least one critical trial indicates that the majority of people exposed to the influence of
others tend to display conformist behaviour. Thus, the results we observed point to
conformity as a rather global phenomenon, which could not be attributed to the
idiosyncratic features of our subjects.
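The follow-up analysis described above amounts to two simple computations: recomputing the mean conformity rate after dropping participants who conformed on every critical trial, and counting the share of participants who conformed at least once. A minimal Python sketch, again using hypothetical per-participant counts rather than the study's data:

# Hypothetical counts of conforming responses (out of 12 critical trials),
# one entry per participant; illustrative only.
counts = [4, 0, 7, 3, 12, 5, 0, 2, 6, 1]
N_CRITICAL = 12

def mean_rate(cs):
    # Mean proportion of critical trials on which participants conformed.
    return sum(c / N_CRITICAL for c in cs) / len(cs)

# Robustness check: drop participants who conformed on every critical trial.
without_full_conformists = [c for c in counts if c < N_CRITICAL]
print(f"rate excluding full conformists: {mean_rate(without_full_conformists):.2%}")

# Share of participants who conformed on at least one critical trial.
share_at_least_once = sum(c > 0 for c in counts) / len(counts)
print(f"conformed at least once: {share_at_least_once:.1%}")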
Besides the cross-cultural replication, another important aspect of our study is that it showed the Asch effect to be clearly moderated by group similarity. Consistent with the assumptions of self-categorization theory (SCT; Turner, 1991; Turner et al., 1987), participants exposed to the in-group majority showed an increase in conformity compared with the standard condition, in which group identity was not salient. Conversely, when the majority was presented as the out-group, the conformity effect dropped significantly. Thus, we replicated and extended past research (Abrams et al., 1990), showing that self-categorization can play a determining role in conformity even under more minimal conditions, in which the salient in-group and out-group characteristics (i.e., ethnicity) were completely irrelevant to the task at hand. As such, our findings cannot be accounted for by potential differences in objective informational value (i.e., competence), but rather reflect perceived similarity with the majority. In addition, it should be noted that by including in-groups and out-groups that reflect the prototypical ethnic divisions of Bosnian society, we created conditions that enhance the ecological validity of the present study. From this perspective, our findings could have interesting implications for understanding social influence processes in real life. Indeed, having shown how similarity with a particular ethnic group moderates conformity in a completely unambiguous task, we can readily anticipate the power of the self-categorization process in situations where people face a more complex and uncertain social reality involving real group interests, such as supporting a political decision or voting in contexts where group membership is highly salient.