Do humans agree on which body odors are attractive, similar to the agreement observed when rating faces and voices? Megan Nicole Williams, Coren Lee Apicella. Evolution and Human Behavior, February 14 202. https://doi.org/10.1016/j.evolhumbehav.2023.02.002
Abstract: Studies of mate choice from an evolutionary perspective often begin by investigating whether individuals of one sex share similar preferences for mates. Evidence for shared preferences is often interpreted as support for the hypothesis that preferences are adaptations that have evolved to select high-quality mates. To date, the importance of body odor in human mate choice is uncertain because fundamental questions, such as whether preferences for body odor are shared, have not yet been systematically explored. Here, we asked groups of heterosexual men and women from the University of Pennsylvania to rate the attractiveness of body odors, faces, and voices of opposite-sex individuals. We used our data to produce quantitative estimates of the amount of rater agreement for each of the three modalities of attractiveness, applying a uniform methodology that facilitates cross-modality comparisons. Overall, we found evidence of agreement within all three modalities. Yet, our data also suggest a larger component of attractiveness judgments that can be attributed to personal preferences and idiosyncratic noise. Importantly, our results provide no evidence that agreement regarding odor attractiveness is substantially quantitatively different from the amount of agreement found in other modalities that have been the focus of most previous work. To the extent that evidence exists of shared preferences for faces and voices, our results reveal evidence of shared preferences for body odors.
Keywords: Olfaction; Body odor; Mate choice; Face attractiveness; Voice attractiveness; Multimodal perception
4. Discussion
Possibly the most conclusive and replicable finding in social psychology is that attractiveness is an important factor in social interactions (for a review, see Grammer et al., 2003). Symons (1979) suggested that shared attractiveness preferences are evolved adaptations for choosing fitness-enhancing mates, and since the 1990s, evidence has accumulated demonstrating shared attractiveness preferences for others' faces (e.g., Grammer & Thornhill, 1994; Langlois & Roggman, 1990; Mealey et al., 1999; Perrett et al., 1999; Rhodes, Proffitt, Grady, & Sumich, 1998; Rhodes, Sumich, & Byatt, 1999; Rhodes & Tremewan, 1996), bodies (e.g., Singh, 1993; Singh et al., 2010; Singh & Young, 1995), and voices (Collins, 2000; Feinberg et al., 2005; Puts et al., 2013). Here, we investigated whether there is evidence of shared attractiveness preferences for body odor, as has been observed in these other modalities.
To provide a benchmark from which we could assess our evidence for agreement in judgments of body odors, we used the same methodology, sample, and analysis to also examine agreement in judgments of faces and voices. Thus, any differences between modalities in the variance attributable to agreement could not be caused by differences in the sample or analysis. We found no significant differences in levels of agreement in attractiveness ratings between modalities. However, there was evidence of little agreement overall in female ratings of men's attractiveness when using the individual-agreement ICC. Yet, we do report fair to good agreement in all attractiveness modalities using the average-agreement ICC (k = 4). The average-agreement ICC removes measurement noise caused by any one rater's ratings; thus, we expected that the average-agreement estimates would be higher than the individual-agreement estimates. For male ratings of women's attractiveness, we found that agreement in the attractiveness modalities was statistically distinguishable from zero, but low. Our samples of male and female raters were not large enough to detect sex differences in agreement in attractiveness preferences. We estimated the magnitude of the difference between male and female rater agreement for each modality, but the standard error of each estimated sex difference was too large to allow for conclusions from this observation. Few studies have evaluated sex differences in rater agreement for judgments of attractiveness; however, some research suggests either no significant differences in agreement (e.g., Coetzee et al., 2014) or greater consensus among men (e.g., Rhodes et al., 1998). Higher agreement in men is consistent with the idea that attractiveness plays a larger role in male mate choice whereas, for example, social status is more important for female mate choice (Buss, 1989).
However, further research is necessary, using samples large enough to detect small sex differences and evaluating the underlying fitness markers that influence attractiveness judgments in each sensory modality. Encouragingly, our findings again indicate statistically equivalent levels of agreement in judgments of attractiveness for each modality (i.e., face, voice, and odor) within both sexes. So, although we cannot make a strong claim for evidence of evolved attractiveness preferences, especially because we are unsure of how much agreement would constitute evidence, our data do demonstrate that to the degree that shared preferences exist for faces and voices, they also exist for body odors.
While our estimated agreement for within-sex judgments of opposite-sex attractiveness in each modality seems lower than estimates reported in earlier studies, the parameters we used were different and not necessarily at odds. For example, Thornhill and Gangestad (1999) measured consistencies for male (n = 61) and female (n = 48) ratings of opposite-sex body odors using Cronbach's alpha (α = 0.66, high-fertility female raters; α = 0.90, low-fertility female raters; α = 0.90, male raters). Similarly, Lobmaier, Fischbacher, Wirthmüller, and Knoch (2018) reported an ICC of 0.983 for male (n = 55) ratings of women's (n = 28) body odors. As discussed at length elsewhere (Hehman, Sutherland, Flake, & Slepian, 2017; Hönekopp, 2006), high alphas and average-agreement ICC estimates do not necessarily provide evidence of strong interrater agreement. The fundamental difficulty is that these parameters are strongly influenced by the number of items (here, raters), which often varies across studies, hampering comparability. Likewise, an ICC near one is very hard to interpret unless it is made explicit which of the many possible ICCs has been estimated (McGraw & Wong, 1996). Through personal correspondence (June 8, 2020), we were able to determine that the parameter estimated by Lobmaier, Fischbacher, Wirthmüller, and Knoch (2018) was the average-agreement ICC for n = 55 male raters. Because their study estimated a different parameter than the present study, the lower estimates we have reported are not at odds with what they found. On the contrary, the value of the individual-agreement ICC, ρ_{A,1}, implied by Lobmaier, Fischbacher, Wirthmüller, and Knoch's (2018) estimate of the average-agreement ICC, ρ_{A,55}, is around 0.5 and hence in the same ballpark as the estimates of individual agreement reported in the present study. See Bliese (2000) for the formulae needed to rescale parameter estimates for comparability.
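The rescaling between an average-agreement ICC and the individual-agreement ICC it implies can be illustrated with a short sketch. It assumes the standard Spearman-Brown relationship between single-rater and k-rater reliabilities, which we take to be the kind of formula referred to in Bliese (2000); the function names are hypothetical and chosen only for illustration.

```python
def step_down(icc_avg: float, k: int) -> float:
    """Individual-agreement ICC implied by an average-agreement ICC over k raters.

    Assumes the Spearman-Brown relationship (an assumption of this sketch).
    """
    return icc_avg / (k - (k - 1) * icc_avg)


def step_up(icc_single: float, k: int) -> float:
    """Average-agreement ICC for k raters implied by an individual-agreement ICC."""
    return k * icc_single / (1 + (k - 1) * icc_single)


# Reported average-agreement ICC of 0.983 for k = 55 raters.
implied_single = step_down(0.983, 55)
print(round(implied_single, 2))  # 0.51

# Rescaling back up recovers the reported value (up to floating-point error).
print(round(step_up(implied_single, 55), 3))  # 0.983
```

Under this assumption, an average-agreement ICC close to one with 55 raters corresponds to an individual-agreement ICC of roughly 0.5, which shows why estimates based on different numbers of raters should not be compared without rescaling.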
Misinterpretations of Cronbach's alpha and the average-agreement ICC can cause overestimations of the strength of evidence for shared attractiveness preferences because the contribution of personal preference is typically unreported or defined as random noise (Hönekopp, 2006). Our analysis reported not only the average-agreement ICC (k = 4) but also the individual-agreement ICC, which reports the correlation between the individual judgments of two raters assigned to the same donor. The individual-agreement ICC parameters reported here show that there is some agreement between raters' judgments in each attractiveness modality that can be attributed to a shared preference, but a larger component also exists that can be attributed to personal preference and noise. Our individual-agreement ICC estimates are in line with recent research using statistical methods that account for variance in attractiveness judgments of faces attributable to both donor (i.e., shared preference) and rater (i.e., personal preference) characteristics (e.g., Hehman, Sutherland, Flake, & Slepian, 2017; Hönekopp, 2006). For example, Hönekopp's (2006) pioneering study found that, in contrast to the prevailing view that facial attractiveness judgments are largely based on donor characteristics and shared universally, variation in judgments of attractiveness was equally explained by perceiver characteristics (i.e., personal preference). In experiment 2, which used a racially heterogeneous sample similar to our sample of participants, Hönekopp (2006) estimated that 56% of variance in attractiveness judgments is attributable to the rater (i.e., personal preference). Thus, future work should explore the relative contributions of personal and shared preferences to body odor attractiveness judgments and investigate the underlying fitness markers influencing each.
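To make the distinction between the two parameters concrete, the following sketch estimates both from a donors-by-raters matrix using a generic one-way random-effects ANOVA estimator. This is an illustrative assumption and not necessarily the estimator used in the present study; the simulated data, the variance values, and the function name are hypothetical.

```python
import numpy as np


def one_way_icc(ratings):
    """Return (individual-agreement, average-agreement) one-way ICC estimates.

    `ratings` is an (n donors x k raters) array; each row holds the k
    ratings given to one donor. Generic ANOVA estimator, used here only
    as an illustration.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    donor_means = x.mean(axis=1)
    # Between-donor and within-donor mean squares.
    msb = k * ((donor_means - x.mean()) ** 2).sum() / (n - 1)
    msw = ((x - donor_means[:, None]) ** 2).sum() / (n * (k - 1))
    icc_single = (msb - msw) / (msb + (k - 1) * msw)
    icc_avg = (msb - msw) / msb
    return icc_single, icc_avg


# Simulated ratings (hypothetical values): the shared donor component has
# variance 1, and personal preference plus noise has variance 3, so the
# individual-agreement ICC in the simulated population is 0.25.
rng = np.random.default_rng(0)
n_donors, k_raters = 2000, 4
shared = rng.normal(0.0, 1.0, size=(n_donors, 1))
idiosyncratic = rng.normal(0.0, np.sqrt(3.0), size=(n_donors, k_raters))
icc_single, icc_avg = one_way_icc(shared + idiosyncratic)
print(icc_single, icc_avg)
```

In this simulation the average-agreement estimate exceeds the individual-agreement estimate even though most of the variance in any single rating is idiosyncratic, which is the pattern that makes average-agreement values easy to over-interpret.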
4.1. Study limitations
Though our findings support the hypothesis that shared preferences for body odors exist to the extent that shared preferences for faces and voices exist, convenience sampling limits the strength of our interpretation. The current study cannot fully distinguish between attractiveness preferences that persist today because past selection favored reliable developmental patterns and preferences that exist because selection favored labile and culturally responsive preferences, since we investigated preferences in a single society. In general, cross-cultural research on odor perception is scant, particularly for mate choice. However, evidence shows that in traditional societies where odor is more significant to daily activities, such as food foraging, olfactory performance and cognition are superior to those of individuals living in industrialized cities (Burenhult & Majid, 2011; Majid & Burenhult, 2014; Majid & Kruspe, 2018; Sorokowska, Sorokowski, Hummel, & Huanca, 2013; Wnuk & Majid, 2012). Future research should investigate body odor preferences cross-culturally. Facial averageness and symmetry are generally accepted as cues of mate quality, in part because both predict attractiveness judgments across different societies (e.g., Apicella, Little, & Marlowe, 2007; Cunningham, Roberts, Barbee, Druen, & Wu, 1995; Jones & Hill, 1993; Little, Apicella, & Marlowe, 2007; Rhodes et al., 2001). Although demonstrating that, to a degree, some men and women generally smell more attractive than others is a promising first step, additional steps must be taken before we can conclude body odor preferences are adaptations for optimal mate selection.
In addition, we used a racially heterogeneous sample to estimate agreement in judgments of attractiveness. Therefore, our estimates of agreement may be deflated relative to those we would have obtained with a racially homogeneous sample. Race has been demonstrated to influence attractiveness preferences for faces and voices (e.g., Wheatley et al., 2014). Unfortunately, it was not feasible with our sample to perform a robustness analysis estimating interrater reliability within independent homogeneous subgroups of participants.
Further, outside of a controlled laboratory setting, humans often wear fragrances, shower, and choose to eat food regardless of its aromatic properties. While controlling for these variables by instituting a two-day washout period before odor sampling is standard procedure in this literature, we are unaware of studies demonstrating that two days are adequate to restore a donor's “natural” body odor. Thus, these methods could result in evidence that raters agree on odors but not necessarily “natural” body odors.
Finally, we did not control for potential menstrual cycle effects and oral contraception usage. We initially planned to analyze hormone data to assess women's oestradiol and progesterone levels, which would have been indicative of cycle phase. However, we were not able to assay our samples due to laboratory and labor disruptions associated with the COVID-19 pandemic. Although menstrual cycle effects are heavily debated (Gildersleeve, Haselton, & Fales, 2014; Harris, 2011, 2013; Harris, Chabot, & Mickes, 2013; Harris, Pashler, & Mickes, 2014; Wood & Carden, 2014; Wood, Kressel, Joshi, & Louie, 2014), some studies demonstrate that menstrual cycle phase and hormonal contraceptive use affect women's perceptions of men's body odor (Grammer, 1993; Havliček, Roberts, & Flegr, 2005; Hummel, Gollisch, Wildt, & Kobal, 1991; Sorokowska, Sorokowski, & Szmajke, 2012; Thornhill, Chapman, & Gangestad, 2013), faces (e.g., Ditzen, Palm-Fischbacher, Gossweiler, Stucky, & Ehlert, 2017; Johnston, Hagel, Franklin, Fink, & Grammer, 2001; Little, Burriss, Petrie, Jones, & Roberts, 2013; Little & Jones, 2012; Penton-Voak et al., 1999; Penton-Voak & Perrett, 2000), and voices (Feinberg et al., 2006; Pisanski et al., 2014; Puts, 2005, 2006). Conversely, other studies cast doubt on the existence of cycle shifts in preferences for faces (e.g., Jones et al., 2018; Marcinkowska, Hahn, Little, DeBruine, & Jones, 2019) and voices (e.g., Jünger et al., 2018). Yet, because we did not collect the necessary data to examine menstrual cycle effects in the current study, we cannot contribute to this important debate in a meaningful way.