Tuesday, June 22, 2021

The first large-scale assessment of ravens' cognitive abilities suggests that, by 4 months of age, ravens do about as well as adult chimps and orangutans on tests of causal reasoning, social learning, theory of mind, &c.

Ravens parallel great apes in physical and social cognitive skills. Simone Pika, Miriam Jennifer Sima, Christian R. Blum, Esther Herrmann & Roger Mundry. Scientific Reports volume 10, Article number: 20617, Dec 10 2020. https://www.nature.com/articles/s41598-020-77060-8

Abstract: Human children show unique cognitive skills for dealing with the social world but their cognitive performance is paralleled by great apes in many tasks dealing with the physical world. Recent studies suggested that members of a songbird family—corvids—also evolved complex cognitive skills but a detailed understanding of the full scope of their cognition was, until now, not existent. Furthermore, relatively little is known about their cognitive development. Here, we conducted the first systematic, quantitative large-scale assessment of physical and social cognitive performance of common ravens with a special focus on development. To do so, we fine-tuned one of the most comprehensive experimental test-batteries, the Primate Cognition Test Battery (PCTB), to raven features enabling also a direct, quantitative comparison with the cognitive performance of two great ape species. Full-blown cognitive skills were already present at the age of four months with subadult ravens’ cognitive performance appearing very similar to that of adult apes in tasks of physical (quantities, and causality) and social cognition (social learning, communication, and theory of mind). These unprecedented findings strengthen recent assessments of ravens’ general intelligence, and aid to the growing evidence that the lack of a specific cortical architecture does not hinder advanced cognitive skills. Difficulties in certain cognitive scales further emphasize the quest to develop comparative test batteries that tap into true species rather than human specific cognitive skills, and suggest that socialization of test individuals may play a crucial role. We conclude to pay more attention to the impact of personality on cognitive output, and a currently neglected topic in Animal Cognition—the linkage between ontogeny and cognitive performance.


Discussion

Here, we provide the first quantitative, large-scale investigation of physical and social cognitive skills in a large-brained songbird species, the raven. We particularly examined the effect of development on cognitive performance, and revisited the claim that corvids rival non-human primates in their cognitive abilities34,40. To achieve these goals, we fine-tuned one of the most elaborate large-scale cognitive test batteries, the PCTB10, to raven features. The results demonstrated that our ravens showed comparable cognitive performance in the domains of social and physical cognition. Performance was highest in tests of quantitative skills and lowest in tests of spatial skills. Full-blown cognitive skills were already present at the age of four months, and did not significantly change within the investigated time window. The quantitative cross-species comparison showed that, with the exception of spatial skills, the cognitive performance of our birds was on par with that of orang-utans and chimpanzees.

In the following, we will discuss these findings in detail.

Cognitive performance in physical and social cognitive scales

Overall, we found that our ravens’ physical cognitive performance was very similar to their social cognitive performance, with highest performance scores in quantitative skills and lowest performance scores in spatial skills. These results are not in line with our prediction suggesting that ravens perform differently in the domains of physical and social cognition48.

There are several possible explanations. First, differences in physical and social cognitive performance may have simply been obscured by the use of a cognitive test battery designed to tackle potential drivers of human cognitive evolution (see for similar accounts18,89). For instance, task design in the PCTB is anchored in the challenges faced by humans and great apes in their daily lives: to find and locate food, use tools and cope with conspecifics. In contrast, although ravens also have to deal with the challenges of discovering and locating food and manoeuvring in a complex social world, they extensively scatter-hoard carcass meat and are non-habitual tool-users47,90. The test battery may therefore have not been suitable to pinpoint differences in ravens’ physical and social cognitive skills. However, if this explanation is true, we would have expected to find no differences between scales which does not accord with our observations (but see for a recent study on parrots56).

Second, differences in physical and social cognitive performance may only develop later than 16 months of age, and were thus not detected across the four investigated time points. If this explanation were true, we would have expected to find no differences between any tested physical and social cognitive scale across the four different time points, but this was not the case (see Table S4). In addition, recent studies on the development of gaze following skills77 and sensorimotor abilities of ravens72 showed that the general developmental pace is very fast compared to that of other bird and mammal species.

Third, the assumption that ravens have specialized in the social rather than the physical domain48 may simply be due to a shortage of data. Indeed, because ravens live in complex societies characterized by fission–fusion dynamics, researchers have been fascinated with their social cognitive abilities (see for recent reviews40,49). In addition, studies examining single cognitive aspects have provided many crucial aspects to the remarkable tool-kit of ravens’ physical and social cognitive skills (e.g. 42,46,91,92). Furthermore, ravens are renowned for caching and hoarding food40, combining both sophisticated social (e.g., being highly sensitive to the presence of predators and/or conspecifics that may pilfer caches40,47), and physical cognitive skills (such as remembering where and how much food was cached40,47). Hence, our results reveal that ravens are both social and physical intellects, and strengthen recent suggestions that ravens’ cognitive skills are an expression of general rather than domain specific intelligence36.

In addition, a recent reanalysis of the original PCTB dataset of chimpanzees and children75 using a confirmatory factor analysis (CFA) did not support the original division of the test battery into a social and a physical cognitive domain. Instead, it identified a spatial cognition factor (see also93), suggesting to move beyond the idea that social cognition might be dissociable from physical cognition and evolved separately. The study, thereby, also adds important fuel to the recent debate on cognitive test batteries in animal cognition research (e.g. 18,56,89). For instance, some scholars stress to pay more attention to overlooked task demands that may affect performance (e.g., tracking the movement of human experimenters94), while others suggest to improve test batteries on multiple fronts such as the design of the tasks, the domains targeted and the species tested95. Furthermore, scholars emphasized the importance of addressing the same conceptual question by using tasks that a given species can solve50. In addition, Völter and colleagues96 proposed a psychometric approach involving a three-step program consisting of (1) tasks that reveal signature limits in performance (i.e. the way individuals make mistakes), (2) assessments of the reliability of individual differences in task performance, and (3) multi-trait multi-method test batteries.

The development of cognitive skills

The results showed that our ravens’ cognitive performance did not change across the four investigated time points of four, eight, twelve and 16 months respectively. These findings support the prediction that ravens undergo a relatively rapid cognitive development. They further expand recent results on single cognitive skills and sensorimotor development68,72 in ravens to the physical cognitive scales of Causalities and Quantities and the manifold domain of social cognition. For instance, Schloegl and colleagues77, combining natural observations and behavioural experiments, showed that ravens, shortly after fledging (between 8–15 weeks of age), started to follow the gaze (look where others look) of a conspecific and a human experimenter. This developmental period coincides with ravens still living with their family groups, and the parents still (partially) providing for them. Similarly, studies on two primate species, macaques (Macaca nemestrina) and chimpanzees, revealed that individuals of these species started to follow the look-ups of human experimenters at the end of infancy97,98. Furthermore, our results are also in line with recent studies on other corvid species linking object permanence abilities to general development. For instance, Pollok and colleagues67 showed that magpies master Piagetian Stages 4 and 5 before nutritional independence. Hoffmann and colleagues99 investigated whether object permanence abilities are a function of the duration of development across four corvid species. Taking the hatching-to-fledging time as an indicator for development, they showed that Eurasian jays needed by far the shortest time for passing Stage 5 (6 weeks of age) and Stage 6 (7 weeks of age), with carrion crows (Stage 5: 11 weeks of age; Stage 6: 13 weeks of age) and ravens (Stage 5: 11 weeks of age; Stage 6: 14 weeks of age) following several weeks later.

These results are in contrast to findings on individuals of two psittacine species (Cyanoramphus auriceps, Psittacus erithacus), which show considerably slower developmental paces and achieve Piagetian Stage 5 only after independence (between 19 weeks of age, respectively 18 weeks of age)67. The differences in developmental speed and the linkage to general developmental patterns may reflect a general difference in maturing executive functions and hence cognitive trajectories of corvids and parrots99. However, it may also be possible that rapid cognitive development has been selected for in food-storing species, which use memory to retrieve stored food and have a larger hippocampus relative to the rest of the telencephalon than do species that store little or no food14,59.

Since ravens’ survival and reproductive output relies heavily on successful cooperation and alliances40,47, the rapid pace of ravens’ cognitive toolkit in the physical and social domain may thus also represent a selective response to manoeuvring in a world characterized by the complex challenges of an ever-changing ecological environment and governed by highly cooperative motives46,47.

Comparison of cognitive performance of ravens and great apes

With the exception of spatial skills, the quantitative comparison of performance scores of our ravens and the great ape individuals showed considerable similarities across the two domains of physical and social cognition. These results are also in line with a recent study using the PCTB to test cognitive performance of two Old World monkey species with chimpanzees showing higher performance scores than macaques in tasks of spatial understanding and tool-use only18. Since ravens perform impressive flight acrobatics, rely heavily on caching and pilfering of food-stores40,47, and have been shown to master stage 6 of object permanence68, the relatively low performance scores in the Space scale are surprising. Similarly, a recent study using the PCTB to investigate and compare cognitive skills of four parrot species (Ara glaucogularis, Ara ambiguus, Primolius couloni, Psittacus erithacus) showed that the parrots’ performance was also relatively poor in the scale Space (but also across all other scales tested). Individuals were significantly above chance only in the object permanence (Ara glaucogularis, Primolius couloni, Psittacus erithacus), and the rotation task (Ara glaucogularis)56. Hence, our findings may echo Köhler who noted that “the success of the intelligence tests in general will be more likely endangered by the person making the experiment than by the animal” (p. 265)100. Since ravens’ and other corvids’ social life is highly competitive101, all aspects of their cognitive abilities have likely been shaped by the need to out-compete conspecifics in general. It thus may be possible that our ravens’ performance in the scale Space, and in all other physical cognitive scales, was overshadowed by a social component with the ravens perceiving the experimenter as a competitor for the food reward. These findings may add a new aspect to proposals suggesting to integrate a competitive component into experimental designs71,102.

In contrast to our ravens’ performance, however, the parrots tested by Krasheninnikova and colleagues56 performed at chance level across all three physical and all three social cognitive scales. These results are in stark contrast to previous findings on parrots’ remarkable cognitive capacities (see for reviews49,103). They also emphasize Tinbergen’s notion that the same test for a different species may therefore not be the same test104. Furthermore, differences in test performance between individuals of the parrot and our study may also be due to differences in socialization such as hand-raising, habituation and training procedures, and social bond strength between the birds and the experimenters (see also77,105). For instance, the birds in the present study were tested by two highly familiar people who had also hand-raised them. In contrast, tests in the study of Krasheninnikova and colleagues56 had been conducted by ten familiar experimenters, who had not hand-raised them, and four unfamiliar assistants. Hence, future studies should investigate the impact of these factors on cognitive performance in more detail to minimize possible counterproductive effects. In addition, analyses of why species fail in certain tests in combination with informed accounts of their ecological and social validity will aid in getting a better understanding of whether distinct tasks are too easy or too difficult for a given species to be solved18,89,102.

Furthermore, it is certainly an issue that the test battery was constructed and administered by humans10, influencing cognitive performance of our ravens overall. For instance, Schloegl and colleagues77 investigated the ontogeny of gaze following in ravens by using observations of spontaneously occurring gaze following behaviour between conspecifics and controlled experiments involving human experimenters. They found that visual co-orientation with conspecifics emerged around eight weeks of age, while gaze following behaviour to human-given cues could only be observed seven weeks later. Schloegl and colleagues82 suggested that human models may not be capable of providing the same stimulus quality as a conspecific due to emphasizing different aspects for eliciting gaze following behaviour. In contrast, Heinrich47 suggested that there is something unique about ravens that permits an uncanny closeness to develop with humans, thereby allowing insights in skills that could otherwise never be discovered.

Taken together, the present experiments provide evidence that our ravens’ experimental performance was on par with that of adult great apes in similar tasks. They thus strengthen the idea that ravens evolved a general and flexible neural system for higher cognition36,106 rather than being highly specialized in a few domains only107. Yet, we do not claim that the cognitive abilities of ravens and great apes are generally similar since similarity at the behavioural level does not need to reflect the same underlying cognitive mechanisms50. This may be particularly true for complex cognitive abilities such as tool use, cooperation, or referential signalling that involve different cognitive building blocks36. For example, referential signalling may involve aspects of learning, memory, empathy, and theory of mind, but the degree to which each of these abilities is involved and has advanced may differ between species and taxonomic groups46,108,109. In addition, it may also be the case that the cognitive competencies in the items tested in the PCTB simply did not differ substantially18. Furthermore, proponents of situated cognition argue that cognition reaches beyond the brain and tackle the relation between cognitive processes, on the one hand, and their neuronal, bodily, and worldly basis, on the other (for a review see110). This means that choices made via non-homologous body parts, namely beaks (ravens), hands (great apes), and eyes (ravens) combining panoramic sight with excellent stereoscopic vision111, not only involve different effectors but also different processors possibly influencing cognitive processing and output.

In addition, we do not claim that the cognitive performance of our eight ravens can be generalized to the species as a whole and corvids in general. For instance, some random effects seem to have influenced task performance suggesting to pay special attention in future studies to personality, task-performance across age and thus ontogeny of test-subjects (see e.g.112). Hence, the present study may pave the way to future collaborative studies and data sharing across research labs encouraging a ManyBirds project (see for related efforts113,114). It may thus aid in 1) tackling one of the biggest obstacles in Animal Cognition research, to obtain sufficient sample sizes, and 2) improving and adapting distinct tasks of test-batteries to better implement and mimic the ecology of the respective model species (see also115,116). Therefore, future studies should expand the range of investigated skills in a given test-battery beyond social interactions with humans and foraging contexts, and situate the findings within a comparative evolutionary framework (see also95,96,116). Furthermore, we hope to inspire more research into the impact of ontogeny on cognitive performance, which, although constituting one of Tinbergen’s four why’s, is especially lagging behind in studies of Animal Cognition117,118. 

Results are therefore inconsistent with claims in the literature that rats are altruistically motivated to share food with other rats, even when food is abundant

Failure to Find Altruistic Food Sharing in Rats. Haoran Wan, Cyrus Kirkman, Greg Jensen and Timothy D. Hackenberg. Front. Psychol., June 22 2021. https://doi.org/10.3389/fpsyg.2021.696025

Abstract: Prior research has found that one rat will release a second rat from a restraint in the presence of food, thereby allowing that second rat access to food. Such behavior, clearly beneficial to the second rat and costly to the first, has been interpreted as altruistic. Because clear demonstrations of altruism in rats are rare, such findings deserve a careful look. The present study aimed to replicate this finding, but with more systematic methods to examine whether, and under what conditions, a rat might share food with its cagemate partner. Rats were given repeated choices between high-valued food (sucrose pellets) and 30-s social access to a familiar rat, with the (a) food size (number of food pellets per response), and (b) food motivation (extra-session access to food) varied across conditions. Rats responded consistently for both food and social interaction, but at different levels and with different sensitivity to the food-access manipulations. Food production and consumption was high when food motivation was also high (food restriction) but substantially lower when food motivation was low (unlimited food access). Social release occurred at moderate levels, unaffected by the food-based manipulations. When food was abundant and food motivation low, the rats chose food and social options about equally often, but sharing (food left unconsumed prior to social release) occurred at low levels across sessions and conditions. Even under conditions of low food motivation, sharing occurred on only 1% of the sharing opportunities. The results are therefore inconsistent with claims in the literature that rats are altruistically motivated to share food with other rats.

Discussion

The present experiment was designed to replicate and extend some key conditions described by Ben-Ami Bartal et al. (2011), in which rats chose between social release and food. The present research focused on two main findings from that study and their related conclusions: (1) rats chose food and social release with similar latencies, and therefore, food and social release are equally valued; and (2) rats willingly share food with their social partner, even if it comes at a cost to the individual. Taken together, these findings provide key support for the authors' claims of altruistic food sharing. Because occurrences of such unreciprocated food sharing are rare in the published literature (Clutton-Brock, 2009; Taborsky et al., 2016), they warrant further scrutiny.

With respect to the first claim of equal reward value of social release and food, we found that relative value of food and social release varied systematically across conditions. More specifically, when food motivation was low (i.e., the focal rat had unrestricted access to chow in its home cage) and food quantity was high (4–5 pellets per trial), food and social release were chosen about equally often (Conditions 4, 6, and 7), consistent with the Ben-Ami Bartal et al. (2011) findings. When food motivation was high (restricted access to food outside the session), however, rats clearly preferred food over social release (Conditions 1–3). This finding is consistent with the Hiura et al. (2018) findings, showing strong and reliable preference for food over social release when food is restricted outside the session (see also Blystad et al., 2019). Taken as a whole, the present results show that relative preference between social and food is not invariant, but rather, is subject to reward and motivational variables (food quantity and overall food access). The relative value of social release and food is always subject to these (and other) variables, and it would therefore be premature to draw broad conclusions about their relative value from sampling only a limited range of conditions. In any case, a motivational view of social and food rewards helps explain discrepant findings from prior research.

The changes in preference across manipulation of food quantity in the first three conditions were driven mainly by changes in the number of food choices per session. This is partly due to economic factors (i.e., decreasing unit price of food) and partly due to satiation. Given the low price (1 response) and the dozens of choice opportunities each session, rats produced and consumed large numbers of sucrose pellets each session when chow in their home cage was restricted (37–284 pellets, mean = 131 across rats). By contrast, when home cage chow was unlimited and food motivation was low, subjects consumed substantially fewer pellets (16–107 pellets, mean = 66 across rats). And when coupled with unlimited food access outside the session in Condition 4, the procedures combined to produce conditions of low food need. Indeed, our rats had such an abundance of food, there was often food left at the end of the food collection periods of Conditions 5–7 (up to 26% of all pellets in Condition 5), even if there was no restrained rat with which to share them. That rats did not consume rewards as highly valued as sucrose pellets suggests a high degree of satiation.

Despite such low levels of food need, there was very little evidence of food sharing – the second and more controversial claim set forth by Ben-Ami Bartal et al. (2011). Behavior that met our operational definition of sharing (i.e., producing food and then releasing the rat while food remained available) was infrequent across all conditions in the experiment, with zero shared pellets being the most common outcome across sessions and the mean being about 1 pellet per session. It did not matter whether food access outside the session was restricted (Conditions 1–3) or not (Conditions 4–7); nor did it matter how many pellets were produced per response (Conditions 1–3): rats rarely shared with the other rat any of the abundant supply of food pellets they produced each session. Even in the final two conditions, with procedures that most closely matched the original study (i.e., symmetrically arranged social and food locations, 5 sucrose pellets, and unrestricted access to food and social contact outside the session), sharing was seldom observed (see also Supplementary Video 1). Thus, on the whole, we found no evidence to support the 2011 claim by Ben-Ami Bartal et al. that a rat willingly shares food with another rat.

There is no simple way to reconcile the food sharing reported by Ben-Ami Bartal et al. (2011) with the near complete absence of sharing in the present study. Low levels of food sharing cannot be explained in terms of reduced opportunities for sharing, as the number of social releases (hence, sharing opportunities) remained fairly constant across conditions for individual rats (see Figures 3–6). This was accomplished by providing repeated exposure to a consistent duration of social contact (30 s) across the experiment. With long sessions and repeated trials, rats had ample opportunities to share the food they had produced; they simply did not do so. The discrepant results also cannot be explained in terms of differing definitions of sharing between experiments. Ben-Ami Bartal et al. (2011) used a less stringent indirect measure of sharing (difference between food consumed with and without a rat available to release) than our behavioral definition of sharing (produce food, then social release with food remaining). This alone cannot be responsible for the different results, however, for even if we adopt the less stringent criterion, our rats showed no differences in food consumption with or without a rat available to release (Figure 7). This is important, as evidence of sharing-related costs is crucial to an altruistic food sharing explanation. Thus, by neither definition did our rats engage in sharing.

We recognize that the present study relies on a small sample size of three rats. Even so, the evidence against food sharing is strong. Across all of the conditions in which sharing was possible, our rats earned an average of 2,171 rewards each (1,662 to 2,772 across rats) of which they shared an average of 47 (0 to 85, across rats), or 1% of the total rewards earned. Because each earned reward provided a sharing opportunity, our rats had vastly greater food sharing opportunities than rats in the Ben-Ami Bartal et al. (2011) experiment. Precise estimates of food sharing opportunities in that experiment are difficult, both because opportunity per reward cannot be derived from the data presented in the paper (% trials with sharing), and because it took the rats several sessions to learn to open the door for either option (before which rewards were not actually available to share). Nonetheless, the theoretical maximum would be 60 food sharing opportunities (five rewards per trial for 12 trials) per rat, roughly 1% of the number of food-sharing opportunities in the present study. In addition to vast opportunities for food sharing, the present procedures produced consistent patterns of preferences across animals and over consecutive sessions. Thus, while the small number of rats in the present study limits our ability to generalize to the population of all rats, we have considerable confidence in the results with these particular rats: all were strongly disinclined to share food with their partners across all conditions and thousands of sharing opportunities. It is possible that only some rats engage in altruistic food sharing, differing from their non-sharing conspecifics for reasons yet to be discovered, and that our sample happened to include only selfish rats that happen to be selfish in very similar ways. This seems unlikely, but it will nevertheless be important to replicate with larger samples of rats in future research.

Another difference between the studies is the food itself. Ben-Ami Bartal and colleagues used a single presentation of 5 chocolate chips, which amounts to approximately 12 calories of food, contained in about 1.62 g. By comparison, each of our sucrose pellets constitutes approximately 0.17 calories, each with a mass of 0.045 g. When subjects had unlimited home cage access to chow, they therefore tended to consume about 11.2 calories of food. Furthermore, rats in Condition 5 (with no opportunity for social access) left about 6 pellets (worth about 1 calorie) behind, despite having no external motivation to do so. This points to subjects with free access to chow leaving high-quality food unconsumed due to satiation, usually doing so shy of 12 calories. If rats with low food motivation are inclined to leave food unconsumed relatively frequently in the absence of conspecifics, it is difficult to argue that losing such food due to sharing can be understood as a “cost.” Ben-Ami Bartal and colleagues give no rationale for their choice of 5 chocolate chips, but based on the patterns of non-social food intake observed in the present study, it seems likely that, had they used 3 chocolate chips, they would have observed almost no sharing, whereas if they had used 7 chocolate chips, they would have observed relatively frequent sharing.
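The calorie comparison above can be checked with a short calculation. The sketch below assumes that the "about 11.2 calories" figure derives from the mean of 66 pellets per session reported earlier for rats with unlimited home cage chow; that link is an inference, not something the authors state. All variable and function names are illustrative only.

```python
# Figures quoted in the text; names are hypothetical, chosen for illustration.
KCAL_PER_PELLET = 0.17        # approximate calories per sucrose pellet
MEAN_PELLETS_FREE_CHOW = 66   # mean pellets per session with unlimited chow (assumed source of 11.2)
PELLETS_LEFT_COND5 = 6        # approximate pellets left uneaten in Condition 5
KCAL_FIVE_CHIPS = 12          # approximate calories in 5 chocolate chips


def pellets_to_kcal(n_pellets):
    """Convert a pellet count to approximate calories."""
    return n_pellets * KCAL_PER_PELLET


consumed_kcal = pellets_to_kcal(MEAN_PELLETS_FREE_CHOW)
left_kcal = pellets_to_kcal(PELLETS_LEFT_COND5)

print("consumed per session:", round(consumed_kcal, 1), "calories")
print("left behind in Condition 5:", round(left_kcal, 1), "calories")
print("below the 5-chip total:", consumed_kcal < KCAL_FIVE_CHIPS)
```

Under these assumptions, intake with free chow falls just short of the 12 calories contained in 5 chocolate chips, which is the basis for the suggestion that the amount of food offered may determine how much is left over.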

There are other differences between the procedures, and the only way to know for certain which factors are responsible for the discrepant results would be to begin with a direct replication, an exact reproduction of the original procedures, and thereafter change one variable at a time. We chose instead to conduct a systematic replication (Sidman, 1960), in which some, but not all, of the original procedures are reproduced. Systematic replications are useful in assessing the generality of a finding, and this fit with our broader objectives of providing a more thorough characterization of preference and sharing. We sought not only to replicate but to extend, to assess the generality of the findings by exploring behavior across a range of conditions, including but not limited to, those of the original study. In particular, the lack of adequate control conditions leaves the original study open to multiple interpretations. Sampling independent variables under varying conditions puts replication efforts into a broader context, changing the focus from binary questions with yes-no answers (e.g., Do rats value social release over food? Do rats share food with another rat?) toward conditional questions (e.g., Under what conditions is social release favored over food, and vice versa? Under what conditions does sharing occur?). Viewed in this way, Ben-Ami Bartal and collaborators are not so much incorrect as they are interpreting incomplete evidence; their results are part of more general relationships between preference and sharing and the variables of which they are a function.

Exploring such functional relationships across a parametric range can also shed light on theoretical disputes. For example, when examined at only a single point on a function, social release can be interpreted either in terms of social reward (response-contingent access to social interaction) or in terms of empathy (acting out of concern for the other rat): both accounts make the same prediction that door opening will occur. The accounts begin to differ, however, as behavior is examined while other experimental parameters change. For example, in procedures similar to those used here, Vanderhooft et al. (2019) first trained social release in rats, then systematically increased the price of social release (number of responses to produce it) across sessions, generating demand functions. Overall, the functions (27 in all) were well-described by the Hursh and Silberberg (2008) essential value model, a model that has proven useful in quantifying the value of numerous other rewards, including food, water, and drugs (Hursh and Roma, 2016). In other words, rates of social release behavior were predictable, with a high degree of quantitative precision, on the basis of these social reward functions. It is less clear, however, what, if anything, an empathy account would have to say about these data: it makes no obvious predictions about how empathy is affected by price – or other variables known to affect reward value (e.g., magnitude, delay, or probability), about which social reward makes clear and testable predictions. And if predictions could be derived from an empathy account (e.g., by assuming that empathy mirrors social reward functions), they would be indistinguishable from the more parsimonious social reward account, and would therefore add little to the explanation. This is not to deny the importance of empathy as a topic worthy of scientific study; it is, rather, to demand more stringent tests of it, especially in domains in which simpler explanations already exist.
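For readers unfamiliar with demand functions, here is a minimal sketch of the exponential demand equation from Hursh and Silberberg (2008), log10 Q = log10 Q0 + k(e^(-alpha * Q0 * C) - 1), where Q is consumption at price C, Q0 is consumption at zero price, k sets the range of consumption in log units, and alpha is the rate of decline (lower alpha means higher essential value). The parameter values below are hypothetical and are not taken from Vanderhooft et al. (2019); the function name is likewise illustrative.

```python
import math


def exponential_demand(price, q0, alpha, k):
    """Predicted consumption at a given price under the exponential demand model.

    price: responses required per reward (C)
    q0:    consumption at zero price (Q0)
    alpha: rate of decline in consumption; smaller means higher essential value
    k:     range of consumption in log10 units
    """
    log_q = math.log10(q0) + k * (math.exp(-alpha * q0 * price) - 1.0)
    return 10 ** log_q


# Hypothetical parameters, chosen only to show the shape of the curve.
Q0, ALPHA, K = 20.0, 0.005, 2.0
prices = [0, 1, 2, 4, 8, 16, 32]
curve = [exponential_demand(p, Q0, ALPHA, K) for p in prices]

for p, q in zip(prices, curve):
    print(f"price={p:>2}  predicted releases={q:.2f}")
```

Fitting alpha to observed release counts at each price is what allows social release to be compared on a common scale with food, water, or drug rewards.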

Learning coherence is likely to emerge in individuals and triads, but not in dyads; this coherence in turn leads to higher performance

Harada T (2021) Three heads are better than two: Comparing learning properties and performances across individuals, dyads, and triads through a computational approach. PLoS ONE 16(6): e0252122. https://doi.org/10.1371/journal.pone.0252122

Abstract: Although it is considered that two heads are better than one, related studies argued that groups rarely outperform their best members. This study examined not only whether two heads are better than one but also whether three heads are better than two or one in the context of two-armed bandit problems where learning plays an instrumental role in achieving high performance. This research revealed that a U-shaped correlation exists between performance and group size. The performance was highest for either individuals or triads, but the lowest for dyads. Moreover, this study estimated learning properties and determined that high inverse temperature (exploitation) accounted for high performance. In particular, it was shown that group effects regarding the inverse temperatures in dyads did not generate higher values to surpass the averages of their two group members. In contrast, triads gave rise to higher values of the inverse temperatures than the averages of their individual group members. These results were consistent with our proposed hypothesis that learning coherence is likely to emerge in individuals and triads, but not in dyads, which in turn leads to higher performance. This hypothesis is based on the classical argument by Simmel stating that while dyads are likely to involve more emotion and generate greater variability, triads are the smallest structure which tends to constrain emotions, reduce individuality, and generate behavioral convergences or uniformity because of the “two against one” social pressures. As a result, three heads or one head were better than two in our study.


Discussion

One of the interesting findings in our study was that a relationship between performance and group size was validated to be U-shaped. As the regression analysis revealed, the causes for this performance difference could be attributed to higher values of the inverse temperatures β in both models. In dyads, group effects regarding the inverse temperatures in both models did not generate higher values to surpass their averages, which might lead to lower performance. In contrast, triads gave rise to higher values of the inverse temperatures than their averages of group members. These differences are responsible for the U-shaped relationship in performance. Although the model selection tests did not differentiate between the simple and asymmetric Q learning models, both shared the same results that the inverse temperature β accounted for higher performance. Thus, our results are robust to model specifications.
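
As a rough illustration of what the inverse temperature β does in a two-armed bandit, the sketch below pairs a softmax choice rule with a Q-value update. It is a generic textbook form, not the authors' fitted model; the function names and the way the asymmetric variant is expressed (separate learning rates for positive and negative prediction errors) are assumptions made for this example.

```python
import math


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing arm A given the two Q-values.

    beta is the inverse temperature: beta = 0 gives random choice
    (pure exploration); larger beta favors the higher-valued arm
    more strongly (exploitation).
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def q_update(q, reward, alpha_pos, alpha_neg=None):
    """One Q-learning step for the chosen arm.

    With alpha_neg left as None the update is the simple (symmetric) model.
    Passing a different alpha_neg gives an asymmetric variant in which
    positive and negative prediction errors are weighted differently.
    """
    if alpha_neg is None:
        alpha_neg = alpha_pos
    delta = reward - q  # reward prediction error
    rate = alpha_pos if delta >= 0 else alpha_neg
    return q + rate * delta
```

For example, with Q-values of 1.0 and 0.0, `softmax_choice_prob` returns 0.5 when `beta=0` and moves toward 1 as `beta` grows, which is the sense in which a higher β corresponds to more exploitation.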

At individual levels, participants were more likely to perform the two-armed bandit game in an exploratory manner because their inverse temperatures were lower relative to dyads and triads. The emphasis on exploration at individual levels indicates that rationality in terms of exploitation in the framework of the underlying learning model increased as more group members were added to the group decision-making processes. To achieve agreement in groups, logical reasoning and persuasion based on rational calculation would be required instead of exploration. Yet, in dyads, this increase in exploitation was not sufficient to make it significantly different from individuals. Indeed, group effects could not generate higher values of β than the averages of the group members. It could be inferred that dyads encountered learning incoherence, leading to smaller group effects regarding the inverse temperature.

According to Simmel [59], in dyads, social interaction is more personal, involving more affect or emotion, and generates greater variability. The negative aspect of social interaction seemed to appear in dyads in our experiments. On the other hand, Simmel [59] argued that triads are the smallest structure that tends to constrain emotions, reduce individuality, and generate behavioral convergences or uniformity because of the “two against one” social pressures. These forces form the basis for uniformity, emergent norms, and cohesion [60]. Consequently, while dyads failed to improve the inverse temperature beyond its average as a result of affective or emotional influences, the smallest social structure, in the form of a triad, improved efficiency due to social pressures and more exploitation. This is also consistent with the theoretical hypothesis in S1 Appendix where dyads are likely to adopt more randomized learning strategies, whereas individuals and triads adopt coherent learning strategies. Although individuals might use more exploratory behaviors, exploration itself is one of the coherent learning strategies. Hence, our empirical results support our hypothesis that learning incoherence takes place in dyads but not in triads.

Notably, the positivity biases were confirmed for individuals and triads, but no such learning biases existed for dyads. As related studies indicated [50–55], learning biases are more likely in such learning situations. This result provides further evidence of learning coherence in individuals and triads and learning incoherence in dyads.

Apart from this main result, the fact that group parameters achieved higher values than the means of their individual members for most of the learning parameters deserves some attention in its own right. Not only triads, but also dyads, had these positive effects. Future studies should explore these group effects in more detail.

However, our findings are subject to several limitations. First, the results critically depend on the tasks that the groups perform and the learning situations where the TAB games are played. Different game settings could lead to different results. Second, learning properties could change over time through learning; therefore, their reliability might be subject to some limitations. Performance probably changed as participants undertook more TAB games, because of the stochastic nature of the rewards. However, it could be conjectured that participants' learning strategies tend to be relatively stable because participants could not fully detect the stochastic environments (i.e., which options are more likely to generate higher rewards), as the probability of obtaining higher gains was changed twice during the 100 trials. Hence, it seems that participants were less likely to change their learning strategies even when they undertook the TAB several times. This justifies the use of learning properties in this study. Nevertheless, the reliability of learning properties should be tested in a future study.

Third, although this study used a relatively large sample, different results could be found in different samples, in particular, in different cultural contexts. For example, Shen et al. [61] noted that, when examining the effects of risk-taking on convergent thinking, they found that risk-taking was negatively associated with convergent thinking in China, but these correlations were close to zero or negative in the Netherlands. Thus, cultural effects could alter the learning strategies in the TAB, and hence, the effects of group dynamics on group performance.

Despite these limitations, the findings in this study deserve some attention because previous studies did not evaluate and examine the effects of group dynamics in terms of learning properties. Moreover, the results are intuitive and consistent with the simple hypothesis that the U-shaped relationship with respect to performance emerged due to the coherence of learning strategies. Even though these results might not be supported in different experimental settings, our computational approach could still be applied and is expected to generate new results. Thus, the contribution of this study would be more methodological. This study encourages future research that examines the learning mechanism of group dynamics, according to the computational approach suggested in this study.

Does a 7-day restriction on the use of social media improve cognitive functioning and emotional well-being? Results from a randomized controlled trial show no benefits from a severe screen-time reduction

Does a 7-day restriction on the use of social media improve cognitive functioning and emotional well-being? Results from a randomized controlled trial. Marloes M.C. van Wezel, Elger L. Abrahamse, Mariek M. P. Van den Abeele. Addictive Behaviors Reports, June 15 2021, 100365. https://doi.org/10.1016/j.abrep.2021.100365

Highlights

• We compared a 10% vs. 50% reduction in social media screen time in a RCT.

• The intervention had no effect on multiple indicators of attention and wellbeing.

• Self-control, impulsivity and FoMO did not moderate the relationships.

• Participants reported improved attention, but behavioral attention did not improve.

• Overall, a more severe screen time reduction intervention does not appear more beneficial.

Abstract

Introduction: Screen time apps that allow smartphone users to manage their screen time are assumed to combat negative effects of smartphone use. This study explores whether a social media restriction, implemented via screen time apps, has a positive effect on emotional well-being and sustained attention performance.

Methods: A randomized controlled trial (N= 76) was performed, exploring whether a week-long 50% reduction in time spent on mobile Facebook, Instagram, Snapchat and YouTube is beneficial to attentional performance and well-being as compared to a 10% reduction.

Results: Unexpectedly, several participants in the control group pro-actively reduced their screen time significantly beyond the intended 10%, dismantling our intended screen time manipulation. Hence, we analyzed both the effect of the original manipulation (i.e. treatment-as-intended), and the effect of participants’ relative reduction in screen time irrespective of their condition (i.e. treatment-as-is). Neither analysis revealed an effect on the outcome measures. We also found no support for a moderating role of self-control, impulsivity or Fear of Missing Out. Interestingly, across all participants behavioral performance on sustained attention tasks remained stable over time, while perceived attentional performance improved. Participants also self-reported a decrease in negative emotions, but no increase in positive emotions.

Conclusion: We discuss the implications of our findings in light of recent debates about the impact of screen time and formulate suggestions for future research based on important limitations of the current study, revolving among others around appropriate control groups as well as the combined use of both subjective and objective (i.e., behavioral) measures.

Keywords: screen time; screen time intervention; sustained attention; cognitive performance; emotional well-being; self-report bias

4. Discussion

In the past decade, we have witnessed an increase in studies focusing on the complex associations between the use of the smartphone and its (mobile) social media apps on the one hand, and attentional functioning (Rosen et al., 2013; Judd, 2014; Kushlev et al., 2016; Ward et al., 2017; Wei et al., 2012; Fitz et al., 2019; Marty-Dugas et al., 2018) as well as emotional well-being (Twenge and Campbell, 2019; Twenge et al., 2018; Twenge and Campbell, 2018; Escobar-Viera et al., 2018; Brailovskaia et al., 2020; Tromholt, 2016; Stieger and Lewetz, 2018; Aalbers et al., 2019; Frison and Eggermont, 2017) on the other hand. While research in this field is not without criticism, among others for its over-reliance on self-report data and cross-sectional survey methodologies, the concerns over the potential harm of mobile social media use have nonetheless given impetus to the development of screen time apps that can help people to protect themselves from harm by restricting their social media use. The current study explored the effects of such a social media screen time restriction on sustained attention and emotional well-being.

The findings show that, first of all, the intervention did not have the intended effect. Specifically, we implemented a 50% restriction in social media screen time for an experimental group, and compared this to a control group with a 10% restriction. Yet, this screen time manipulation failed mostly because participants in the control group reduced their social media app use by 38% on average, which was much more than the intended 10%. We deliberately opted to not include a 0% reduction control group in our design, in order to avoid Hawthorne(-like) effects (cf. Taylor, 2004; McCambridge et al., 2014) – that is, in order to give the control group participants, too, a full-blown sense of being involved in an experiment. The current finding that a non-zero percent reduction for a control group may trigger additional – and more problematic – side effects than the Hawthorne(-like) effects that we aimed to prevent with it, is an interesting finding in itself. It provides clear suggestions for optimal implementation of control groups in intervention studies of the current type, and deserves to be followed up as a target of investigation in itself. Indeed, some participants indicated that they felt uncomfortable when encountering a time limit. It is conceivable that participants reduced their screen time more than they needed to in order to avoid that situation. Alternatively, the failed manipulation may be due to a placebo effect (cf. Stewart-Williams & Podd, 2004). In this case, the mere expectation of receiving a social media reduction may have sufficed in promoting behavior change in the form of reduced social media use. Similar placebo effects were found in marketing research (Irmak, Block, & Fitzsimons, 2005).

To deal with the failed screen time manipulation, we provided analyses both for treatment-as-intended and treatment-as-is, with the latter set of analyses disregarding the intervention conditions and instead exploring linear associations between the degree of relative screen time reduction and the outcome measures, based on the data we obtained. Interestingly, neither analysis revealed a noticeable effect on the outcome measures. This finding suggests an alternative explanation for the lack of findings, namely that there may not be any negative association between social media screen time and the outcome measures to begin with. Indeed, the pre-test data – which are unaffected by the failed screen time manipulation – did not show any of the hypothesized correlations between social media screen time, emotional well-being and attentional performance. On the contrary, the only relationships found between social media screen time and the outcome measures ran counter to what one might expect: Heavier social media users reported experiencing fewer attentional lapses and negative emotions. The lack of any negative association between social media screen time and the outcome measures may explain why reducing this screen time has no causal impact: If social media screen time does not affect these outcomes much, altering it is unlikely to cause much change in them.
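
The treatment-as-is idea can be sketched in a few lines: compute each participant's relative screen time reduction irrespective of condition, then look at its linear association with the change in an outcome. This is only an illustration of the general approach under simplifying assumptions; the function names are hypothetical, a plain Pearson correlation stands in for whatever models the authors actually fitted, and no data from the study are used.

```python
import math


def relative_reduction(baseline_minutes, intervention_minutes):
    """Fraction by which screen time dropped relative to baseline.

    0.5 means a 50% reduction; negative values mean screen time increased.
    """
    return (baseline_minutes - intervention_minutes) / baseline_minutes


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In use, one would build a list of `relative_reduction` values (one per participant) and a matching list of post-test minus pre-test outcome scores, then pass both to `pearson_r`; a coefficient near zero would correspond to the absence of a linear association described above.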

This finding is interesting in light of recent debates in the field over the validity of screen time studies. A recurring concern voiced in these debates is that self-report measures of screen time are flawed to such an extent that their use can lead to biased interpretations (Kaye et al., 2020; Sewall et al., 2020). A key strength of the current study is that we used a behavioral measure of screen time. The fact that this measure shows no relationship to either cognitive performance or emotional well-being calls into question the ‘moral panic’ over social media screen time (Orben, 2020).

An alternative explanation that should be mentioned here is that, despite the randomization of participants, the control and experimental group were not fully equivalent in terms of their smartphone behavior in the week prior to the experiment. The control group appeared to consist of heavier Instagram users whereas the experimental group consisted of heavier WhatsApp users. It is conceivable that this non-equivalence has had some influence on our findings. After all, for the light Instagram users in the experimental group, a 50% reduction in Instagram use may not have been very impactful, whereas for the heavy Instagram users in the control group, the actually enforced relative reduction of 35% may have had a more profound impact, thus leveling out any difference between the two groups. Future researchers thus need to carefully consider their experimental procedures to maximize the chances of equivalence between conditions.

While we believe that a strength of our current study is the use of actual smartphone data and performance-based measures of attention, the paucity of such measures in previous work prevented us from conducting an appropriate a priori power analysis, resulting in a sample size that may have been too small – as indeed indicated by, for example, the accidental but significant differences between conditions in terms of their baseline app use (see above). We hope that our study can serve that purpose in the future.

While the manipulation did not produce an effect, the findings of our study did show that – regardless of the condition they were in – people reported experiencing fewer cognitive errors and attentional lapses at the post-test. This is interesting, given that their actual attentional performances did not improve. Again, these findings are interesting in light of the recent debates over the use of self-report measures in research on the associations between screen time and psychological functioning. Recent studies show that the use of self-report measures leads to an artificial inflation of effect sizes of these associations (Sewall et al., 2020; Shaw et al., 2020), that self-reports of especially smartphone use are inaccurate (Boase and Ling, 2013; Ellis et al., 2019; Vanden Abeele et al., 2013), and that the discrepancies between self-reported and behavioral measures of smartphone use are themselves correlated with psychosocial functioning (Sewall et al., 2020). The mixed findings in research on the effects of screen time have led to a call for greater conceptual and methodological thoroughness (e.g., Whitlock and Masur, 2019; Kaye et al., 2020; Sewall et al., 2020; Shaw et al., 2020), with a specific call to prioritize behavioral measures over self-report measures. The discrepancy between the behavioral and self-report attention measures may be an artifact of this shortcoming of self-report methodology.

The null results for FoMO, self-control and impulsivity as moderators should be elaborated on here. It was expected that a screen time intervention would negatively impact the emotional well-being of individuals, especially those high on FoMO, since reduced social media screen time also reduces the possibility to stay up-to-date. However, our results could not corroborate this notion. Several authors have suggested that rather than being a predictor of social media use, FoMO may be a consequence of such online behavior (e.g., Alutaybi, Al-thani, McAlaney & Ali, 2020; Buglass, Binder, Bets, & Underwood, 2017; Hunt et al., 2018). In the three-week intervention study of Hunt et al. (2018) for example, reduced social media use actually reduced feelings of FoMO. With our data, we could test this possibility. Hence, we executed a repeated measures ANOVA with FoMO as within-subjects factor and condition as between-subjects factor. This analysis revealed that the intervention had no significant effect on experienced FoMO (i.e., the experimental group did not experience larger changes in FoMO than the control group: F(1,74) = 0.09, p = .762). However, there was an effect of time on FoMO: at the post-test, FoMO was significantly lower than at the pre-test (Mdif = 0.18, F(1,74) = 6.65, p = .012). Perhaps this is indicative of an “intervention effect”, since our manipulation had failed and all participants significantly reduced their social media use during the intervention week.

Also, an overall finding of this study, which aligns with what prior research has found, was that participants were not able to estimate their screen time accurately: While participants’ actual screen time decreased during the intervention week, their self-reported screen time did not differ over time. Interestingly, participants did report a decrease in habitual use and problematic use. This may suggest that people have a vague sense of their behavior (“I reduced my smartphone use”), but are unable to convert this adequately into numbers such as screen time in minutes. Alternatively, participants may have provided a socially desirable answer. In either case, our findings aligned with both recent and older studies showing that subjective screen time measures deviate from objective measures (e.g., Andrews et al., 2015; Boase and Ling, 2013; Vanden Abeele et al., 2013; Verbeij et al., 2021).

4.1. Limitations and Future Directions

This study is among the first to examine the effectiveness of a social media screen time reduction on sustained attention and emotional well-being. One of its strengths is the inclusion of behavioral measures, both for screen time and for sustained attention. The study is not without limitations, however. A number of methodological choices were made that significantly limit comparability with other findings in the field. The lack of a true control group (in which no intervention was implemented) and the limited sample size are major limitations to the current study. Future research should include more participants and should consider the use of a true control group, in which no intervention is implemented. Moreover, future research might look at different degrees of screen time reductions, ranging from no reduction to complete abstinence, to better address to what extent the magnitude of the restriction matters. In addition, future work ought to consider how to account for individuals’ unique smartphone app repertoires. For instance, some individuals in our study were super users of mobile games rather than of social media. While this may lower generalizability, researchers might account for unique app repertoires by setting time restrictions on an individual’s top 5 apps, or on total screen time. Also, a one-week intervention is short. It is likely that a longer intervention is needed to produce an effect on the outcomes examined. Overall, a general observation that we make is that future research on screen time interventions needs to carefully question and compare (1) which types of interventions affect (2) which outcomes, (3) for whom and (4) under which conditions, and (5) because of which theoretical mechanisms.

An additional limitation is that, although they were kept blind about which condition they were in, participants were informed about what the experiment was about because willingness to set a restriction to one’s screen time was an important eligibility criterion; installing such a timer without the participants’ informed consent was deemed unethical. Given that the timers were installed on participants’ personal phones, it was easy for participants to look up what restriction was enforced on them. Future research might explore if participants can be kept blind. Perhaps this can be attained via the development of a screen time app tailored to this purpose. Notably, even though we found no increase in the use of social media on alternative devices, it should be acknowledged that social media can be accessed from other devices than smartphones alone, something that could be accounted for in future work. In this context, it is relevant to mention Meier and Reinecke’s (2020) taxonomy of computer-mediated communication. Meier and Reinecke advise researchers to carefully consider which level of analysis they are focusing on, most notably that of the device (i.e., a ‘channel-based’ approach) versus that of the functionality or interaction one has through the device. Decisions regarding the level of analysis are typically grounded in theoretical assumptions about the mechanisms explaining effects. We consider this observation relevant to researchers studying ‘digital detoxes’ or screen time interventions, as they similarly have to consider what it is exactly that they want participants to ‘detox’ from: the device, a particular app or functionality, or a type of interaction. Careful consideration of this issue is important, as it may be key to understanding why the extant research shows mixed evidence. In the current study we attempted to address the type of interaction people have with social media, targeting especially ‘passive social media use’ by enforcing only a partial restriction, but we only focused on mobile social media. Future researchers may wish to consider more explicitly their level of analysis and how to operationalize it in an intervention.

Finally, as other research also shows (e.g., Ohme, Araujo, de Vreese, & Piotrowski, 2020), research designs that include behavioral measures of smartphone use are both ethically and methodologically challenging. In the current study, we only invited participants to the lab with smartphones running on recent versions of iOS or Android. However, some participants showed up unaware of the operating system of their phone. Others used older versions, on which the screen time monitoring features did not function, or had forgotten to activate the screen time monitoring feature prior to the baseline measurement (which we had also specified as an eligibility criterion). This led to exclusion of several participants. Additionally, in a pilot study of the experiment, we noticed that different phone brands and types use different interfaces to display screen time information. This led to confusion, for instance, over whether the displayed numbers were weekly or daily totals. Hence, to avoid errors, we chose not to let participants record their own screen time but rather explicitly asked participants to hand over their phone to a trained researcher who copied the information into a spreadsheet and installed the timers. Participants who felt uncomfortable with this procedure were invited to closely monitor the researcher, or – if desired – to navigate the interface themselves. Although only a handful of students chose this option, this shows that there are ethical implications to using data donation procedures that researchers have to consider.

To circumvent these issues in future studies, participants could be instructed to install the same app. However, this will increase the demands placed on participants. Participation in studies of this nature is already highly demanding and intensive, since participants have to undergo a multi-day intervention on behavior that is intrinsic to their daily lives, and have to share personal information. Additionally, asking participants to install a specific app that potentially remotely monitors their phone use can raise ethical concerns, especially when using a commercial app that profits from monitoring (and selling) user data.

Overall, it became clear that it is difficult to achieve the required sample size to investigate complex designs of this nature. Nonetheless, the contrasting findings in extant research call for more research on causal relations between social media use on the one hand, and emotional well-being and cognitive functioning on the other hand. This can only be achieved by the use of slow science and large resources.


Check also Reasons for Facebook Usage: Data From 46 Countries. Marta Kowal et al. Front. Psychol., April 30 2020. https://doi.org/10.3389/fpsyg.2020.00711

Sex Differences

Are there sex differences in Facebook usage? According to Clement (2019), 54% of Facebook users identify as women. Research conducted by Lin and Lu (2011; Taiwan) showed that the key factors for men's Facebook usage are “usefulness” and “enjoyment.” Women, on the other hand, appear more susceptible to peer influence. This is consistent with the findings of Muise et al. (2009; Canada), in which longer times spent on Facebook correlated with more frequent episodes of jealousy-related behaviors and feelings of envy among women, but not men. Similarly, in Denti et al. (2012), Swedish women who spent more time on Facebook reported feeling less happy and less content with their life; this relationship was not observed among men.


In general, women tend to have larger Facebook networks (Stefanone et al., 2010; USA), and engage in more Facebook activities than men do (McAndrew and Jeong, 2012; USA; but see Smock et al., 2011; USA, who reported that women use Facebook chat less frequently than men). Another study (Makashvili et al., 2013; Georgia) provided evidence that women exceed men in Facebook usage due to their stronger desire to maintain contact with friends and share photographs, while men more frequently use Facebook to pass time and build new relationships.