The Flynn effect for fluid IQ may not generalize to all ages or ability levels: A population-based study of 10,000 US adolescents. Jonathan M. Platt et al. Intelligence, Volume 77, November–December 2019, 101385. https://doi.org/10.1016/j.intell.2019.101385
Highlights
• When outdated norms are used, the Flynn Effect inflates IQs and potentially biases intellectual disability diagnosis
• In a large US-representative adolescent sample, a Flynn Effect was found for IQs ≥ 130, and a negative effect for IQs ≤ 70
• IQ changes also differed substantially by age group
• A negative Flynn Effect for those with low intellectual ability suggests widening disparities in cognitive ability
• Findings challenge the practice of generalizing IQ trends based on data from non-representative samples
Abstract: Generational changes in IQ (the Flynn Effect) have been extensively researched and debated. Within the US, gains of 3 points per decade have been accepted as consistent across age and ability level, suggesting that tests with outdated norms yield spuriously high IQs. However, findings are generally based on small samples, have not been validated across ability levels, and conflict with reverse effects recently identified in Scandinavia and other countries. Using a well-validated measure of fluid intelligence, we investigated the Flynn Effect by comparing scores normed in 1989 and 2003, among a representative sample of American adolescents ages 13–18 (n = 10,073). Additionally, we examined Flynn Effect variation by age, sex, ability level, parental age, and SES. Adjusted mean IQ differences per decade were calculated using generalized linear models. Overall the Flynn Effect was not significant; however, effects varied substantially by age and ability level. IQs increased 2.3 points at age 13 (95% CI = 2.0, 2.7), but decreased 1.6 points at age 18 (95% CI = −2.1, −1.2). IQs decreased 4.9 points for those with IQ ≤ 70 (95% CI = −4.9, −4.8), but increased 3.5 points among those with IQ ≥ 130 (95% CI = 3.4, 3.6). The Flynn Effect was not meaningfully related to other background variables. Using the largest sample of US adolescent IQs to date, we demonstrate significant heterogeneity in fluid IQ changes over time. Reverse Flynn Effects at age 18 are consistent with previous data, and those with lower ability levels are exhibiting worsening IQ over time. Findings by age and ability level challenge generalizing IQ trends throughout the general population.
Keywords: Intelligence; Flynn effect; Adolescence; Intellectual disabilities
Cool charts and tables at the publisher's link above. Excerpts:
5. Discussion
The present study utilized data from a large US-representative
sample of adolescents to describe changes in IQ between 1989 and
2003. There were three central findings: 1) Overall, there was no evidence
of a Flynn Effect during the study period; 2) however, overall IQ
trends masked substantial heterogeneity in the presence and direction
of the Flynn Effect by both ability level and age; and 3) there was no
variation in the Flynn effect as a function of other sociodemographic
characteristics.
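As a rough illustration of the quantity under discussion, the sketch below computes an unadjusted per-decade Flynn Effect from one test administration scored against both sets of norms. It is a simplification: the study estimated adjusted mean differences with generalized linear models. The function name, the sign convention (older-norm IQ minus newer-norm IQ, so that inflation from outdated norms is positive), and the sample scores are assumptions made for illustration only.

```python
from statistics import mean


def flynn_effect_per_decade(iq_old_norms, iq_new_norms,
                            old_year=1989, new_year=2003):
    """Unadjusted mean IQ difference per decade between two norm sets.

    Each position in the two lists is the same person's test performance,
    scored once against the older norms and once against the newer norms.
    A positive result means the older norms yield higher IQs.
    """
    if not iq_old_norms or len(iq_old_norms) != len(iq_new_norms):
        raise ValueError("score lists must be non-empty and of equal length")
    if new_year <= old_year:
        raise ValueError("new_year must be later than old_year")

    decades = (new_year - old_year) / 10  # 1989 -> 2003 is 1.4 decades
    differences = [old - new for old, new in zip(iq_old_norms, iq_new_norms)]
    return mean(differences) / decades


# Hypothetical scores, not data from the study.
example = flynn_effect_per_decade([103, 101, 99], [100, 98, 96])
print(f"{example:.2f} IQ points per decade")
```

Dividing by 1.4 rather than reporting the raw 14-year difference is what makes the result comparable to the conventional "points per decade" figures cited in the abstract.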
The overall lack of a Flynn Effect in our sample is concordant with
trends in the K-BIT, KBIT-2, the Kaufman Assessment Battery for
Children (K-ABC and KABC-II), and other individually administered
screening tests reported in a previous meta-analysis (Trahan et al.,
2014). It also conforms with the conclusion that gains have decreased
in more recent decades (Pietschnig & Voracek, 2015). However, studies
using other tests (e.g., Wechsler scales) did find substantial Flynn
Effects (Pietschnig & Voracek, 2015; Trahan et al., 2014). Explanations
for the Flynn Effect are diverse. Although genetic explanations focusing
on factors such as hybrid vigor (Mingroni, 2007; Rodgers & Wänström,
2007) have been proposed, environmental explanations predominate
(Dickens & Flynn, 2001), emphasizing societal changes in perinatal
nutrition (Lynn, 2009) and nutrition in general (Colom, Lluis-Font, &
Andrés-Pueyo, 2005), education (Teasdale & Owen, 2005), reduced
number of siblings (Sundet, Borren, & Tambs, 2008), the prevalence of
parasites and the burden of disease (Daniele & Ostuni, 2013; Eppig,
Fincher, & Thornhill, 2010), and increased environmental complexity
(Schooler, 1998).
By contrast, other studies have reported reverse Flynn Effects. In
discussing these negative trends in Scandinavian countries, Lynn and
colleagues hypothesized that they may be due to greater fertility among
low SES groups, immigrants, and older adults (Dutton et al., 2016;
Dutton & Lynn, 2013). However, a recent analysis in Norway to test
these claims largely rejects their hypotheses, reporting that Flynn
Effects were not consistent within families over time (Bratsberg &
Rogeberg, 2018). Further, a recent meta-analysis found no substantial
role of fertility on test score changes across an array of studies
(Pietschnig & Voracek, 2015), and recent empirical evidence suggests
that immigration effects do not play a meaningful role in explaining
Flynn Effect reversals (Pietschnig, Voracek, & Gittler, 2018).
We add to the evidence reported in previous studies, by reporting
heterogeneity in the Flynn Effect by ability level and age. We find
support for a reverse Flynn Effect for those of low ability and older age,
and a positive Flynn Effect for those of high ability and younger age.
These results have several implications. First, they signal a widening
disparity in the US in terms of cognitive ability, with those at the lower
end of the ability dimension not only exhibiting smaller gains than those at
the higher ends, but reversing direction entirely. Second, these results
have implications for considering demographic differences when adjusting
IQ test scores in the population.
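For readers unfamiliar with what "adjusting IQ test scores" involves, the sketch below shows the arithmetic of a uniform adjustment under the 3-points-per-decade figure mentioned in the abstract. The function name and the example score are hypothetical, and the rate is an input, not a recommendation: this illustrates the calculation only, not a procedure endorsed by the study.

```python
def flynn_adjusted_iq(obtained_iq, norm_year, test_year, points_per_decade=3.0):
    """Subtract the assumed norm inflation accumulated since norming.

    points_per_decade is an assumed rate of change. The default of 3.0 is
    the conventional figure cited in the abstract; a negative rate would
    model a reverse Flynn Effect.
    """
    if test_year < norm_year:
        raise ValueError("test_year must not precede norm_year")
    decades_elapsed = (test_year - norm_year) / 10
    return obtained_iq - points_per_decade * decades_elapsed


# Hypothetical case: a score of 72 on norms that are 14 years old.
print(f"{flynn_adjusted_iq(72, norm_year=1989, test_year=2003):.1f}")
```

Because the rate is a single number applied to everyone, the heterogeneity by age and ability level reported here matters: if the true rate differs in sign across groups, a uniform subtraction would move some scores in the wrong direction.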
Improvements in education, nutrition, prenatal and post-natal care,
and overall environmental complexity over the past century are
thought to contribute to the Flynn Effect in the overall population
(Dickens & Flynn, 2001; Lynn, 2009; Schooler, 1998; Teasdale & Owen,
2005). However, the disparities by ability level that we identified
suggest that the benefits from these societal improvements have been
more dramatic for those at the highest ability levels, potentially because
they are better able to take advantage of these societal changes. This
interpretation is in line with Fundamental Cause Theory (Phelan, Link,
& Tehranifar, 2010), which argues that when new knowledge or technology
is introduced into a society, those with the highest status are
most likely to take advantage first and benefit. Disproportionate utilization
by those with higher abilities may widen intellectual disparities,
leaving those at the lowest ability levels worse off than before. We note,
however, that the Flynn Effect did not differ across other measures of
status, such as poverty and parental education. The correlation analyses
we conducted revealed a positive association of moderate magnitude
between IQ and the size of Flynn Effect, for every age group between 13
and 18, regardless of whether that group showed an overall positive or
negative Flynn Effect. One possible interpretation of this pattern is that
adolescents with high fluid intelligence, not necessarily those with the
highest access to resources, have benefitted most from societal progress
over time.
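The pattern described above, a positive IQ-by-effect correlation inside an age group whose average effect is negative, can be made concrete with a small sketch. The record layout, function names, and numbers are assumptions for illustration; the excerpt does not describe how the study's correlation analyses were specified.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlation_by_age(records):
    """Map each age to the IQ / Flynn-difference correlation in that group.

    records: iterable of (age, iq, flynn_difference) tuples (hypothetical).
    """
    groups = defaultdict(lambda: ([], []))
    for age, iq, difference in records:
        groups[age][0].append(iq)
        groups[age][1].append(difference)
    return {age: pearson_r(iqs, diffs)
            for age, (iqs, diffs) in sorted(groups.items())}


# Made-up records: age 18 has a negative mean difference, yet higher IQ
# still goes with a larger (less negative) difference within the group.
records = [
    (13, 80, -1.0), (13, 100, 0.5), (13, 120, 2.0),
    (18, 80, -3.0), (18, 100, -2.0), (18, 120, -1.0),
]
print(correlation_by_age(records))
```

The group mean and the within-group correlation are separate quantities, which is why a group can show an overall reverse Flynn Effect and still have a positive association between IQ and effect size.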
Previous research on the stability of the Flynn Effect across ability
levels has produced inconsistent and inconclusive results (McGrew,
2015; Weiss, 2010). Sometimes it has been higher at low IQs, and
sometimes a reverse Flynn Effect has been found in high IQ samples
(Spitz, 1989; Teasdale & Owen, 1989; Zhou et al., 2010). A meta-analysis examining ability level as a moderator variable did not observe a
Flynn Effect for those with low IQ (Trahan et al., 2014). However,
previous studies differ in quality (Trahan et al., 2014) and often rely on
small sample sizes at the lower end of the IQ distribution (Zhou et al.,
2010). Specifically, Trahan and colleagues noted, “the distribution of
Flynn effects that we observed at lower ability levels might be the result
of artifacts found in studies of groups within this range of ability” (p.
1349).
We also identified variation in the Flynn Effect by age. The positive
Flynn Effect of 2.3 points per decade at age 13 approximately equals the
value obtained in a summary of studies of Raven's matrices for nearly
250,000 children in 45 countries (Brouwers, Van de Vijver, & Van
Hemert, 2009) and in a meta-analysis of about 14,000 children and
adults in the US and UK (Trahan et al., 2014). However, the 2-point
value is smaller than the traditional 3 points for global intelligence and
4 points for fluid intelligence (Pietschnig & Voracek, 2015). Likewise,
the reverse Flynn Effect that occurred at ages 15–18 was similar to
effects reported in Scandinavian countries among young adult males
during the same time period (Bratsberg & Rogeberg, 2018; Dutton &
Lynn, 2013; Sundet et al., 2004; Teasdale & Owen, 2005, 2008), and in
other countries as well, such as France (adults tested on WAIS-III and
WAIS-IV) and Estonia (young adults tested on Raven's Matrices)
(Dutton et al., 2016). The age effects are discordant with previous
meta-analyses. Pietschnig and Voracek (2015) evaluated age effects and
found stronger gains for adults than children. In their meta-analysis,
Trahan et al. (2014) did not find a significant relationship between
Flynn Effect and age in their examination of the mean ages across
heterogeneous and often small samples. Our methodology differed from
the techniques used in both meta-analyses, as we studied large samples
that were homogeneous by age.
The notable differences we identify among narrowly defined age
groups may be related to cognitive and neurodevelopmental changes
that occur during adolescence. Fluid reasoning abilities and cognitive
abilities that support reasoning (e.g., rule representation) develop rapidly during early adolescence (Crone et al., 2009; Crone, Donohue,
Honomichl, Wendelken, & Bunge, 2006; Ferrer, O'Hare, & Bunge, 2009;
Žebec, Demetriou, & Kotrla-Topić, 2015). Brain regions that play a
central role in reasoning and problem solving, including the dorsolateral and ventrolateral prefrontal cortex and superior and inferior
parietal cortex, also exhibit dramatic changes in structure and function
across adolescence (Bunge, Wendelken, Badre, & Wagner, 2004; Ferrer
et al., 2009; Gogtay et al., 2004; Wendelken, Ferrer, Whitaker, & Bunge,
2015; Wright, Matlen, Baym, Ferrer, & Bunge, 2008). The notably different Flynn Effects by age in our study caution against generalizing
findings for a specific sub-group (such as conscripted young adult
males, which comprise the Scandinavian samples) to the nation as a
whole (Dutton & Lynn, 2013).
The present study identified no meaningful relationship between
the Flynn Effect and poverty, parental education, or other sociodemographic
variables and background factors, including parental nationality, birth
order, family size, and age of birth mother and father. This finding is notable given that these demographic variables are associated with IQ
level (von Stumm & Plomin, 2015), including in our sample (Platt,
Keyes, et al., 2018).
The results of this study should be considered in light of several
limitations. First, the study data were obtained 15 years ago. However,
this period was an ideal time to evaluate the presence of a reverse Flynn
Effect in the US, given the reverse effects found in Denmark, Norway,
Finland, and several other countries (Dutton et al., 2016; Teasdale &
Owen, 2008). In more recent years, no reverse Flynn Effect has been
observed for Wechsler's scales, as gains on the WAIS-IV (Wechsler,
2008) and WISC-V (Wechsler, 2014) Full Scale IQ have been close to
the hypothesized value of 3 points per decade (J Grégoire & Weiss,
2019; Jacques Grégoire, Daniel, Llorente, & Weiss, 2016; Weiss,
Gregoire, & Zhu, 2016; Zhou et al., 2010), especially when test content
is held constant (J Grégoire & Weiss, 2019; Weiss et al., 2016).
Second, the K-BIT nonverbal test is a screening test that measures a
single cognitive ability. It is, however, an analog of Raven's popular
matrices test which is commonly used in Flynn Effect studies (Brouwers
et al., 2009; Flynn, 1998; Pietschnig & Voracek, 2015). The Flynn Effect
is known to differ for different cognitive abilities (e.g., fluid intelligence, short-term memory) (Pietschnig & Voracek, 2015; Teasdale
& Owen, 2008), which may contribute to heterogeneity in findings
across studies with differing IQ measures. However, the K-BIT and
KBIT-2 nonverbal IQ is substantially correlated with comprehensive IQ
tests, such as the Wechsler's Full Scale IQ (mid-.50s to mid-.70s)
(Canivez et al., 2005; Kaufman & Kaufman, 1990, 2004), though it is
lower than the correlation between different comprehensive test batteries (Kaufman, 2009; Wechsler, 2014). The present findings are descriptive and any practical application regarding the adjustment of IQs
must be made with the awareness that clinical diagnosis, such as the
identification of individuals with intellectual disabilities, must be based
on comprehensive IQ tests such as Wechsler's scales or the Woodcock-Johnson, which assess multiple cognitive abilities.
Third, the study included only adolescents, which represents a
narrow period that may not capture meaningful developmental
changes. Indeed, fluid reasoning changes between ages 13–18 are
minimal (Wechsler, 2008, 2014), including in the present 2003 K-BIT
norms sample (Keyes et al., 2016) and the original 1989 norms sample
(Kaufman & Kaufman, 1990, Table 4.7). This age pattern may partially
explain why we found no overall Flynn Effect in this sample.
Fourth, different procedures were used to develop the 1989 and
2003 norms. The 1989 norms were estimated based on aggregated data
across all age groups, in order to stabilize norms at all ages (Angoff &
Robertson, 1987). Although slightly different statistical techniques
were used to develop the 2003 norms, the general approach to norms
development was similar between samples, and one test author (ASK)
was involved in the development of both sets of norms. Both samples
were representative of the US distributions of sociodemographic, economic, and other key background variables at the time (Kaufman &
Kaufman, 1990; Kessler, Avenevoli, Costello, et al., 2009). Further, both
sets of norms are based on six-month age bands. These samples are at
least as convergent as similar studies comparing samples used to develop original vs. revised norms. Previous studies have differed substantially by key sociodemographic distributions, such as the WISC and
WISC-R (Wechsler, 1949, 1974), which were key samples in the development of the Flynn Effect theory (Flynn, 1984). In the present
study, we adjusted the Flynn Effect for an array of background variables
to further minimize any differences between the 1989 and 2003 norms
samples that may confound the Flynn Effect estimates.
Fifth, the Flynn Effect has had a non-linear trajectory over the past
century (Pietschnig & Voracek, 2015). Because our study included IQ
measurements at only two time points, we were not able to test the
linearity of change over time.
This study is strengthened by the use of a large and representative
adolescent sample, with IQs measured with reasoning items that are
widely accepted as prototypical measures of fluid intelligence (Dutton
et al., 2016). The use of two sets of norms based on a single
administration of a test avoids practice effects and bias that may arise
from use of different versions of a test.
In conclusion, this study reports important heterogeneity in the
Flynn Effect among a nationally-representative sample of US adolescents.
We confirmed previous reports of reverse Flynn Effects among
large samples of older adolescent males, and extended the same pattern
to females. We also found important differential Flynn Effects by ability
level. These results add to a growing body of evidence suggesting that
Flynn Effect findings from narrow age bands or ability levels may
produce divergent findings that do not generalize to the overall population.
However, given the potential life or death implications of this
research in determining intellectual status in capital punishment cases,
the strength of evidence needed for definitive conclusions is extremely
high. At this time, we do not have sufficient evidence to recommend
differential adjustments to IQ scores. Additional research is needed to
replicate the current findings on the full age range and across comprehensive
measures of intelligence.