Nyman, T. J., Lampinen, J. M., Antfolk, J., Korkman, J., & Santtila, P. (2019). The distance threshold of reliable eyewitness identification. Law and Human Behavior, 43(6), 527-541. Dec 2019. http://dx.doi.org/10.1037/lhb0000342
Abstract: Increased distance between an eyewitness and a culprit decreases the accuracy of eyewitness identifications, but the maximum distance at which reliable observations can still be made is unknown. Our aim was to identify this threshold. We hypothesized that increased distance would decrease identification accuracy, rejection accuracy, and confidence, and would increase response time. We expected an interaction effect, where increased distance would more negatively affect younger and older participants (vs. young adults), resulting in age-group specific distance thresholds where diagnosticity would be 1. We presented participants with 4 live targets at distances between 5 m and 110 m using an 8-person computerized line-up task. We used simultaneous and sequential target-absent or target-present line-ups and presented these to 1,588 participants (age range = 6–77; 61% female; 95% Finns), resulting in 6,233 responses. We found that at 40 m diagnosticity was 50% lower than at 5 m and with increased distance diagnosticity tapered off until it was 1 (±0.5) at 100 m for all age groups and line-up types. However, young children (age range = 6–11) and older adults (age range = 45–77) reached a diagnosticity of 1 at shorter distances compared with older children (age range = 12–17) and young adults (age range = 18–44). We found that confidence dropped with increased distance, response time remained stable, and high confidence and shorter response times were associated with identification accuracy up to 40 m. We conclude that age and line-up type moderate the effect distance has on eyewitness accuracy and that there are perceptual distance thresholds at which an eyewitness can no longer reliably encode and later identify a culprit.
Public Significance Statement
The present study advances earlier findings regarding the negative impact that increased distance has on eyewitness accuracy by providing evidence for an upper distance threshold at 100 m for correct identifications. Our findings highlight the perceptual limits of eyewitness identifications and are relevant for use in courts of law by providing evidence that objective distance can be used as an estimation of eyewitness reliability.
KEYWORDS: eyewitness, identification, distance, face recognition
Discussion
Considering that distance negatively impacts facial encoding and later identification (e.g., Lampinen et al., 2014, 2015), we investigated the effect of distance on eyewitness accuracy in different age groups and line-ups. To achieve this, we conducted an ecologically valid outdoor experiment in which we presented participants with four live targets at distances between 5 m and 110 m, followed by an immediate identification task.
The Effect of Distance and Age on Eyewitness Accuracy
Distance had a significant negative effect in all age groups on identification accuracy in TP line-ups and rejection accuracy in TA line-ups. This held true for both simultaneous and sequential line-ups. There were also significant differences between age groups, with young children (ages 6 to 11) being significantly worse at identifying targets in TP sequential line-ups compared with young adults (ages 18 to 44). Further, older adults (ages 45 to 77) were significantly worse at making correct rejections in both TA simultaneous and sequential line-ups compared with young adults. In our initial analyses (post hoc analyses in parentheses), the cut-offs in the simultaneous line-ups were 61 m (76 m) for young children, 98 m (110 m) for older children, 77 m (89 m) for young adults, and 69 m (89 m) for older adults. In the sequential line-ups the cut-offs were 47 m (60 m) for young children, 75 m (96 m) for older children, 63 m (79 m) for young adults, and 52 m (69 m) for older adults.
The initial results, which assumed unbiased line-ups, gave us diagnosticity cut-offs that were on average between 10 m and 20 m below the cut-offs found when using the most selected TA filler as the innocent suspect. Arguably, this means that the cut-offs from our initial analyses are perhaps too conservative (i.e., low). However, we would like to emphasize that the decline in diagnosticity with increased distance was similar using either approach and all diagnosticity levels had fallen to 1 (±0.5) at 100 m for all age groups and line-up types. Furthermore, by using the higher cut-offs that are based on the post hoc analyses, we can with adequate certainty define probable upper thresholds where there was no information gained from the line-ups (Wells & Lindsay, 1980; Wells & Olson, 2002). Our findings illustrate that distance has a dramatically negative effect on eyewitness accuracy, so that even small variations in distance can play an important role. Collectively, these results indicate that when assessing eyewitness identifications, an objective measure of the distance between the eyewitness and the culprit is an important gauge of the odds of identification accuracy.
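For readers less familiar with the measure, the diagnosticity ratio discussed above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code: it assumes the standard definition (the correct identification rate in TP line-ups divided by the innocent-suspect identification rate in TA line-ups), and the function name and all counts are hypothetical. How the innocent-suspect count is derived from TA line-ups is left to the caller, since the article reports two approaches to that step.

```python
def diagnosticity(tp_suspect_ids, tp_total, ta_innocent_ids, ta_total):
    """Correct-ID rate in TP line-ups divided by innocent-suspect-ID rate in TA line-ups."""
    if tp_total <= 0 or ta_total <= 0:
        raise ValueError("line-up totals must be positive")
    hit_rate = tp_suspect_ids / tp_total
    false_id_rate = ta_innocent_ids / ta_total
    if false_id_rate == 0:
        # No innocent-suspect picks observed: the ratio is unbounded.
        return float("inf")
    return hit_rate / false_id_rate


# Hypothetical counts, for illustration only (not data from the study).
near = diagnosticity(tp_suspect_ids=60, tp_total=100, ta_innocent_ids=5, ta_total=100)
far = diagnosticity(tp_suspect_ids=10, tp_total=100, ta_innocent_ids=10, ta_total=100)

print(f"near: {near:.1f}")  # suspect picks are much more likely when the suspect is guilty
print(f"far:  {far:.1f}")   # a ratio of 1: a suspect pick carries no information
```

A ratio of 1 means a suspect identification is equally likely whether or not the suspect is the culprit, which is why the text treats it as the point where no information is gained from the line-up.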
Interestingly, correct rejection rates for young adults decreased rather than increased with increased distance. For other age groups, correct rejection rates remained relatively stable over distance. Instead of an increase in rejection rates, we found an increase in filler selections (see Appendix B in the online supplemental materials), which is also reflected in the shift toward a more liberal response bias as distance increased. Only older children appear to have shifted to a slightly more conservative response bias at 90 m and above. Nevertheless, the overall increase in choosing suggests that participants were not good at taking into account the difficulty of the task, and this could be taken as support for the hypothesis that when memory strength is low, more of the photographs match the target equally well, so participants may tend to choose rather than reject. In real life scenarios, where an eyewitness is asked to take part in a police line-up, the identification task inflates choosing rates due to less pristine conditions or the witness wanting to help the police (e.g., Wells et al., 2000). The results, therefore, suggest that the ability of participants to metacognitively judge the difficulty of the task was not proportional to the actual degree of difficulty, mirroring earlier findings (Smith et al., 2018).
Distance Estimation Accuracy
The main finding regarding distance estimation was that as distance increased the level of accuracy decreased and that young children and older adults made more erroneous distance estimations compared with young adults. Moreover, in comparison with young adults, increased distance increased error rates more for young children but less for older adults. Although it is difficult to make any clear interpretations regarding the age-related differences, it may be that experience plays an important role in estimating distance. It is possible that body height is a confounding variable, as a taller person (i.e., an adult) might have an advantage when estimating larger distances. This could explain some of the age-related differences, although it does not explain why older adults had more errors and were less affected by increased distance compared with young adults. Notably, the large variation and overall low accuracy of distance estimation indicate that subjective estimations of distance are highly unreliable.
Simultaneous and Sequential Line-ups
Simultaneous line-ups provided an advantage over sequential line-ups, with higher accuracy and a less steep decline in diagnosticity and d′ for all age groups with increased distance. However, young children (ages 6 to 11) were worse at identifying targets in TP sequential line-ups compared with young adults (ages 18 to 44), and older adults (ages 45 to 77) were worse at correctly rejecting line-ups in TA simultaneous and TA sequential line-ups, compared with young adults. The differences between age groups fall partly in line with earlier results (Fitzgerald & Price, 2015), because young children and older adults fared much worse compared with young adults. Interestingly, it was apparent from the simultaneous TA rejections that the older adults appear to have been almost equally good/bad at rejecting line-ups with increased distance. This suggests that older adults are prone to choose no matter the memory strength in the TA simultaneous line-ups (but not in the TA sequential line-ups), which could be seen as a dependency on familiarity rather than recollection (Healy et al., 2005; Shing et al., 2010, 2008). Generally, response bias was more liberal in sequential line-ups compared with simultaneous line-ups (see Appendix B in the online supplemental materials), indicating that sequential line-ups increased the likelihood of choosing compared with simultaneous line-ups. Nevertheless, response bias increased in both line-ups as distance increased, indicating that all age groups adopted a more liberal response criterion as the task became more difficult. We have interpreted this as reflecting a higher reliance on a familiarity-based rather than a recollection-based strategy.
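The discriminability (d′) and response-bias measures referred to above come from signal detection theory, and a minimal Python sketch may help make them concrete. It assumes the common equal-variance definitions, d′ = z(hit rate) − z(false-alarm rate) and criterion c = −[z(hit rate) + z(false-alarm rate)]/2, where lower c means more liberal responding. Whether the article's supplemental materials use exactly this bias index and sign convention is an assumption here, as are the function names and the rates, which are hypothetical.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate, false_alarm_rate):
    """Discriminability; both rates must lie strictly between 0 and 1."""
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion_c(hit_rate, false_alarm_rate):
    """Response bias; lower (more negative) values mean more liberal choosing."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical rates, for illustration only (not data from the study).
d_near, c_near = d_prime(0.80, 0.10), criterion_c(0.80, 0.10)
d_far, c_far = d_prime(0.55, 0.50), criterion_c(0.55, 0.50)

print(f"near: d' = {d_near:.2f}, c = {c_near:.2f}")
print(f"far:  d' = {d_far:.2f}, c = {c_far:.2f}")
```

In this made-up example the far condition shows both lower discriminability and a lower (more liberal) criterion. Rates of exactly 0 or 1 would need a correction before use, which this sketch does not apply.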
Before placing too much emphasis on the differences between the simultaneous and sequential line-ups, it is important to note that the sequential line-ups differed from common U.S. police practice in that the task was absolute and no additional rounds were permitted (e.g., Steblay, Dietrich, Ryan, Raczynski, & James, 2011). Moreover, the number of images in the sequential line-ups was mentioned in the line-up instructions, and this can decrease discriminability, especially if the target image is presented late in the line-up (Horry, Palmer, & Brewer, 2012). It is thus possible that the differences between the line-up types are partly due to different degrees of pristine conditions.
These results are relevant to the ongoing debate over simultaneous and sequential line-ups. There have been findings showing that simultaneous line-ups have an advantage, due perhaps to an increased discriminability in the relative judgment task (Clark, 2012; Clark et al., 2015; Gronlund et al., 2015; Wixted et al., 2016). Others have shown that sequential line-ups have an advantage because they decrease mistaken identifications without impacting the number of correct identifications (Steblay et al., 2003, 2011; Wells et al., 2015). Some have even proposed that sequential line-ups do not improve discriminability but have an advantage because they encourage the use of a more conservative criterion (Palmer & Brewer, 2012). The present results indicate that there are important differences in age groups depending on memory encoding and line-up type. When considering increased distance as a representation of lower memory quality, it is clear that most of the age differences disappeared at higher distances due to floor effects, representing the limits of perception and encoding. More research is needed to gain a more in-depth understanding of how different age groups make judgments based on variations in memory strength.
Confidence
A confidence-accuracy characteristic (CAC) analysis (Mickes, 2015) confirmed that high confidence is associated with high accuracy at distances up to 40 m. This was true for all age groups. After 40 m there were too few high-confidence observations to reliably analyze the results. The average levels of confidence fell with increased distance, meaning that participants perhaps understood, to a certain degree, the difficulty of the task and downshifted their confidence as distance increased. These results are interesting in relation to the continuing debate regarding the degree to which high confidence is a postdictive indicator of accuracy. It has previously been suggested that less optimal estimator variables will negatively impact the relationship between confidence and accuracy (Deffenbacher, 2008). However, a counterargument is that in pristine conditions and when memory is examined immediately, as in an immediate identification task, high confidence is associated with high accuracy (Brewer & Wells, 2006; Clark et al., 2015; Sporer et al., 1995). It is also suggested that under such conditions, estimator variables such as distance will not influence the confidence-accuracy relationship and that participants will adjust their confidence downward when the memory-match for photographs in the line-up is low (Semmler et al., 2018; Wixted & Wells, 2017). The current results appear to fit the latter hypothesis. Nevertheless, it is important to state that in the current sample, there were very few high-confidence responses after 40 m, of which very few were correct. This suggests that in real world situations, high-confidence identifications at longer distances most likely reflect either unique encoding conditions, as for example in the case of a familiar face, or the impact of suggestive factors that inflate confidence, such as an investigator's positive reinforcement of the choice made (see, e.g., Wixted & Wells, 2017).
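The logic of a CAC analysis can be illustrated with a short Python sketch. It follows the general approach described by Mickes (2015): within each confidence bin, accuracy is the number of correct suspect identifications divided by the sum of correct and false suspect identifications. The sketch assumes, as is common when no innocent suspect is designated, that false suspect identifications are estimated as TA filler identifications divided by line-up size. Whether the present study used exactly this estimate is an assumption, and the function name and all counts are hypothetical.

```python
LINEUP_SIZE = 8  # the study used 8-person line-ups


def cac_accuracy(correct_suspect_ids, ta_filler_ids, lineup_size=LINEUP_SIZE):
    """Suspect-ID accuracy within one confidence bin, or None if the bin is empty."""
    # Assumed estimate: TA picks spread evenly over all line-up members.
    estimated_false_ids = ta_filler_ids / lineup_size
    total = correct_suspect_ids + estimated_false_ids
    if total == 0:
        return None
    return correct_suspect_ids / total


# Hypothetical (correct TP suspect IDs, TA filler IDs) per confidence bin.
bins = {"low": (10, 40), "medium": (20, 24), "high": (27, 8)}

for label, (correct, ta_fillers) in bins.items():
    print(f"{label:>6}: {cac_accuracy(correct, ta_fillers):.2f}")
```

Returning None for an empty bin mirrors the point made in the text: when a confidence level has too few observations, as for high confidence beyond 40 m, no reliable accuracy estimate can be given.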
Response Times
The results regarding the relationship between response time and identification accuracy showed that shorter response times were robustly associated with higher identification accuracy at distances below approximately 40 m in simultaneous TP line-ups. Earlier studies have suggested that there is a cut-off between 10 and 12 s, below which there is a higher degree of accuracy (Dunning & Perretta, 2002). However, more recent work has called this cut-off into question and has shown that there is great variability in response time and that the previously suggested cut-off point does not accurately distinguish between high and low accuracy (Sauer, Brewer, & Wells, 2008; Weber, Brewer, Wells, Semmler, & Keast, 2004). The present results suggest that shorter response times do have a postdictive value, at least below approximately 40 m, with decisions made under five seconds being the most accurate. The implications are that, as with confidence, more research is needed to understand the effect that increased distance has on the relationship between response time and accuracy.
Practical Applications
The main take-home message of the current study is that both objective distance and age are crucial factors to take into consideration when assessing the benefit of conducting a line-up and that there are upper distance limits to eyewitness reliability. For practitioners in the field it is also important to emphasize that at 40 m diagnosticity was 50% lower compared with diagnosticity at 5 m. Moreover, as distance increased, diagnosticity tapered toward 1 so that by 100 m, no age group, using either line-up type, produced diagnosticity values higher than 1 (±0.5). Nevertheless, there were substantial differences between age groups, showing that older children (ages 12 to 17) and young adults (ages 18 to 44) had upper distance cut-offs that were roughly 10–20 m higher compared with young children (ages 6 to 11) and older adults (ages 45 to 77).
Importantly, the current results were obtained in pristine conditions (i.e., best practice methods), with optimal viewing conditions (i.e., 20-s viewing time, natural and optimal lighting, no distractions), and using an immediate line-up task. Therefore, the distance thresholds reported in this article are likely to be overestimates of thresholds in real life settings, where flawed line-up procedures, less optimal viewing conditions, and delayed identifications are much more common. For example, Felson and Poulsen (2003) estimated that approximately 50% of crimes take place after 8 p.m. (i.e., when lighting and visibility are low).
The current perceptual distance thresholds should be interpreted as the maximum thresholds possible in the best possible conditions. When an actual crime takes place, there are often other factors present, such as stress (Deffenbacher, Bornstein, Penrod, & McGorty, 2004) and weapon focus (Erickson, Lampinen, & Leding, 2014; Fawcett, Russell, Peace, & Christie, 2011), that make it more likely that correct identifications are already improbable at shorter distances. Additionally, it is known that (facial) memory is imperfect, susceptible to distortion, and decays with time (Deffenbacher, Bornstein, McGorty, & Penrod, 2008; Lacy & Stark, 2013). It can, therefore, be assumed that delayed identifications will produce less accurate results compared with the present findings. In addition to this, Lindsay and colleagues (2008) found that delayed responses gave rise to a significantly higher number of “not sure” and incorrect rejections compared with an immediate identification task.
Limitations
The current data collection is not without its limitations. One limitation is that we used very similar targets and that more variation in appearance, ethnicity, or age would have been informative. The setting was in a science center, so although the results are highly generalizable, there might be some variation in choosing and rejection rates in comparison with an actual or mock police setup, where the consequences of choosing or not choosing are more critical. The design was a prospective task where participants knew beforehand that they would be witnessing four targets and conducting four identifications. On the one hand, this increases the significance of our results because this is an additional optimal condition factor, but on the other hand, it would be very informative to investigate the effect of distance on an uninformed and a retrospective line-up task. Instructing the participants as to how many images would be shown in the sequential line-ups also slightly hampered the interpretation of the results. Despite these limitations, the present research represents a substantial improvement on past research where less ecologically valid paradigms have been used.
Luck Beliefs and Happiness
Our finding that Belief in Luck is broadly negatively associated with happiness is consonant with Maltby et al.’s (2008) suggestion that Belief in Luck is perhaps a maladaptive trait. Consequently, any notion of happy-go-lucky individuals cheerfully trusting to luck would seem to be inaccurate, at least if those individuals believe in luck as a non-random, deterministic and external phenomenon. Indeed, insofar as such individuals may irrationally trust to luck as a deterministic phenomenon, they would seem to do so unhappily, not happily.
However, our finding that Belief in Personal Luckiness is positively associated with happiness tends to suggest the happy may indeed go lucky, in the sense that happiness and believing oneself to be lucky are associated. Of course, the relatively large size of associations we find here suggests that Belief in Personal Luckiness might in fact be a facet of an overall happiness construct. A possible implication of this is that Belief in Personal Luckiness’ association with any particular happiness measure could, perhaps, be fully accounted for by controlling other happiness measures. To investigate this possibility, we separately regressed each of the four measures of happiness on Belief in Personal Luckiness while simultaneously controlling for the three remaining happiness measures in each respective case, to see if Belief in Personal Luckiness maintained a significant beta. Doing so we found Belief in Personal Luckiness is not associated with either Positive or Negative Affect. However, Belief in Personal Luckiness is still significantly associated with Happiness (β = .09, p < .01; ΔR² = .05, p < .01), and Optimism (β = .09, p < .01; ΔR² = .06, p < .01). This would seem to support, partly at least, that Belief in Personal Luckiness may represent either a facet of happiness or a discrete personality trait positively associated with happiness.
Luck Beliefs, Five-Factor Model and Happiness
Neither Belief in Luck nor Belief in Personal Luckiness appears from our findings to be a mediator of the association between the five-factor model of personality and happiness.
Indeed, our analyses, in part, suggest the contrary: that Neuroticism fully mediates Belief in Luck’s association with happiness. This does not imply that Belief in Luck necessarily ‘causes’ Neuroticism, but it is reasonable to speculate that the underlying irrationality and the lack of both agency and self-determination that would seem to underpin Belief in Luck also to some extent underpin or are facets of Neuroticism. This would be consonant with previous research demonstrating significant relationships between Neuroticism and locus of control (Judge et al. 2002; Morelli et al. 1979), self-determination (Elliot and Sheldon 1997; Elliot et al. 1997), and irrational beliefs (Davies 2006; Sava 2009).
We do not find evidence for any component of the five-factor personality model mediating Belief in Personal Luckiness’ association with happiness, nor do we find evidence of any pronounced confounding effects between Belief in Personal Luckiness and the five-factor model and their respective associations with happiness. Hence, considering Belief in Personal Luckiness to be a trait discrete from fundamental personality models would on the basis of our findings not seem unreasonable. Nor would it seem unreasonable to suggest that Belief in Personal Luckiness might potentially be either a facet of happiness or a personality trait discrete from but associated with not just the five-factor model but also happiness.
Our conclusions here certainly seem to apply with greatest saliency to the most direct measure of trait happiness we used, Lyubomirsky and Lepper’s (1999) Subjective Happiness Scale, and to a lesser extent to Optimism, a measure closely allied with happiness (Brebner et al. 1995; Chaplin et al. 2010; Furnham and Cheng 2000; Salary and Shaieri 2013). However, while the pattern of relationships is broadly similar for both Positive Affect and Negative Affect, the effect sizes are smaller and either less significant or insignificant. This would suggest that, while both Positive Affect and Negative Affect are often used as proxies for happiness, they might perhaps best be regarded as constructs related to, rather than direct synonyms of, happiness.
Limitations and Further Research
While our research sheds new empirical light on the relationships between luck beliefs, happiness and the five-factor personality model, a number of limitations need to be kept in mind. As with any findings based on cross-sectional data, interpreting our findings in terms of directions of causality would be imprudent and, of course, constrained by the assumption of our research that happiness, luck beliefs, and the five-factor model are all personality traits rather than individual difference states. Personality traits may, of course, be associated in systematic patterns, but the very notion of traits being essentially innate and non-manipulable, unlike individual difference states, intrinsically excludes the possibility that one might be ‘caused’ by another. To take the five-factor model as an example, its five personality traits have a well-established systematic pattern of associations, but it would be implausible to suggest any of the five in any mechanistic sense causes another: they exist together discretely, with none generally argued to be a facet or sub-component or effect of the other. This said, an area for further research might be to examine the effects of trait luck beliefs on state affect, which varies temporally and is manipulable, and is hence susceptible to theorization and testing using either longitudinal or experimental data.
A further limitation to our study relates to necessary caution in generalizing its findings in view of the deliberately homogeneous population we used. Further research to replicate our findings amongst heterogeneous populations in terms of nationality, occupation, and socio-economic status would be useful as it has been shown across multiple domains that psychological characteristics and their relationships may vary accordingly (Becker et al. 2012; Boyce and Wood 2011; John and Thomsen 2014; Rawwas 2000; Thompson and Phua 2005a, 2005b; Winkelmann and Winkelmann 2008). Furthermore, although each of the happiness and luck measures we employ has been individually validated across internationally diverse samples including Hong Kong Chinese, underlying conceptions of both are known to exhibit nuanced cultural differences (Lu and Gilmour 2004; Lu and Shih 1997; Raphals 2003; Sommer 2007), which conceivably could modify measured associations between them.
We also note that our study, in common with most research, has limitations due to the limited selection of measures with which we operationalized our investigation. We selected just four measures commonly used in studies of trait happiness, but several others exist, although some, like the Satisfaction with Life Scale (Diener et al. 1985) can arguably be regarded as assessing state rather than trait happiness. We also selected a five-factor model measure that, while not as potentially prone to poor measurement validity as extremely short measures, is sufficiently brief as to exclude examination of possible relationships of each of the big-five elements on a sub-component basis. Certainly given our findings in relation to Neuroticism, further research using multi-component measures of this dimension of the five-factor model might prove illuminating.
In addition, research examining possible mediation and moderation effects of cognate psychology constructs such as, for example, locus of control (Pannells and Claxton 2008; Verme 2009), illusion of control (Larson 2008; Erez et al. 1995), and gratitude (Sun and Kong 2013; Toussaint and Friedman 2009) might help further the understanding of relationships between luck beliefs, happiness, and the five-factor model.