Looking guilty: Handcuffing suspects influences judgements of deception. Mircea Zloteanu, Nadine L. Salman, Eva G. Krumhuber, Daniel C. Richardson. Journal of Investigative Psychology and Offender Profiling, September 7 2022. https://doi.org/10.1002/jip.1597
Abstract: Veracity judgements are important in legal and investigative contexts. However, people are poor judges of deception, often relying on incorrect behavioural cues when these may reflect the situation more than the sender's internal state. We investigated one such situational factor relevant to forensic contexts: handcuffing suspects. Judges—police officers (n = 23) and laypersons (n = 83)—assessed recordings of suspects, providing truthful and deceptive responses in an interrogation setting where half were handcuffed. Handcuffing was predicted to undermine efforts to judge veracity by constraining suspects' gesticulation and by priming stereotypes of criminality. It was found that both laypersons and police officers were worse at detecting deception when judging handcuffed suspects compared to non-handcuffed suspects, while not affecting their judgement bias; police officers were also overconfident in their judgements. The findings suggest that handcuffing can negatively impact veracity judgements, highlighting the need for research on situational factors to better inform forensic practice.
7 DISCUSSION
The present research explored whether a situational factor related to interrogation procedures (i.e., the use of handcuffs on suspects) can negatively impact veracity judgements. Confirming our hypothesis, the handcuffing manipulation affected both laypersons' and police officers' ability to detect deception (i.e., H2 was supported; moderate effect size). Statements made by handcuffed suspects were harder to classify for both police officers and laypersons. Converting the handcuffing effect size (ξ = 0.37) to more intuitive estimates (as recommended by Fritz et al., 2012), we obtain a Number Needed to Treat (NNT) of 5.01, meaning that for every fifth person interviewed wearing handcuffs we would expect one more misclassification of veracity. Alternatively, based on the Common Language (CL) effect size, the probability that a suspect selected at random from the handcuffed condition is misclassified in terms of statement veracity compared to a suspect from the non-handcuffed condition is 64.3%. This decrease in accuracy was attributable to the study's manipulation affecting veracity discriminability rather than a shift in judgement response tendencies (H1 was not supported), as all judges remained truth-biased overall (H3 was not supported; NNT = 10.54, CL = 56.7%). For both judge groups, truths were easier to detect than lies (NNT = 12.02, CL = 55.9%; replicating the veracity effect; Levine et al., 1999).
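As an illustration of how a standardised effect size can be translated into these two metrics, the sketch below uses two widely cited normal-theory conversions: the Common Language formula usually attributed to McGraw and Wong, CL = Φ(d/√2), and the NNT conversion described by Furukawa and Leucht, NNT = 1/(Φ(d + Φ⁻¹(CER)) − CER). This section does not state which formulas, which standardised difference, or which control event rate (CER) produced the reported figures, so the functions take Cohen's d rather than ξ, and the function names and example inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def common_language_effect_size(d: float) -> float:
    """Probability that a random score from one group exceeds a random
    score from the other, assuming normal data with equal variances:
    CL = Phi(d / sqrt(2))."""
    return _STD_NORMAL.cdf(d / sqrt(2))


def number_needed_to_treat(d: float, control_event_rate: float) -> float:
    """NNT from Cohen's d and an assumed control event rate (CER):
    NNT = 1 / (Phi(d + Phi^-1(CER)) - CER)."""
    if d <= 0:
        raise ValueError("d must be positive for a finite, positive NNT")
    if not 0 < control_event_rate < 1:
        raise ValueError("control_event_rate must lie strictly between 0 and 1")
    shifted_rate = _STD_NORMAL.cdf(d + _STD_NORMAL.inv_cdf(control_event_rate))
    return 1 / (shifted_rate - control_event_rate)


if __name__ == "__main__":
    d_example = 0.5    # hypothetical value, not taken from the paper
    cer_example = 0.5  # hypothetical control event rate
    print(f"CL  = {common_language_effect_size(d_example):.3f}")
    print(f"NNT = {number_needed_to_treat(d_example, cer_example):.2f}")
```

Both conversions assume normally distributed scores, and the NNT depends heavily on the assumed CER, so the same d can yield quite different NNT values.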
Unsurprisingly, police officers did not perform better at judging veracity than laypersons (see Aamodt & Custer, 2006), and judging handcuffed suspects made this process even harder. However, the manipulation did not affect officers' response bias (H5 was not supported). This contrasts with research arguing for a veracity detection reversal in professionals (i.e., police officers showing higher lie detection, but lower truth detection compared to laypersons; Meissner & Kassin, 2002). The similarity in response patterns with laypersons indicates that police officers were not overall more suspicious of suspects. This could, however, be due to the relatively junior sample of officers recruited (see Table 1), or, potentially, due to the “suspects” being naïve students, which may have mitigated lie bias towards them, although we note that the instructions never mention the status of suspects.
A more worrying result, in line with our prediction, was that police officers displayed higher confidence while being no more accurate than laypersons (i.e., H4 was supported; moderate-to-large effect size; NNT = 3.66, CL = 70.2%), even showing a trend towards lower accuracy (e.g., below chance lie detection; NNT = 5.88, CL = 62.2%). This parallels findings of professionals tending to be overconfident in their veracity judgements (Aamodt & Custer, 2006; DePaulo & Pfeifer, 1986; Masip et al., 2016). While the police officers' level of experience may not have been sufficient to bias their judgements in the direction of a lie, it was able to increase their confidence in catching liars (e.g., Masip et al., 2016).
Overall, judges performed worse at discriminating veracity when viewing handcuffed suspects, supporting our assertions that situational factors can negatively impact the discriminability between deceptive and honest suspects (for a more detailed breakdown of the honesty scale data, see SI). Such effects may have serious ramifications for the forensic domain (Verschuere et al., 2016), especially when considering the already poor deception detection rates in the absence of the handcuffing manipulation. Interestingly, both laypersons and police officers were less confident in their judgements when they watched the handcuffed (vs. non-handcuffed) videos (NNT = 5.32, CL = 63.6%). Judges may have found deception detection more difficult when suspects were handcuffed, tempering their confidence.
These results illustrate that situational elements can impact the perception and judgement of both laypersons and police officers. Reducing the impact of such artificial factors could improve forensic practices and deception detection procedures, whilst reducing the risk of potential miscarriages of justice. Such effects are especially pertinent in situations of judgement under uncertainty where external and contextual information often influence the perception of ambiguous or ambivalent information (Masip et al., 2009; Mobbs et al., 2006). In line with research on investigative interviewing, it would seem recommendable that the space and circumstances under which an interrogation takes place are comfortable and do not restrict the individual (Goodman-Delahunty et al., 2014; Kelly et al., 2013).
7.1 Future directions
The current work sought to highlight the effects of situational factors on veracity judgements, particularly in forensic contexts. Future research could elaborate on the different ways in which handcuffing affects senders and judges by separating their influence on suspect perceptions (e.g., handcuffs as a visual cue of criminality; Stiff et al., 1992) from the effect on suspects' ability to gesticulate (within-sender features). For this, handcuffed and non-handcuffed suspects' movements could be restricted by asking them, for example, to place their hands flat on a table throughout the interrogation. This would equate the nonverbal differences whilst having the presence/absence of handcuffs as the only factor that differs between conditions. Alternatively, the videos could be edited to show the same suspect with or without handcuffs, revealing whether any impressions brought about by being handcuffed are due to the presence of external visual cues.
Considerations should also be given to the content of the stimuli themselves. An analysis of the videos may reveal verbal, paraverbal, and/or nonverbal cues which may aid in understanding the current findings. Such an investigation could uncover if behavioural differences between the liars and truth-tellers are indeed reduced by handcuffing and if differences in impression management are brought about by the manipulation (e.g., handcuffed suspects may “compensate” for their restricted gesticulation by modifying their speech and, by extension, their verbal cues may differ; see Verschuere et al., 2021).
Additionally, given the within-sender variability typically seen in deception research (Levine, 2010; Zloteanu, Bull, et al., 2021), the current stimulus set may be expanded to show a larger number of senders which would provide more precise effect size estimates and reduced uncertainty (Levine et al., 2022). Future research should also employ a more in-depth statistical approach (i.e., multi-level modelling) that accounts for both sender and decoder variability. This may be especially relevant in understanding if handcuffing interacts with senders' demeanour and judges' expectations. The possibility exists that the manipulation may not affect all individuals to the same degree or in the same manner (see DAG in SI for the potential influence of within/between subject and stimuli variance on the judgement process).
Subsequent work may also explore the effect of handcuffing on the relationship quality between suspect and interrogator (also, see SI). Due to the interactive nature of the interrogation task, handcuffs may have affected the rapport between the interrogator and suspect, which in turn could shape the behaviour of suspects (Kassin et al., 2003; Paton et al., 2018). The present manipulation demonstrates that deception detection does not happen in isolation. Future studies investigating veracity judgements should expand the range of factors being considered, both within the lab and in the real world.
7.2 Limitations
The issue of generalisability in the deception field is rarely addressed; nonetheless, a few elements of the current research must be considered. First, the type of lie told by suspects related to personal information that liars misrepresented. It can be argued that differences in performance and judgement may emerge if other types of lies (e.g., lies about transgressions) are employed (Levine, Kim, & Blair, 2010; cf. Hartwig & Bond, 2014; Hauch et al., 2014). Second, although some have argued that using students instead of real suspects may impact the detection rate (see O’Sullivan et al., 2009), both empirical investigations and meta-analyses report that deception detection is unaffected by whether the sender is a student or not (Hartwig & Bond, 2014; Zhang et al., 2013), nor do police officers show better accuracy rates even in naturalistic high-stakes settings (Hartwig, 2004; Meissner & Kassin, 2002). However, using different types of senders may influence perceptions and judgements.
Presently, it is difficult to separate the effect of handcuffing on judges' perception (i.e., pure external features) from that on sender performance (i.e., within-sender features) as our manipulation may have been affecting either or both. For example, handcuffing could attenuate behavioural differences between liars and truth-tellers, resulting in poorer overall veracity discrimination. However, considering the dynamics between the interrogator and the suspects, being handcuffed could also have alerted senders to the added scrutiny and behavioural restrictions, leading them to compensate through increased impression management to produce a more convincing performance (Buller & Burgoon, 1996; Burgoon et al., 1996). The interplay between the interviewee and the interviewer is an important unknown, as some response variability may be due to the interrogator himself, given that rapport strongly influences interviewing outcomes (Abbe & Brandon, 2013).
The interrogation style used should also be weighed. Currently, while we did not find any effect of probing, this element could not be explored in depth due to a lack of variability in the use of the three probes by the interrogator (see SI). The literature on probing is equivocal on whether its use impacts veracity judgements (Buller et al., 1991). Nonetheless, it may impact rapport building and disclosure (Paton et al., 2018). Different probes may result in changes in the dynamics between the interrogator and suspect, as well as in the impressions of subsequent judges (e.g., biasing impressions based on the valence of the probe used during the questioning). Future research could consider manipulating (e.g., standardising) the probing element to investigate how it interacts with the handcuffing element (e.g., Granhag & Strömwall, 2001); specific probes may bolster (e.g., negative) or attenuate (e.g., positive) the effects of handcuffing.
Finally, a more pronounced limitation is the relatively small and unbalanced sample. Underpowered studies are less likely to find true effects (i.e., Type II error), have a higher chance of found effects being statistical artefacts (i.e., Type I error), inflate estimates of true effects (i.e., Type M error), and have lower replicability (Fraley & Vazire, 2014; Gelman & Carlin, 2014). For instance, the CIs around the handcuffing effect indicate that the data is compatible with a wide range of effect sizes, from large and of potential interest (ξ = 0.58) to small and potentially unimportant (ξ = 0.10). Thus, we advise readers to interpret the results with care. Still, considering the forensic-relevant sample alongside the implications of our findings (especially for miscarriages of justice), on balance, we consider that the value of the research outweighs its drawbacks (Eckermann et al., 2010; Sterling et al., 1995).
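The consequences of low power listed above can be made concrete with a small design calculation in the spirit of the approach described by Gelman and Carlin (2014). The sketch below is an illustration only: it assumes a normally distributed estimate (a normal rather than t approximation), and the function name, the assumed true effect, and the standard error are hypothetical inputs, not values from this study. It returns the power, the probability that a significant estimate has the wrong sign (often called a Type S error), and the expected exaggeration of significant estimates (Type M).

```python
import random
from statistics import NormalDist, fmean


def design_analysis(true_effect, std_error, alpha=0.05, n_sims=100_000, seed=1):
    """Return (power, type_s_rate, exaggeration_ratio) for a two-sided test,
    assuming estimate ~ Normal(true_effect, std_error) and true_effect > 0."""
    if true_effect <= 0 or std_error <= 0:
        raise ValueError("true_effect and std_error must be positive")
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = true_effect / std_error

    # Analytic probability of a significant result in each direction.
    p_hi = 1 - norm.cdf(z_crit - shift)   # significant, correct sign
    p_lo = norm.cdf(-z_crit - shift)      # significant, wrong sign
    power = p_hi + p_lo
    type_s_rate = p_lo / power

    # Simulate replications and keep only the statistically significant ones.
    rng = random.Random(seed)
    estimates = (rng.gauss(true_effect, std_error) for _ in range(n_sims))
    significant = [abs(e) for e in estimates if abs(e) > z_crit * std_error]
    exaggeration_ratio = fmean(significant) / true_effect
    return power, type_s_rate, exaggeration_ratio


if __name__ == "__main__":
    # Hypothetical inputs: a true effect equal to one standard error.
    power, type_s, type_m = design_analysis(true_effect=0.1, std_error=0.1)
    print(f"power={power:.2f}, Type S={type_s:.3f}, Type M={type_m:.2f}")
```

When the assumed true effect is small relative to the standard error, only overestimates clear the significance threshold, so the exaggeration ratio rises well above 1; as power approaches 1, the ratio approaches 1.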
To increase usability, we report all necessary measurements of uncertainty and variability (Calin-Jageman & Cumming, 2019), permitting future hypothesis generation and integration into meta-analyses (Cumming, 2014; Fritz et al., 2012). For example, replications can consider the effect sizes reported and their confidence intervals to estimate future results (e.g., prediction intervals; Cumming, 2008), and calculate the statistical power needed to reproduce the effect (e.g., considering ξ33%; see Simonsohn, 2015).