The city of New York recently witnessed a record payout to George Bell, falsely convicted of murder in 1999, after it emerged that prosecutors had deliberately hidden evidence casting doubt on his guilt and given false statements in court. Bell is the latest in a long line of people, especially Black Americans, to have been wrongfully convicted. More recently, Jabar Walker and Wayne Gardine were cleared after decades in prison. Conviction integrity units across North America have found serious flaws with many long-standing convictions.
Alarmingly for scientists, misleading forensic and expert evidence is too often a deciding factor in such miscarriages of justice; of the 233 exonerations in 2022 alone recorded by the National Registry of Exonerations, deceptive forensic evidence and expert testimony were a factor in 44. In an era of high-tech forensics, the persistence of such brazen miscarriages of justice is more than unsettling. The National Institute of Justice, part of the U.S. Department of Justice, has just published a report that found certain forensic science techniques, including footprint and fire debris analysis, were disproportionately associated with wrongful conviction. The same report found expert testimony that “reported forensic science results in an erroneous manner” or “mischaracterized statistical weight or probability” was often the driving force in false convictions. The disconcerting reality is that illusions of scientific legitimacy and flawed expert testimony are often the catalyst for deeply unsound convictions.
This paradox arises because scientific evidence is highly valued by juries, which often lack the expertise to correctly interpret or question it. Juries with a poorer understanding of the potential limitations of such evidence are more likely to convict without questioning the evidence or its context. This is exacerbated by undue trust in expert witnesses, who may overstate evidence or underplay uncertainty. As a 2016 presidential advisors’ report warned, “expert witnesses have often overstated the probative value of their evidence, going far beyond what the relevant science can justify.”
The debacle of British pediatrician Roy Meadow serves as a powerful exemplar of precisely this. Famed for his influential “Meadow’s law,” which asserted that one sudden infant death is a tragedy, two is suspicious, and three is murder until proved otherwise, Meadow was a frequent expert witness in trials in the United Kingdom. His penchant for seeing sinister patterns, however, stemmed not from real insight, but from terrible statistical ineptitude. In the late 1990s, Sally Clark suffered a double tragedy, losing two infant sons to sudden infant death syndrome. Despite scant evidence of anything beyond misfortune, Clark was tried for murder, with Meadow testifying to her guilt.
In court, Meadow testified that families like the Clarks had a one-in-8,543 chance of a sudden infant death syndrome (SIDS) case. Thus, he asserted, the probability of two cases in one family was this figure squared: roughly a one-in-73-million chance of two deaths arising by chance alone. In a rhetorical flourish, he likened it to successfully backing an 80-to-1 outsider to win the Grand National horse race over four successive years. This seemingly unimpeachable, damning statistical figure convinced both jury and public of her guilt. Clark was demonized in the press and imprisoned for murder.
Yet this verdict horrified statisticians, for several reasons. To arrive at his figure, Meadow simply multiplied probabilities together. This is perfectly correct for truly independent events like roulette spins or coin flips, but fails horribly when the assumption of independence is not met. By the late 1990s, there was overwhelming epidemiological evidence that SIDS ran in families, rendering assumptions of independence untenable. More subtle but just as damaging was a trick of perception. To many, the figure appeared equivalent to a one-in-73-million chance that Clark was innocent. While this implication was intended by the prosecution, such an inference is a statistical error so ubiquitous in courtrooms it has a fitting moniker: the prosecutor’s fallacy.
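The arithmetic behind Meadow’s figure, and how sensitive it is to the independence assumption, can be sketched in a few lines of Python. The one-in-8,543 figure is the one quoted in his testimony; the `dependence_factor` values are purely hypothetical illustrations of familial clustering, not estimates drawn from the epidemiological literature, and the function name is invented for this sketch.

```python
from fractions import Fraction

# Per-family SIDS probability quoted in Meadow's testimony.
P_SINGLE_SIDS = Fraction(1, 8543)


def two_deaths_probability(p_single, dependence_factor=1):
    """Probability of two SIDS deaths in one family.

    dependence_factor is a HYPOTHETICAL multiplier on the risk of a second
    death given a first; a value of 1 reproduces the independence assumption.
    """
    p_second_given_first = p_single * dependence_factor
    return p_single * p_second_given_first


# Independence: simply square the single-death probability.
independent = two_deaths_probability(P_SINGLE_SIDS)
print(f"independent: 1 in {int(1 / independent):,}")  # 1 in 72,982,849

# Hypothetical clustering: a second death assumed 5x or 10x more likely.
for factor in (5, 10):
    clustered = two_deaths_probability(P_SINGLE_SIDS, factor)
    print(f"factor {factor}: 1 in {float(1 / clustered):,.0f}")
```

Under independence the code reproduces the roughly one-in-73-million figure; any dependence between the two deaths makes the double tragedy proportionally less improbable.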
This variant of the base-rate fallacy arises because while multiple cases of SIDS are rare, so too are multiple maternal infanticides. To determine which situation is more likely, the relative likelihood of these two competing explanations must be compared. In Clark’s case, this analysis would have shown that the probability of two SIDS deaths vastly exceeded that of a double infant murder. The Royal Statistical Society issued a damning indictment of Meadow’s testimony, echoed by a paper in the British Medical Journal. But such rebukes did not save Clark from years in jail.
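The comparison of competing explanations can be sketched as a simple ratio. Only the double-SIDS figure below comes from the testimony, and as noted it understates the true probability; the double-murder probability is a made-up placeholder chosen solely to show the mechanics, and the function name is hypothetical.

```python
def probability_innocent(p_two_sids, p_two_murders):
    """Probability of the innocent explanation, given two infant deaths.

    Assumes, for illustration only, that double SIDS and double murder are
    the only two possible explanations for the deaths.
    """
    return p_two_sids / (p_two_sids + p_two_murders)


# Meadow's figure, which already understates the chance of double SIDS.
p_two_sids = 1 / 73_000_000

# HYPOTHETICAL placeholder: double infanticide assumed ten times rarer still.
p_two_murders = p_two_sids / 10

print(f"{probability_innocent(p_two_sids, p_two_murders):.1%}")
```

With these illustrative inputs the innocent explanation comes out at about 91 percent, despite the one-in-73-million headline number. What matters is the ratio of the two rare probabilities, not the rarity of either one alone.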
After a long campaign, Clark’s verdict was overturned in 2003, and several other women convicted on Meadow’s testimony were subsequently exonerated. The General Medical Council found Meadow guilty of professional misconduct and barred him from practicing medicine. But Clark’s vindication was no consolation for the heartbreak she had suffered, and she died an alcohol-related death in 2007. The prosecutor’s fallacy emerges constantly in problems of conditional probability, leading us sirenlike toward precisely the wrong conclusions. Left undetected, it sends innocent people to jail.
Earlier this year, Australia pardoned Kathleen Folbigg, who had spent 20 years in jail following her 2003 conviction for murdering her four children, a conviction based on Meadow’s discredited law. Dutch nurse Lucia de Berk was convicted of seven murders of patients in 2004, based on ostensible statistical evidence. While convincing to the court, it appalled statistical experts, who lobbied for a reopening of the case. Again, the case against de Berk pivoted entirely on the prosecutor’s fallacy, and her conviction was overturned in 2010.
This isn’t just a historical occurrence. The veneer of science and expert opinion has such an aura of authority that when invoked in open court, it is rarely challenged. Even effective techniques like blood spatter and DNA analysis can be misused in unsound convictions, underpinned by variants of the prosecutor’s fallacy. A suspect’s rare blood type (5 percent) matching traces at a scene, for example, does not imply that guilt is 95 percent certain. A hypothetical town of 2,000 potential suspects has 100 people matching that criterion, which puts the probability that the suspect is guilty, in the absence of other evidence, at just 1 percent.
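The blood-type example works out as follows in a short Python sketch. The town size and 5 percent match rate are the ones given above; the function name and the assumption that exactly one of the matching residents left the traces are illustrative.

```python
def guilt_probability(population, match_rate):
    """Chance that a matching suspect is the source, with no other evidence.

    Assumes exactly one true source, who is among the matching residents and
    is no more likely beforehand than any other of them.
    """
    expected_matches = population * match_rate
    return 1 / expected_matches


town_size = 2_000       # potential suspects in the hypothetical town
blood_type_rate = 0.05  # share of people with the rare blood type

print(f"matching people: {town_size * blood_type_rate:.0f}")
print(f"probability of guilt: {guilt_probability(town_size, blood_type_rate):.0%}")
```

The match narrows the field to 100 people, so on its own it supports a 1 percent probability of guilt rather than the 95 percent the fallacy suggests.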
Worse is when the science cited is so dubious as to be useless. One recent analysis found only about 40 percent of psychological measures cited in courts have a strong evidentiary basis, and yet they are rarely challenged. Entire techniques like bite-mark analysis have been shown to be effectively useless, yet convictions still turn on them. Polygraph tests are so utterly inaccurate as to be deemed inadmissible by courts, and yet remain perversely popular with swathes of American law enforcement.
This can and does ruin lives. Hair analysis, dismissed by forensics experts worldwide as pseudoscientific, was embraced by the FBI for its ability to get convictions. But this hollow theater of science condemned innocent people, disproportionately affecting people of color like Kirk Odom, who languished in prison for 22 years for a rape he did not commit. Odom was but one victim of this illusory science; a 2015 report found hundreds of cases in which hair examiners made erroneous statements in inculpating defendants, including 33 cases that sent defendants to death row, nine of whom had already been executed by the time the report saw daylight. As noted by ProPublica, the use of “lung float” tests to supposedly differentiate between stillbirth and murder is being challenged by experts. Despite the fact that the test is highly fallible, it has already been used to justify imprisoning for murder women who lost children, raising alarm over yet another potential manifestation of the prosecutor’s fallacy.
While science and statistics are crucial in the pursuit of justice, their uncertainties and weaknesses must be communicated as clearly as their strengths. Evidence and statistics demand context, lest they mislead rather than enlighten. Juries and judges need to be educated on standards of scientific and statistical evidence, and to understand what to demand of expert testimony, before courts send people to prison. Without improved scientific and statistical integrity in courtrooms, the risk of convicting innocent people can neither be circumvented nor ignored.
This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.