Why Analytics Projects Turn into Cold Cases
Police procedural television programs are a universal crowd pleaser, especially those involving cold cases. We feel special satisfaction in seeing even the shrewdest of criminals ultimately defeated by superior investigative skill and diligence. In real life, however, many cold cases are surrounded by an earthy ugliness and dread unpalatable for television: a peaceful community experiences an apparently random and unusually violent homicide, or a gruesome fatal accident turns out to be no accident. Catching the perpetrators is a top priority. Their freedom is a festering wound to the victims, and the whole community lives with a pervasive unease for lack of knowledge and resolution.
For business organizations, a similar urgency and frustration surround many of their analytics investigations, as unidentified perpetrators increasingly threaten competitive advantage, future relevance, and ultimately survival. Despite tremendous intellectual and technological resources, many analytics projects turn into cold cases, while a killer of business value and vitality remains on the loose. As with criminal cases, it is actually quite easy for analytics investigations to degenerate and fail.
The overzealous district attorney with an agenda
More frequently than should ever occur, we encounter tragic stories of innocent persons convicted, imprisoned, then released several years later, after exonerating evidence is discovered. Meanwhile, the true perpetrator remains at large and unidentified, as a case once thought to be solved becomes reopened as a cold case. Since much of the available evidence that was assembled supported the wrong conclusion, it is unlikely that the evidence will be useful in apprehending the true perpetrator, and the investigation reverts to square one.
Often such a travesty occurs at the hands of ambitious district attorneys eager to advance their careers. Needing to secure multiple high-profile convictions in order to gain a favorable professional reputation, they prosecute using all means possible to ensure that someone will be convicted, regardless of true innocence or guilt.
Aggressive pursuit of a personal or political agenda in an investigation creates ideal conditions for Goodhart’s Law: once a measure (e.g., convictions) becomes the target, it stops being a reliable measure, and any evidence gathered to support the predetermined outcome loses its information content. Furthermore, according to Campbell’s Law, the more an indicator is used to drive decisions, the more it is subject to corruption pressures, so such evidence will likely corrupt the entire analysis.
The criminal with early career success
Surprisingly, the tendencies of dangerous criminals are quite predictable. However, some criminals do their most heinous work early in their careers, avoiding suspicion because their criminal habits have not yet been established as behaviors that merit it. The case grows cold as the usual suspects are rounded up and promising leads are exhausted. The true criminal remains free, later establishing a behavior pattern through future crimes, but remaining unconnected to the cold case.
Patterns also fail to be established due to investigator sloppiness. Even as analytics projects become more sophisticated, they should always begin with that tedious yet necessary step of reviewing descriptive statistics and looking at scatterplots. We should resist the urge to go straight to the model, and instead use our analytical skills first to uncover emergent behaviors. Technical brilliance is better demonstrated later, once behaviors begin to be defined: investigators set up tests to determine whether provisional findings remain valid as plausible explanations arise, and then assemble resources to move in and capture the perpetrator.
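As a minimal sketch of that first step, the Python below summarizes two variables before any model is fit. The data (weekly ad spend and orders) and the helper names `describe` and `pearson` are hypothetical, invented purely for illustration. A scatterplot of the same two lists would expose the same suspicious record at a glance.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "min": ordered[0],
        "max": ordered[-1],
    }

def pearson(xs, ys):
    """Pearson correlation coefficient, computed directly from its definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: the last ad_spend value looks like a data-entry error.
ad_spend = [10, 12, 11, 13, 12, 11, 95]
orders = [100, 118, 109, 131, 122, 108, 115]

summary = describe(ad_spend)
r_all = pearson(ad_spend, orders)                 # includes the odd record
r_trimmed = pearson(ad_spend[:-1], orders[:-1])   # excludes it

print(summary)
print(f"correlation with all records:   {r_all:.2f}")
print(f"correlation without last record: {r_trimmed:.2f}")
```

In this made-up example the mean sits far above the median, which flags the odd record before any modeling starts, and the correlation changes sharply once that record is set aside. Whether to drop, correct, or keep such a record is a judgment the investigator still has to make.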
The biases that separate us from critical information
In our favorite television programs, criminal investigations are bolstered by abundant forensic evidence obtained from meticulous laboratory work and ready access to personal information. In real life, however, police investigative work is not as tidy, preoccupied instead with obtaining evidence from human testimony and by interviewing persons of interest. Yet human memory is highly imperfect, especially among victims trying to recall events that occurred under great distress. Other persons interviewed may conceal information innocently for fear of inviting trouble. And many criminal investigations require interrogation of other criminals, who may be highly adept at disinformation. As a result, key witnesses are never interviewed, and critical evidence is never collected or evaluated. Diligent investigations become cold cases as available evidence proves to be low quality, while high-quality evidence remains unaccessed.
A criminal investigation containing low-quality evidence and unanswered questions is highly susceptible to investigators relying on cognitive bias to make sense of the available information, as the human mind is uncomfortable leaving matters in a state of irresolution. Much has been written about confirmation bias, the tendency to seek out, interpret, and remember evidence in ways that support our existing beliefs, and to discount all else in order to avoid disquietude. Psychological research also describes a related tendency, acquiescence bias, which is the propensity for persons to agree with statements or ideas regardless of their content, including ones they do not fully comprehend. Investigators may conduct interviews so as to manipulate such a tendency in those being questioned, or they may pursue a particular line of investigation under the belief that the benefit of doubt least favors a certain suspect.
Other forms of systematic bias often come into play as well, including selection bias, which arises when the evidence examined is not representative of the whole because of the way it was selected. Conclusions then put excessive weight on the evidence that was selected and ignore the effects of the evidence that was not. Analytics investigations involving mostly non-experimental data will require data scientists with considerable experience managing selection bias in order to keep the investigation from becoming cold.
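A small simulation can make selection bias concrete. The Python sketch below rests entirely on assumptions made for illustration: a hypothetical population of customer satisfaction scores from 1 to 10, and a hypothetical response mechanism in which happier customers are more likely to answer a survey. It is not a method for correcting the bias, only a demonstration of how it distorts an estimate.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: satisfaction scores from 1 (worst) to 10 (best).
population = [random.randint(1, 10) for _ in range(10_000)]

# Assumed selection mechanism: the higher the score, the more likely the
# customer is to respond (probability of responding = score / 10).
respondents = [score for score in population if random.random() < score / 10]

population_mean = statistics.mean(population)
respondent_mean = statistics.mean(respondents)

print(f"population size:  {len(population)}")
print(f"respondents:      {len(respondents)}")
print(f"population mean:  {population_mean:.2f}")
print(f"respondent mean:  {respondent_mean:.2f}")
```

Under these assumptions the respondents' average overstates the population's average by a wide margin, even though every individual response is truthful. The distortion comes only from who ended up in the sample, which is why non-experimental data needs scrutiny of how each record came to be collected.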
Three principles to keep analytics investigations from going cold
- Analytics should always be used to discover an outcome, never to define it.
- Expect to be surprised as often as not, and handle surprising and expected findings with equal diligence.
- Analytics is simultaneously a process of discovery and bias elimination. You have to do both at each step in order to advance.