By Ray Rivera, Director, Solutions Management, Workforce Planning and Analytics, SAP
Good analytics is a combination of art and science, where the skillful analyst combines an entire matrix of prior knowledge with good judgment about authentically meaningful qualitative differences.
As with software developers and computer scientists, there are a lot of self-styled big data analysts and “data scientists” out there: most minimally competent, some adequate, and a few good ones, the analytics wizards.
The analytics wizards do the work of ten, perhaps a hundred adequate analysts. If we are lucky we may work with one or two of the good ones in our careers. It is easy to recognize them from their work, but it is much harder to explain what they do that separates them from the rest.
Like the Pinball Wizard in The Who’s song of the same name, good analysts seem to play by intuition. But looks are deceiving. What appears to be intuition is explained later in the song: “ain’t got no distractions/can’t hear those buzzers and bells/don’t see lights a-flashin’.”
Indeed the hallmark of the analytics wizard is the ability to filter out the noise, and not get distracted by it. Keeping the ball alive and getting the replay is the object of the game, not lighting up the board or making more noise than everyone else.
What is the sound of noise not happening?
So how do the analytics wizards do it? One fundamental is that they know the difference between inputs and outputs, and don’t confuse the two. So at the very outset, good analysts realize that filtering out the noise is not a result of their work, but its driver.
Analytics wizards are masters of the robust and enduring theories that govern the phenomena they are studying, and can frame the goals of their analysis according to these theories. Note that these theories are not the same as “book knowledge”, and it doesn’t matter whether they are developed by academic study or acquired as know-how.
What does matter is that good analysts compare everything they see to these theories. This allows them to classify phenomena quickly, and to determine which surprises are genuinely unexplained and which are the result of errors in the system, such as poor observation.
Good analysts are likely quite skilled at mathematical modeling, but that skill is a necessary, not a sufficient, condition for being a good analyst. A math or code jock who is ignorant of the enduring theories can still produce results, but cannot tell whether the results are good or even relevant. Rather, their criteria for “good” lie in some obscure statistical goodness-of-fit or parametric test.
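To see how a fit criterion can flatter an irrelevant model, consider a toy sketch (the data points and polynomial degree below are invented for illustration, not taken from any real analysis): a degree-4 polynomial threaded exactly through five noisy, essentially flat observations scores a perfect in-sample fit, yet extrapolates absurdly.

```python
# Hypothetical sketch: a model can fit its observed data perfectly
# (flawless "goodness of fit") and still be useless.

def lagrange(xs, ys, x):
    """Evaluate the interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Five observations that are essentially flat noise around 1.0
xs = [0, 1, 2, 3, 4]
ys = [1.0, 1.2, 0.9, 1.1, 1.0]

# In-sample: the fit is exact at every observed point
print([round(lagrange(xs, ys, x), 2) for x in xs])  # matches ys exactly

# Out of sample: the "perfectly fitting" model extrapolates wildly
print(round(lagrange(xs, ys, 6), 2))  # -12.3, far from the data's level
```

The fit statistic alone cannot reveal the problem; only a theory about what the data should look like (here, "roughly flat") exposes the model as irrelevant.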
Analytics wizards have internalized some of the theories into their own mental models of how the systems of the world work (“stands like a statue/becomes part of the machine/feeling all the bumpers/always playing clean”), and are constantly comparing what they see to their own models. If they score a hit, it seems easy, but really they are testing everything they see against a script. The difference is that it’s a really, really good script.
Ask a stupid question, get a clever-sounding answer
The good analysts begin with questions informed by the characteristics of the data, and then go to the details. The adequate ones begin with data gymnastics informed by pet questions (or questions that make them appear clever), and then head for even more details (so as to look even more clever).
The good ones may spend a lot of time apparently idling over definitions, but they realize that asking a stupid question seldom yields a brilliant answer. Before they begin building models or crunching numbers, the good ones know what the data can and cannot do.
The adequate ones are preoccupied with questions about what the data can do at various stages in the analysis, and get distracted by the ingenuity of the methods they use to satisfy their questioning. Good analysts care about clever questions, not clever methods, though they may use the same bravura methods as the adequate ones. Confusing the two, adequate analysts dive right into the data and get lost, reaching contradictory results and wavering about what they are really seeing. Decisions made this way are seldom confident, and are correct only by luck.
Good questions set appropriate boundaries on what data to use and what methods to analyze them with. Only after the questions are tested is the data analyzed. Adequate analysts go about it backwards: they start from the range of the data and work out what questions they think they can answer.
Don’t be a bad compression algorithm
What else characterizes the good ones? They don’t fall for the Procrustean bed, but rather are comfortable dealing with ambiguity. They are not seduced by false concreteness (such as a finding of statistical significance when there is actually no practical significance), or the idea that they have a handle on something that is actually intangible (such as believing that by structuring data successfully they can “name it and claim it”).
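The statistical-versus-practical-significance trap can be made concrete with a small sketch (the sample size, means, and standard deviation below are invented for illustration, assuming equal-variance groups): with a large enough sample, a practically trivial difference still clears the conventional significance threshold.

```python
import math

# Hypothetical summary statistics: two groups whose means differ by a
# practically trivial amount on a scale with standard deviation 1.0.
n = 100_000                  # observations per group
mean_a, mean_b = 3.500, 3.515
sd = 1.0                     # assumed equal in both groups

# Two-sample t statistic: the standard error shrinks like 1/sqrt(n),
# so even a 0.015-point gap becomes "statistically significant".
standard_error = sd * math.sqrt(2 / n)
t = (mean_b - mean_a) / standard_error

# Cohen's d, a simple measure of practical effect size
cohens_d = (mean_b - mean_a) / sd

print(f"t = {t:.2f}")         # well past the 1.96 cutoff
print(f"d = {cohens_d:.3f}")  # conventionally a negligible effect
```

A p-value below 0.05 here says nothing about whether a 0.015-point gap matters; that judgment has to come from outside the statistics, which is exactly where false concreteness creeps in.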
In good analytics practice, precision is an end result, not a starting point, since during the analysis the data can point to different conclusions. Adequate analysts try to drive out ambiguity at every turn. But like a bad compression algorithm, all they really do is throw away a lot of good information.
Good analysts realize that ambiguity can indicate different conditions under which something can occur, or exceptions to rules, or areas where theories break down. Being good at handling ambiguity is never the result of uncultivated talent, but rather extremely disciplined thinking, gained by years of focused study, careful apprenticeship, knowledge assimilation, and discarding hundreds of would-be good ideas.
The good analysts are always testing theories, especially their own. The adequate ones are skilled users of various analytical instruments; the good ones are themselves an analytical instrument.