Why Big Data Is Getting the Bully Treatment
In the popular middle-grade novel Blubber, author Judy Blume relates a truly heartbreaking story of a group of bored fifth graders suddenly taking to tormenting one of the girls in their class. In a space vacated by indifferent and lifestyle-preoccupied adults, Wendy, a shrewd fifth grader, directs a clique of classroom tastemakers to expand a private joke into methodical humiliation of a classmate. Linda, the bullying victim, is introverted, plain, and chubby; she has done nothing to attract such cruelty and lacks the skills to fight back effectively. Linda gained the nickname “Blubber” due to an unfortunate association of her chubbiness with a report about whales that she was delivering before the class, an irony that Wendy exploited to its limits.
Big data is now attracting similar bullying by social, industrial, and cultural tastemakers. Like Linda, data, for as long as we have known it, has been soft-spoken, prosaic, and bulky. For the last several decades it has lived in the background, occasionally being noticed by higher-status groups, usually in times of distress. Only in the last decade has data become recognized as a prodigy: big as in bulky and clumsy has subtly transformed into big as in potent and preponderant. Yet it is still the undifferentiated bigness that prompts the teasing.
A schoolyard ripe for bullying
The bullying of Linda began with the inopportune association of her chubbiness with her speech on whales. Consequently the theme of Linda’s tormenting was her weight and girth. Likewise, as big data gets bigger, it is unfortunately being equated with an expansion of rational control, which furthermore is bringing about a reworking of the formula that determines who gets the biggest voice.
Both were made clear in the last two weeks, as New York Times reporter John Broder published an uncomplimentary account of the Tesla Model S's on-road performance during an approximately 200-mile test drive. Broder’s review was challenged directly by Tesla Motors’ CEO Elon Musk, who supplied remotely retrieved log data from the test vehicle to rebut Broder’s claims. The New York Times review could come off as a bit of bullying, and something to which Musk could have responded by assembling a more formidable group of bullies, such as a competing newspaper or a gang of lawyers. Rather, Musk brought in something even more forceful: data. Broder suddenly found himself playing defense, as his account was contradicted by some of the log data obtained during the test drive by Tesla’s remote monitoring technology. Musk thereby raised the bar for determining who rules the schoolyard, as the winner will now need to prevail by both might and right.
One take-home message from this still unresolved and escalating quarrel is that data is bigger than either The New York Times or Elon Musk. And data, not the cultural capital of The New York Times or Musk, will determine which of these two tastemakers wins this seminal new media battle. But the other message is that the dazzling technology that produces high-performance electric cars is also changing the ideology of control substantially, and data, no matter how big, is not the cause of the latter.
Who is in control here?
In their classic paper published in 1992, sociologists Stephen Barley and Gideon Kunda demonstrated that management ideology has oscillated between normative and rational control for over a century, in phases of approximately 30 years. Normative control refers to systems where behavior is regulated by culture and norms. Individuals participate in self-organizing groups, collaborate frequently, and work toward consensus. Individual behavior typically is not monitored closely, as organizations are more preoccupied with establishing and reinforcing cultural values. Behavior is dictated using powerful symbols, organizational rituals, shared experiences, charismatic leadership, and vision statements. Many tech firms renowned for innovation, such as Apple, Amazon, Tencent, and even Tesla, are also known for normative control.
Rational control, on the other hand, involves systems regulated primarily by mechanisms, rules, and algorithms. Individuals belong to a well-defined hierarchy, and are expected to defer to those with greater expertise. Individual behavior is managed methodically through command-and-control mechanisms, which are supported by technology and regular monitoring. Behavior is dictated according to standard procedures and accumulated experience. Traditional blue-collar labor is often associated with rational control, along with modern high-reliability organizations, such as military, public safety, nuclear energy production, and oil platform operations.
Barley and Kunda established 1980 as the approximate time when the current long wave of normative control began, making us due for a shift back to rational control. Indeed, the conflict between The New York Times and Tesla gives us a snapshot of this shift happening. Information, communications, and networking technologies have all facilitated substantial advances in rational control mechanisms such as monitoring and command-and-control, while creating a hierarchy of expertise that requires deference to the more technologically sophisticated. Systems of rational control are advancing to the networks that define our everyday consumption of technology, whether smartphones or smart cars.
Yet the simultaneous emergence of big data and the shift back to rational control cannot be taken as one causing the other. Big data is getting big by helping us manage smart devices so that we can deploy and use them, enabling us to be in control of our activity rather than being controlled. Nor is big data the same as the rational control mechanisms that big data quantifies. Though powerful enough to trump The New York Times or Elon Musk, big data is nevertheless not powerful enough to bring about changes in control in global business or networks of mobile device users.
Data has always been big, but big has not always been good for data
Since the era of Eratosthenes and Ptolemy, scientific and mathematical advancement has required data in excess of our ability to compute it readily. Over the centuries, we have applied some of the ingenuity used to analyze data about nature to building things to calculate that same data, whether organizations of human computers or machines of integrated circuits. Both require substantial administrative effort and inputs of information in order to function properly (e.g., training, algorithms, programs), thereby creating additional sets of data separate from the data used for scientific study. In the last century, Claude Shannon demonstrated that more data must be added to decrease uncertainty in data transmissions, prompting the creation of another source of data.
Thus data has always been big, at least bigger than our ability to manage it. Yet this has not presented a problem until recently, as supercomputers came out of national laboratory blockhouses, into massive air-conditioned buildings in corporate headquarters, later onto millions of desktops, and now into millions more pockets in mobile devices. In the early 1950s there were only a few machines creating data, and they were being used in well-established scientific fields, simulating nuclear reactions or controlling high-speed switching, for example. But now there are millions of machines being used for communication, commerce, and culture, creating more data in a day than has existed in entire millennia of civilization. The field of data science is emerging to discover the laws that govern this species of data, while cloud computing is quickly eliminating constraints on data accumulation and storage. Both are transforming big clumsy data into big potent data, and even big pervasive data.
In a world defined by data, it is not surprising that smart is the new sexy. Nor is it surprising that the traditionally sexy would share some of their sexiness with big data, while they are determining whether data’s “bigness” should be handled by admitting it as a peer, co-opting it, or bullying it back into its place. But the last of these is hardly justified. Big data is not trying to defeat the traditional idea of sexy. Rather, it is facilitating a reinvention of sexy that is likely inevitable, and coincidentally happens to favor it strongly.