The traditional view of campus divides the student body into “jocks” (who like sports) and “nerds” (who don’t). It seems that every paper I read about Big Data tries to appeal to the jocks by quoting sports statistics as the most compelling example.
The face that launched a thousand blog posts
Moneyball, and Brad Pitt’s good looks, became the face that launched a thousand blog posts.
I should know: I wrote a few myself. But what about some other examples, which might appeal to us Europeans who don’t endlessly pore over box scores, feeding on-base percentages into a Hadoop cluster? We are too busy drinking wine and watching Eurovision. This leads to questions such as “Can Big Data help me find a good bottle of wine?” or “Can it help predict the winner of Eurovision?”
It turns out it can help in the world of wine recommendations, as the book “Super Crunchers” explains. Imagine that you are trying to determine if 2013 will be a good year for cabernet. This might be because you want to invest in wine futures, or you want to place an early order for a few cases of the good stuff from your wine merchant. The usual approach is to ask a well-respected wine connoisseur who has decades of personal experience. This expert uses “swish & spit” to expose the complex flavours and, let’s be frank, has a livelihood dependent on being the expert.
Orley Ashenfelter, an economist by day, decided a superior approach would be to “Run the Numbers”, and found that all that expertise can be beaten by a simple linear equation:
Wine quality = 12.145 + 0.00117 * winter rainfall + 0.0614 * average growing season temperature - 0.00386 * harvest rainfall
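For readers who like to see the arithmetic, here is a minimal sketch of that equation as a Python function. It assumes the form usually quoted for Ashenfelter's equation, in which the intercept and the winter rainfall term are added together. The function name, the units (millimetres of rain, degrees Celsius) and the two sample vintages are my own illustrative assumptions, not real weather data.

```python
def wine_quality(winter_rainfall, growing_season_temp, harvest_rainfall):
    """Linear wine quality score using the coefficients quoted above.

    Assumed units (not stated in the post): rainfall in millimetres,
    temperature in degrees Celsius.
    """
    return (
        12.145
        + 0.00117 * winter_rainfall
        + 0.0614 * growing_season_temp
        - 0.00386 * harvest_rainfall
    )


# Two hypothetical vintages that differ only in how wet the harvest was.
wet_harvest = wine_quality(winter_rainfall=600, growing_season_temp=17.0, harvest_rainfall=200)
dry_harvest = wine_quality(winter_rainfall=600, growing_season_temp=17.0, harvest_rainfall=50)

print(f"wet harvest score: {wet_harvest:.4f}")
print(f"dry harvest score: {dry_harvest:.4f}")
```

The signs tell the story: a wet winter and a warm growing season push the score up, while rain at harvest time pulls it down, so the dry-harvest vintage scores higher.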
It turns out that the mathematical approach was superior, correctly predicting the “Wines of the Century” of 1989 and 1990. The reaction of the traditional experts was the same as that of the old scouts around the table in Moneyball. The highly influential Robert Parker laughed off the approach with the comment, “I’d hate to be invited to his house to drink wine.” But Ashenfelter had the last laugh because he made lots of money for his advocates in wine futures by betting “against the house”.
What could be further away from American sport than Eurovision?
David Rothschild, an economist at Microsoft Research, harnessed powerful computer clusters to predict that Emmelie de Forest’s rendition of “Only Teardrops” would win for Denmark (54% probability). He was right. The data came from social media (“I like Emmelie” tweets), YouTube download statistics, polls, the spread betting market, and even game theory, which analysed how European countries vote for each other (the Scandinavian effect).
Eurovision is on May 10th this year.
Find your “Dark Data”
Until a few decades ago, physicists thought they had the universe figured out pretty well. They knew what stuff was, what stuff did and where it was. It was a case of refining theories, crossing t’s and dotting i’s. But in the 1980s, when they re-worked the numbers, they started to come to the conclusion that most matter was not the familiar atoms, but some other, ethereal substance which they called dark matter. More equations later, and it seems that roughly 95% of the total mass and energy of the universe is dark (dark matter plus dark energy). And we know little about it. Oops.
I see some of the same things going on with data. Up until the dawn of business networks, companies assumed that all of their data could be rationalized into a data warehouse. There were still things to do, for sure: reconcile master data, design reports, tidy up some schemas, and build reporting cubes. But once enterprises began to link up with business networks, we suddenly discovered that much of the important data they needed was outside the enterprise.
Dark matter is now an exciting field for physicists, and dark data is becoming a crucial area of activity for data scientists. We need to figure out how to get it, how to rationalize it, and how to use it.
Business Networks: harnessing the power of Dark Data
Procurement success has typically been measured on objectives such as reducing costs, minimizing risk and driving compliance. These objectives are now made more challenging by corporate business initiatives such as sustainability, localism and corporate responsibility. Balancing all of these is hard if procurement teams are not tapping into business networks to access intelligence on market trends, supplier history, qualifications and risk factors, and to gain insights that will help them design and execute smarter strategies and deliver results.
Just as social networks make it easy for consumers to manage their personal relationships and activities, business networks allow companies to connect and collaborate with their trading partners around the world anytime, anywhere, from any device. More than 1 million companies in 190 countries, for instance, use the Ariba Network to transact over $465 billion in commerce on an annual basis. But networks are about more than just connecting companies, people and processes. Their real power lies in what goes on inside them – all the interactions, transactions and commentary – and the massive amounts of data that they generate.
Consider the following: business networks hold hundreds of billions of dollars of financial transactions, in the form of structured, curated, audited and cleansed transactional data, along with relationship history. By leveraging this data, buyers and sellers can make more informed decisions, detecting changes in buying patterns or pricing trends and gaining confidence and qualifying information on a potential, yet unfamiliar, trading partner. And when this is combined with community-generated ratings and content and a whole host of unstructured data such as texts, tweets, blog posts, web-based videos and other social postings, they can glean not only real-time insights, but also recommended strategies for moving their businesses forward.
Big Data can be used to answer many questions that were previously the domain of the expert. In the area of procurement and supply chain, these could be:
- What are the long term risks associated with this source of supply?
- I’ve not bought laminated aluminium sheets before. What suppliers do companies like mine typically use for this commodity?
- What is the average hourly rate for a level II warehouseman in Chicago?
It’s a new way of operating, but organizations that embrace it can ultimately transform their businesses. To learn more about Ariba’s business network and cloud-based applications, and how you can leverage them to fuel a data-driven approach to procurement that delivers results, come to my session “Capture the Knowledge in Your Networks” at SapphireNow.