Big Data: Debunking the Myths – Part 1
Big Data! The newest kid on the block! Here are some interesting and relevant stories that were in the media recently.
- Recently there was a public outcry over a certain mega-retail chain in the US recommending baby products to expectant teens! Your credit-card history, cell phone GPS data, and ‘like’ history on social sites can predict your needs, behaviors and expenditures with increasing accuracy, and can be used to manipulate you!
- HP has built a ‘flight-risk model’ and applied it to estimate each employee’s probability of quitting!
- Your Internet surfing, outings, vacations and escapades are being tracked, which explains the deluge of spam and promotional mails regularly making it to your inbox!
Back in the day, traditional analytics suffered from huge latencies, data challenges and months-long response times. Present-day technologies can cut large-scale data crunching down to minutes or hours and deliver actionable business intelligence to marketing!
Sales and Marketing folks may relish these opportunities, but customers may begin to feel cheated and increasingly fearful of losing the privacy battle.
Big Data has also spawned many related myths.
So let’s begin by defining Big Data and then debunk some of the popular myths surrounding it! Here are some quick definitions of Big Data.
# “Big Data is a huge dataset (or datasets) that requires unconventional or specialized computing resources to store, crunch and extract insights, value and actionables from.” – yours truly
# “Big Data is high volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.” – Gartner [1]
“Big Data Myths”
1. Analytics and its benefits apply only to huge volumes of data!
Volume is the least important key element, though many vendors want you to think otherwise. Most organizations typically have data volumes of less than 100 terabytes [2]. Data size is actually a moving target that keeps increasing over time. Facebook’s store of 50 billion photos and Twitter’s 340 million tweets a day [4] remain the exception, not the norm! Organizations with far less data can still use current technologies to gain competitive advantages.

For example, a manufacturing organization can leverage its production process logs to unearth new insights about product life-cycle, quality, costs, failures and rejections. A retail chain can gain insights into selling patterns that show similar profitability, the optimum use of shelf space in a store, or product returns versus geographical neighborhood information [5]. A pharmaceutical company can gain insights into a particular drug’s effects and performance across a population, geographical or otherwise. A kids’ garment manufacturer can gain significant information about population distributions and optimize inventory and stock logistics against demand in real time. So analytics is as much about calculation intensity as it is about large data sets.
The takeaway here is: “You do not need to have large volumes of data to get the real economic benefits of implementing big data solutions.”
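To make the manufacturing example a little more concrete, here is a minimal sketch of the kind of insight a tiny production log can yield. Everything in it is an assumption made for illustration: the record layout, the line names and the numbers are invented, and `rejection_rates` is a hypothetical helper, not part of any product or library.

```python
from collections import defaultdict

# Hypothetical production-log records: (production_line, units_made, units_rejected).
# The values are invented purely for illustration.
production_log = [
    ("line_a", 1000, 12),
    ("line_b", 800, 40),
    ("line_a", 1200, 18),
    ("line_c", 500, 5),
    ("line_b", 900, 50),
]


def rejection_rates(records):
    """Return the overall rejection rate for each production line."""
    made = defaultdict(int)
    rejected = defaultdict(int)
    for line, units_made, units_rejected in records:
        made[line] += units_made
        rejected[line] += units_rejected
    # Assumes every line produced at least one unit, so no division by zero.
    return {line: rejected[line] / made[line] for line in made}


rates = rejection_rates(production_log)
worst_line = max(rates, key=rates.get)

for line, rate in sorted(rates.items()):
    print(f"{line}: {rate:.1%} rejected")
print(f"Highest rejection rate: {worst_line}")
```

Five rows are obviously not Big Data, and that is the point: the question being asked (which line rejects the most units?) is the same one you would ask of a much larger log, and it already tells you where to look first.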
2. Big Data is too expensive!
Typically, Big Data projects begin with a specific use-case and a specific large data set and it is usually a ‘learn-as-you-go’ journey.
Implementations may or may not require complex integration of infrastructure components and may or may not carry business risks. To mitigate risks, many organizations prefer “ready-made” platforms that provide a cheaper alternative, smoother take-offs and quicker wins to begin with. Many Big Data technologies and platforms are also available as open source for the cost-constrained. In other words, many organizations can still adopt these technologies as per their specific requirements and make a significant move towards leveraging Big Data.
The key takeaways are: “Let business drive IT; stay agile; and build custom-order solutions.”
3. Big Data is for Social Media Feeds and Sentiment Analysis
Google, Facebook, Twitter and countless other companies are the headliners of the current data explosion, enabling and leveraging it. They routinely analyze huge chunks of data for various reasons. They represent smart organizational models for anyone to imitate.
However, the same technologies are available for anyone to use in their own organization and on their own data. In other words, your organization can broadly analyze web traffic, IT system logs, customer sentiment in email threads (feedback, appreciation and complaint mails), process logs, financial transaction logs or other types of digital impressions (structured or unstructured) in the context of its ecosystem of customers, employees, partners, vendors and affiliates. These can provide insights, often surprising information, patterns and not-so-obvious correlations that can feed into management’s day-to-day decision making.
4. Big Data can easily be leveraged into a profitable, analytics-driven machine!
Data is often flawed, misleading and incomplete! The Big Data discipline is rugged and proven, but operational executives lacking statistical skills may be misled by “algorithmic illusions” and create corporate bubbles of overconfidence, as MIT Media Lab visiting scholar Kate Crawford has put it [3].
“Technology and data alone cannot work miracles for an organization. The availability of cheap, high-quality film production software doesn’t translate to hundreds of Steven Spielbergs emerging; it still takes creativity and keen business sense to orchestrate a masterful production.”
Well, what would happen if we evolved the definition of Big Data [6]? What if we added another two Vs, Viscosity and Virality? Viscosity measures the resistance to the flow of data, and Virality measures how quickly it is shared. This expanded definition, though not glamorous, is closer to the reality everyone is trying to come to terms with.
[1] Gartner Big Data Glossary – http://www.gartner.com/it-glossary/big-data/
[2] Small and Midsize Companies Look to Make Big Gains with “Big Data” – http://www.sap.com/corporate-en/news.epx?PressID=19188
[3] Untangling algorithmic illusions from reality in big data – http://radar.oreilly.com/2013/03/untangling-algorithmic-illusions-from-reality-in-big-data.html
[4] Wikipedia, Twitter – http://en.wikipedia.org/wiki/Twitter; Wikipedia, Big data (for the Facebook figure) – http://en.wikipedia.org/wiki/Big_data
[5] Space and capacity planning initiative of a large US chain – http://business.financialpost.com/2013/04/21/canadian-content-at-target-makes-for-a-small-business-opportunity/
[6] Doug Laney, Gartner’s research report, 2001 – http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf