Water, water, every where,
And all the boards did shrink;
Water, water, every where,
Nor any drop to drink.
-- “The Rime of the Ancient Mariner”, Samuel Taylor Coleridge
How many times have you read the above quote turned into “data, data everywhere; nor any drop of info to be had” or something similar? It’s pretty incredible that, for all the progress made in both hardware and software and its subsequent adoption by IT departments the world over, the common complaint from the business side of corporations is that they still lack actionable information. While I don't want to make excuses for any IT organization in that predicament, consider the following quote:
“If every image made and every word written from the earliest stirring of civilization to the year 2003 were converted to digital information, the total would come to five exabytes. An exabyte is one quintillion bytes, or one billion gigabytes—or just think of it as the number one followed by 18 zeros. That's a lot of digital data, but it's nothing compared with what happened from 2003 through 2010: We created five exabytes of digital information every two days. Get ready for what's coming: By next year, we'll be producing five exabytes every 10 minutes. How much information is that? The total for 2010 of 912 exabytes is the equivalent of 18 times the amount of information contained in all the books ever written. The world is not just changing, and the change is not just accelerating; the rate of the acceleration of change is itself accelerating”
–Michael Shermer, WSJ, 2.22.2012, book review of “Abundance: The Future Is Better than You Think”
It is unmistakable that widespread technology adoption and the digitization and subsequent automation of our world are driving that acceleration of change and generating data volumes at that level. The Era of Big Data is here, and here to stay. It is no longer just the marketers hyping the need for you to buy new solutions to deal with the challenges. It is a reality. A stock trading firm where I consulted generates almost 1 TB of raw data PER DAY with 20% growth Y/Y; comScore runs a 147 TB data warehouse and loads 150 GB of data per day; Sybase 365 processes more than 1.8 billion messages per day, which translates to roughly 1.5 TB of data, and that's only a small fraction of the global messaging traffic. Publications from the likes of the Harvard Business Review, McKinsey & Company, and PwC have written tomes about the opportunities and challenges posed by all things Big Data. To what extent your particular organization needs to deal with its implications--that's an exercise in individual introspection, isn't it?
While the challenges posed by Big Data do not rest solely with IT, it is our job as IT professionals to enable the enterprises we work in to harness the opportunities associated with Big Data. The challenges are great. There’s no doubt about that.
In this blog series, since I work in SAP’s Database & Technology Services (DTS) organization, I want to cover the task of data modeling for data big and small, plus the ecosystem around it that turns it into an operating model. But data in and of itself is boring. It's not an island either. Therefore, I also want to touch on the architectural aspects necessary to support the manipulation, collection, distribution, analysis, and use of all that data. It's a tall order, and I don't purport to know everything there is to know about the subject. I like to think of this blog series as more of a journey. For the parts of this journey I have already traveled, I would like to share some of my experiences. For the parts I have yet to travel, I invite you to join me, either to share your own experiences or as a companion with whom to explore topics together for a richer and more varied experience.
Before starting out on discussing models though, let's take a step back and think about some trends at a macro level about what's going on around us. These are some of the things that come to my mind:
Given the above as a backdrop, what must enterprise/data architects do to contend with these challenges? What are the ingredients for enabling a successful platform? What tools are available today, and what tools need to be built in order to help us with these challenges? These are the questions I want to explore from a practitioner's point of view. Analyst reports or whitepapers from marketing are fine for getting a general framing of the topic, but they are of limited use for practitioners like you and me. At the end of the day, I'm a foot soldier, a tinkerer at heart, and I get paid to deliver working systems. I also learn best by getting my hands dirty.
As a start, I don’t want to take anything for granted about who you, the reader, are or what level of “technophile” you are, so please allow me to start at the beginning.
What is a data model and why is it necessary?
Like all endeavors of human thought, a model is a means to an end, not an end in itself. To me, a model
I hope that this intro has whetted your appetite for models and that you will join me on my exploration of this subject.
P.S. Got your own "adventures in modeling" stories? It'd be great to hear the good, bad, funny, horrific, and all other types of stories. Join in!
________________________________________________
For other database-related topics, be sure to check out the Database Services Content Library.
Learn more about SAP Database Services at http://www.sap.com/dbservices.