Taming the Big Data Beast
Your business data is seldom located in a single repository. Worse, it is increasingly stored in numerous formats, from text and spreadsheets to video and audio files, owned by multiple stakeholders, and geographically distributed. Then there’s the staggering amount of data, which IDC predicts will balloon 50% this year over 2012. That, in a nutshell, is the Big Data problem companies face today: too much information strewn about too many servers in too many formats.
Compounding the Big Data dilemma is time. To be precise, real time. Increasingly, data flowing into an organization needs to be captured, accessed, analyzed, and acted upon in real time. The business forces driving time-to-decision expectations are accelerating just as the data volume, variety, and velocity issues are increasing. For many CIOs, taming the Big Data beast is both the scariest problem they face and the biggest untapped opportunity in front of them.
The SAP Real Time Data Platform (RTDP) approach to Big Data is the most complete coverage model in the market. The SAP HANA in-memory platform is at the center of the SAP RTDP offering, and its unique smart data access capability lets enterprises create a federation of disparate data sources, delivering the highest performance possible for both analytical and transactional workloads. For example, SAP HANA smart data access means queries can be optimized for, and then executed on, the servers in the federation best suited to the task, essentially pushing processing down closer to where the data physically resides.
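To make the pushdown idea concrete, here is a minimal sketch in Python. It is not SAP HANA code and does not use any SAP API: two in-memory SQLite databases stand in for remote sources in a federation, and the names make_source, federated_total, and the events table are hypothetical, chosen only for illustration. The point it shows is that the filter and the aggregation run inside each source, so only one small partial result per source travels back to the coordinator.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database standing in for one remote source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    return conn


def federated_total(sources, region):
    """Sum amounts for one region across all sources.

    The WHERE filter and the SUM run inside each source (the pushdown),
    so the coordinator only receives one partial sum per source.
    """
    total = 0.0
    for conn in sources:
        (partial,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM events WHERE region = ?",
            (region,),
        ).fetchone()
        total += partial
    return total


# Two hypothetical sources holding different slices of the data.
sources = [
    make_source([("EMEA", 10.0), ("APJ", 5.0)]),
    make_source([("EMEA", 2.5), ("AMER", 7.0)]),
]
print(federated_total(sources, "EMEA"))
```

The alternative, pulling every row to the coordinator and filtering there, gives the same answer but moves far more data, which is exactly the cost that pushing processing toward the data is meant to avoid.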
With hundreds of SAP HANA installations already deployed, there are plenty of real-world examples of companies that are getting the upper hand on the Big Data fiend. Let me point to just one. An early adopter of SAP HANA technology in 2011 was Minneapolis-based Medtronic. The medical device manufacturer collects massive amounts of data about product reliability. Information arrives as text commentary from customer interactions and employee feedback, as well as structured database feeds from various sources such as government entities. Reports based on these different data sources are needed up and down the organization and are used to continuously improve product quality.
The rapid growth of data within Medtronic’s data warehouse was hurting reporting performance, which is why IT there turned to SAP HANA. Its success at handling the vast volumes of structured and unstructured data in the area of product reliability has inspired Medtronic to consider, among other possibilities, combining SAP ERP information with CRM data held in non-SAP sources.
As I’ve written here before, the SAP RTDP architecture is not limited to SAP-branded applications. SAP HANA smart data access lets an IT team integrate a network of large and disparate data sources, even if the information resides in databases sold by SAP competitors, and even if it resides in Hadoop.
To that end, later this month at the SAPPHIRE NOW conference in Orlando, SAP and Intel will be showcasing how Hadoop and SAP HANA technology further advance the SAP RTDP vision. As most have already discovered, Hadoop is being adopted to manage and analyze unstructured data in today’s enterprise. At SAPPHIRE you’ll get to see it and SAP HANA run in a modern 10GigE Intel server cluster environment.
The Intel Distribution for Apache Hadoop is the first Hadoop distribution that has been designed from the silicon layer on up to deliver industry-leading performance, security, and scalability. It’s a perfect fit for the SAP HANA in-memory, high-performance database, which shares the same functional elegance of well-optimized algorithms, cache efficiencies, and MPP scaling derived from the latest Intel microarchitecture.
If you are able to stop by during SAPPHIRE, you’ll also see, for example, how the integrated ETL capabilities of SAP Data Services can mine Hadoop-resident data, quickly load it into an SAP HANA database, and combine it with structured data from other sources for blazingly fast contextual analysis. It’s so comprehensive and so quick, you might begin to think Big Data is not such a monster after all.