With little fanfare, SAP announced last week that it will resell the Intel Distribution for Apache Hadoop as well as the Hortonworks Data Platform. While hardly page-one news for the major media, it is big news for enterprises seeking to exploit their big data opportunities. Let me explain.
First, big data, particularly the unstructured data that comprises an estimated 80% of the information inside most organizations, presents new IT challenges that overwhelm the vast majority of the traditional, row-based data warehouses installed today. Attempting to store and analyze big data effectively in these established DWs is, for the most part, a lost cause.
Hadoop, which at its core is an efficient distributed file system rather than a database, is designed specifically for information that comes in many forms, such as server log files or personal productivity documents. Anything that can be stored as a file can be placed in a Hadoop repository.
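To make that concrete, here is what loading files into Hadoop looks like in practice, a brief sketch using the standard `hadoop fs` command-line tool; it assumes a running Hadoop cluster, and the file names and `/data` paths are purely illustrative:

```shell
# Create target directories in the Hadoop file system (HDFS).
hadoop fs -mkdir -p /data/logs /data/docs

# Copy a server log and an office document into HDFS --
# any file type works, with no schema to define up front.
hadoop fs -put server-2013-06-01.log /data/logs/
hadoop fs -put quarterly-report.docx /data/docs/

# Confirm the files landed in the repository.
hadoop fs -ls /data/logs
```

Because there is no schema to design first, teams can start capturing raw data of unknown value immediately and decide how to analyze it later.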
Hadoop is also a young technology, not even a decade old. That's why a mere 10% of respondents in a recent TDWI survey of enterprises said they currently run it in their organizations. Most companies aren't yet sure what other technologies Hadoop needs to be an effective tool in their data centers.
Which brings me to my second reason why SAP's reseller agreements with Intel and Hortonworks are, well, a big deal for big data. Big data has swamped most large enterprises, presenting CIOs with two pressing problems. The first is how to store dramatically increasing volumes of information of unknown value. That one can be solved with Hadoop, a proven, low-cost method for storing petabytes of data. The second is how to analyze it: once stored, moving the data from Hadoop into a traditional data warehouse can take weeks of processing before it's ready for analysis. And even then, because the volume is so vast, analysts often have to strip out much of the detail so as not to bring the old DW to a grinding halt. Because of the integration work we've done between Hadoop and the SAP HANA platform, all the data that analysts need can move seamlessly between the two systems when they want it, not after weeks of processing. By combining the all-in-memory, columnar SAP HANA database with Hadoop in the data center, CIOs can deliver infinite storage with instant insights.
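To see why a columnar, in-memory layout suits this kind of analysis, here is a minimal sketch in plain Python (an illustration of the general technique, not of SAP HANA's internals): aggregating one field in a column store reads a single contiguous array, while a row store must walk every record and pick the field out of each one.

```python
# The same sales data in two layouts (values are made up for illustration).

# Row-oriented layout: one record per transaction, as a row store keeps it.
rows = [
    {"region": "EMEA", "product": "A", "revenue": 120.0},
    {"region": "APJ",  "product": "B", "revenue": 200.0},
    {"region": "EMEA", "product": "B", "revenue":  80.0},
]

# Column-oriented layout: one array per field, as a column store keeps it.
columns = {
    "region":  ["EMEA", "APJ", "EMEA"],
    "product": ["A", "B", "B"],
    "revenue": [120.0, 200.0, 80.0],
}

# Row store: summing revenue touches every record and every field of it.
total_row_store = sum(r["revenue"] for r in rows)

# Column store: the aggregate scans one contiguous array and nothing else,
# which is why in-memory columnar scans over billions of values stay fast.
total_column_store = sum(columns["revenue"])

assert total_row_store == total_column_store == 400.0
```

The arithmetic is identical either way; the difference is how much memory the query has to touch, which is what makes detailed, full-resolution analysis practical instead of forcing analysts to pre-aggregate.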
Finally, there's one more subtle benefit of our announcement for the Hadoop community and for IT. To succeed in the marketplace, emerging enterprise systems like Hadoop need established vendors to embrace them; otherwise most CIOs will not deploy them in their data centers. With SAP fully committed to Hadoop through these reselling agreements, CIOs know they can embark on cutting-edge big data initiatives with state-of-the-art technologies that are fully supported by a single, trusted vendor. With these partnerships, we minimize a CIO's risk. We eliminate the question of whom to call for Hadoop support. It's SAP.
In addition to lowering risk, we offer enterprises choice. Being open source, Hadoop has many distributions to choose from; one might say too many. By fully supporting the Intel and Hortonworks distributions, we have done the hard work of determining the best enterprise-class versions of Hadoop for your data center, while you can still choose between them based on your organization's needs.
At the same time, SAP has integrated Hadoop with the SAP HANA platform, SAP Sybase IQ software, SAP Data Services, and SAP BusinessObjects solutions, making it possible to conduct sophisticated OLTP and OLAP operations on both structured and unstructured data.
Sometimes it's the little things that make life easier for IT managers. That's why this modest announcement is such a big thing. By reselling the Intel and Hortonworks distributions of Hadoop, we provide a single point of contact for a complete technology stack that delivers the best performance, so every enterprise can put all of its big data to work to improve its business.