Predictive for ALL
Last month, Karl von Beckmann gave an excellent overview of the need for Predictive Analysis in the January edition of the ACT newsletter, which I encourage you to read. This week, I want to go into more detail on why SAP Predictive Analysis represents a true step forward in predictive analytics.
SAP Predictive Analysis uses a “GUI abstraction layer” that takes away the pain and complexity of writing scripts. This means that a business person who wants to generate a predictive query doesn’t need to know “R”, the sophisticated statistical programming language, to perform predictive analysis. As Karl pointed out last month, a business person can now easily segment their customer base (e.g. Who’s shopping online? Who are my return customers? What are their top characteristics?) without having to rely on statisticians, who are typically overburdened. This frees up the statisticians to focus on the most technical analyses, the ones that could have a huge impact on the overall business. It also enables the business person to take a deeper look at their data and test their theories scientifically, rather than making multimillion-dollar “gut decisions”. OK, they may not all be multimillion-dollar decisions, but many business people have certainly had sleepless nights over a decision made with very thin data. SAP Predictive Analysis aims to back up the “gut decision” with statistical direction, making decisions simpler, faster and more accurate.
Why HANA could be your favorite four-letter word for Predictive Analysis
HANA is SAP’s revolutionary in-memory database, which is transforming what is possible in the world of enterprise computing. Think of when Apple Inc. moved the iPod to flash memory. After the Walkman, the portable CD player and the early hard-drive players came a device with no moving parts, storing everything in flash memory. This is what SAP has brought to enterprise databases.
In addition to being a quantum leap forward in data processing speed and efficiency, HANA has the ability to revolutionize the way in which predictive analysis is performed. In order to understand how, one must first have a cursory understanding of what makes predictive analysis unique.
Most predictive analysts use an open-source programming language known as “R”. With “R” you are able to create any algorithm your heart desires. Currently, there are over 4,000 algorithms in the R library, ranging from basic linear regression to advanced models for calculating the effects of the Large Hadron Collider at CERN, on the border between Switzerland and France. There is virtually no limit to what you can do with the R language. If you can think it, you can script it.
The great benefit of HANA in all of this is speed. The HANA architecture allows you to create and train your models on the same platform that will execute them. No more extracting data from a database into an external statistical modeling tool, creating the model, and then publishing the model back to the database. Also, the faster you can train the models, the more modeling cycles you can run in the time available, and the more chances you have to refine their accuracy. (For a more detailed discussion on how HANA can use the R models, join the conversation: https://community.wdf.sap.corp/sbs/thread/120545)
Perhaps an example will illustrate the point. Our partner, Cognilytics, once had a dataset that was 10.5 million rows deep and 50 fields wide. They ran a clustering analysis (K-means, if you care), which identifies any naturally occurring groups (e.g. people who shop on weekends vs. people who shop during the week). This was taking over 70 hours to perform with a competitor’s product. Keep in mind that K-means usually takes 100 passes through the data, and 10.5 million rows times 50 fields times 100 passes comes to roughly 50 billion calculations. They were able to optimize the algorithm and get the run time down to 40 hours or so. But when they ran the same data set in HANA with SAP Predictive Analysis as the front end, the analysis took 2 minutes. Actually, it was 90 seconds, but who’s counting?
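For readers who want to see what a K-means pass actually does, here is a minimal sketch in plain Python. It is not how HANA, SAP Predictive Analysis or the competitor’s product implements clustering; it is simply the textbook assign-then-update loop. The function names (kmeans, estimated_operations, speedup), the six-row toy dataset and the two made-up fields (average shopping day, average basket value) are all assumptions for illustration. The operation count is a rough rows × fields × passes estimate, which is how the “roughly 50 billion” figure above can be reached.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, passes=100, seed=0):
    """Textbook K-means: repeat 'assign to nearest centroid, then re-average'.

    Stops early once the assignments no longer change between passes.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Start from k randomly chosen rows of the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(passes):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: another pass would change nothing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def estimated_operations(rows, fields, passes):
    """Rough work estimate: every field of every row is touched once per pass."""
    return rows * fields * passes


def speedup(before_seconds, after_seconds):
    """How many times faster the second run is than the first."""
    return before_seconds / after_seconds


if __name__ == "__main__":
    # Hypothetical shoppers: (average shopping day 1-7, average basket value).
    shoppers = [
        (1.0, 20.0), (2.0, 22.0), (1.5, 19.0),   # weekday-style shoppers
        (6.0, 80.0), (6.5, 85.0), (7.0, 78.0),   # weekend-style shoppers
    ]
    labels, centroids = kmeans(shoppers, k=2)
    print("cluster labels:", labels)
    print("centroids:", centroids)
    print("estimated operations:", estimated_operations(10_500_000, 50, 100))
    print("speedup, 70 hours vs 90 seconds:", speedup(70 * 3600, 90))
```

On the toy data the two groups are far enough apart that the loop settles within a few passes. On 10.5 million rows every pass has to touch the whole dataset, which is why the time per pass matters so much. Taking the figures quoted above at face value, 70 hours down to 90 seconds is a speedup of about 2,800 times.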
Consider the analogy of chocolate and peanut butter to describe SAP Predictive Analysis and HANA. Both are great on their own, but when you combine them they’re fantastic. The speed of HANA is truly wonderful, and a fantastic enabler for Big Data decision making. SAP Predictive Analysis democratizes the world of statistics and enables a business audience to build robust statistical models without having to learn a scripting language or earn a PhD in mathematics. Put these technologies together and you have a transformative technology that is revolutionizing business. And best of all, it’s calorie- and guilt-free!