Utilizing Hadoop via SAP HANA Smart Data Access from SAP Predictive Analytics 3.0
Another exciting new aspect of SAP Predictive Analytics 3.0 (PA) is that it formally supports SAP HANA Smart Data Access (SDA) from the data manager. So if you have data in Hadoop or SAP IQ (or any other data source supported by SDA), you can access, transform and utilize it using PA’s data manager without writing scripts or code of any kind!
Imagine a scenario where we’d like to investigate customer “churn” and build a predictive model that would allow us to identify possible future “churn” based on the customer’s profile, including how much they’ve used our call center. We might already have customer master data in SAP HANA but we’d like to supplement that with call center data that’s stored elsewhere – for example, in Hadoop.
Using the data manager in SAP Predictive Analytics 3.0 we can create data manipulations that not only access both of these data sources – but also perform transformations on them such as creating new variables, filtering, aggregation – as well as merging the two sources together so the results can be used directly by the predictive modeler.
The crucial aspect here is performance and scalability – data in Hadoop (and SAP HANA) can be huge! We’d only want to bring the minimum data necessary into PA, and we should ensure that as much processing as possible is pushed down into the source database – filtering and aggregation in particular.
So that’s what support for SAP HANA Smart Data Access in SAP Predictive Analytics 3.0 effectively means. The data manager generates optimized syntax for each data source to ensure that processing is as efficient as possible.
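PA’s data manager does this for you, so no code is required. Purely to illustrate what “pushing down” filtering and aggregation means, here is a minimal Python sketch. It is not what PA generates: an in-memory SQLite database stands in for the remote source, and the table name call_center_calls, its columns, and both function names are hypothetical. The two functions return the same per-customer complaint summary, but the pushed-down version lets the database filter and aggregate, so far fewer rows travel back to the client.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the remote source (e.g. Hadoop via SDA).
# The table and column names below are hypothetical, chosen for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_center_calls "
    "(customer_id INTEGER, duration_sec INTEGER, call_type TEXT)"
)
conn.executemany(
    "INSERT INTO call_center_calls VALUES (?, ?, ?)",
    [
        (1, 120, "complaint"),
        (1, 300, "complaint"),
        (1, 60, "billing"),
        (2, 45, "complaint"),
        (3, 200, "billing"),
    ],
)


def complaints_client_side(conn):
    """Fetch every row, then filter and aggregate locally (the pattern to avoid)."""
    rows = conn.execute(
        "SELECT customer_id, duration_sec, call_type FROM call_center_calls"
    ).fetchall()
    totals = {}
    for customer_id, duration, call_type in rows:
        if call_type == "complaint":
            count, total = totals.get(customer_id, (0, 0))
            totals[customer_id] = (count + 1, total + duration)
    summary = sorted((cid, c, t) for cid, (c, t) in totals.items())
    return len(rows), summary


def complaints_pushed_down(conn):
    """Let the source filter (WHERE) and aggregate (GROUP BY); fetch only the summary."""
    rows = conn.execute(
        "SELECT customer_id, COUNT(*), SUM(duration_sec) "
        "FROM call_center_calls "
        "WHERE call_type = ? "
        "GROUP BY customer_id "
        "ORDER BY customer_id",
        ("complaint",),
    ).fetchall()
    return len(rows), rows


if __name__ == "__main__":
    fetched_all, local_summary = complaints_client_side(conn)
    fetched_agg, pushed_summary = complaints_pushed_down(conn)
    print("rows fetched without pushdown:", fetched_all)
    print("rows fetched with pushdown:", fetched_agg)
    print("same result:", local_summary == pushed_summary)
```

With only five sample rows the difference is trivial, but the gap between "rows fetched" in the two approaches grows with the size of the source table, which is why pushdown matters for Hadoop-scale data.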
So how do you actually go about doing this? The SAP HANA Academy has produced step-by-step tutorial videos to show how you use the data manager to access, transform, and combine data from both SAP HANA and Hadoop (via SDA).
We’ve created a mini-series consisting of 4 video tutorials that take you step-by-step through the process of setup and configuration, accessing the master data in SAP HANA, transforming the call center data in Hadoop, combining the results and creating the predictive model.
This tutorial video mini-series was created using SAP Predictive Analytics 3.0 and SAP HANA SPS 11.
You can also find it in the main PA playlist: SAP BusinessObjects Predictive Analytics
If you want to learn more about the intricacies of SAP HANA Smart Data Access, watch Bob and Denys show you everything you’ll likely ever need here: SAP HANA Smart Data Access