Simplified In-Database Machine Learning with SAP HANA
SAP HANA 2.0 SPS02 introduces new predictive and machine learning procedures
SAP HANA provides native in-memory capabilities for predictive analytics and machine learning, executing at unprecedented speeds directly inside the database. In combination with other advanced processing capabilities such as spatial, text, or graph analysis, SAP HANA inherently enables multi-modal analysis scenarios.
The SAP HANA Predictive Analysis Library (PAL) includes over 90 algorithms for scenarios such as cluster analysis, outlier detection, classification and regression analysis, association analysis, link prediction, and recommendation analysis, along with many statistical and data preparation algorithms.
PAL lets you bring predictive and machine learning algorithms to where the data is stored. By contrast, standalone, dedicated machine learning platforms require data to be copied and moved out of the database before advanced analytics can deliver any value. You can execute predictive analysis or apply trained models as part of any database transaction, co-located with the database layer of your application, such as S/4HANA, BW on HANA, or any other SAP or custom application running on SAP HANA. With PAL you can build and run smarter applications directly on SAP HANA.
The new release, SAP HANA 2.0 SPS02, includes new algorithms and enhancements to existing functions within the Predictive Analysis Library (see the updated documentation and roll-out sessions). Here I’d like to highlight some key enhancements and new capabilities.
Highlight #1: The new PAL Procedures!
While business predictive specialists and data scientists often use tools like SAP Predictive Analytics to build expert predictive scenarios with the PAL algorithms in SAP HANA, database and application developers prefer the script-based and graphical editors within the SAP Web IDE for SAP HANA. In previous releases, the script-based approach required a rather strict and tedious use of PAL functions, along with well-prepared data structures and wrapper procedures. As a result, the data scientist community has not perceived programming with PAL as short and concise.
With SAP HANA 2.0 SPS02, the script-based approach has been completely redesigned and simplified, allowing developers and data scientists working directly on SAP HANA to use the Predictive Analysis Library in a far more agile way.
- For each PAL algorithm and function, there is a new generic use-procedure provided out of the box (within the _SYS_AFL schema).
- Nothing needs to be prepared anymore: input data and parameter tables can be passed dynamically to the procedure call, and the output table structure is derived during the call without explicit definition.
- The same procedure call can be used with varying input tables or subsets of the table columns, without having to create any wrapper procedures.
The following example illustrates the simplified use of the PAL Random Decision Trees function with three different input data sets.
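As a rough illustration of this pattern, here is a minimal Python sketch that only generates the SQLScript text for three such calls; it does not connect to a database. The procedure name `_SYS_AFL.PAL_RANDOM_DECISION_TREES`, the number and order of output arguments, and all table and column names (`CUSTOMER_TRAINING`, `PAL_PARAMETER_TBL`, and so on) are assumptions made for illustration and should be checked against the PAL documentation for your release.

```python
from typing import Optional, Sequence

# Assumed name of the generic PAL procedure in the _SYS_AFL schema.
PROCEDURE = "_SYS_AFL.PAL_RANDOM_DECISION_TREES"
# Assumed output arguments: model, variable importance,
# out-of-bag error, confusion matrix.
OUTPUTS = ("lt_model", "lt_var_imp", "lt_oob_err", "lt_conf_matrix")
# Hypothetical table that already holds the algorithm parameters.
PARAM_TABLE = "PAL_PARAMETER_TBL"


def build_call(source: str,
               columns: Optional[Sequence[str]] = None,
               where: Optional[str] = None) -> str:
    """Return a SQLScript anonymous block that passes one input set to the procedure.

    Identifiers are interpolated as-is, so only use trusted names here.
    """
    column_list = ", ".join(columns) if columns else "*"
    condition = f" WHERE {where}" if where else ""
    outputs = ", ".join(OUTPUTS)
    return "\n".join([
        "DO BEGIN",
        # The input is an ordinary SELECT; no table type or wrapper is declared.
        f"  lt_data = SELECT {column_list} FROM {source}{condition};",
        f"  lt_param = SELECT * FROM {PARAM_TABLE};",
        # Output table variables are not declared in advance.
        f"  CALL {PROCEDURE}(:lt_data, :lt_param, {outputs});",
        f"  SELECT * FROM :{OUTPUTS[0]};",
        "END;",
    ])


# Three different inputs, one unchanged call template.
calls = [
    build_call("CUSTOMER_TRAINING"),
    build_call("CUSTOMER_TRAINING",
               columns=["CUSTOMER_ID", "AGE", "INCOME", "CHURNED"]),
    build_call("CUSTOMER_TRAINING", where="REGION = 'EMEA'"),
]

for sql in calls:
    print(sql, end="\n\n")
```

Only the `SELECT` that fills `lt_data` differs between the three generated blocks; the `CALL` line stays identical, which is the point of the generic procedures described above.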
The new PAL procedures allow for a far more agile use of SAP HANA’s native predictive and machine learning capabilities, drastically speeding up the development of predictive scenarios. Developers and data scientists can now more easily apply their typical explorative and experimental analysis approach directly within the database, applying their expertise at the core of the SAP HANA platform to build smarter applications.
For further details see the new PAL Procedures section within the SAP HANA Predictive Analysis Library documentation.
Highlight #2: The new SAP HANA and Google TensorFlow integration
Google TensorFlow is a very popular and rapidly growing framework, open sourced by Google, for machine learning data flows. TensorFlow supports distributed processing and hardware acceleration on specialized processing units. As a result, it has attracted particular interest for deep learning scenarios such as image, sound, and video processing for classification, recognition, or object detection.
SAP HANA’s capability to integrate with external machine learning frameworks such as R and SAS is now being enriched by a Google TensorFlow integration. With that you can build even smarter SAP HANA applications, for example by leveraging TensorFlow’s deep learning capabilities for image classification.
For a good use case and demo example, see the recording of this year’s SAPPHIRE NOW live talk by Richard Pledereder and Dimitrios Agagiotis, “Improve Data Insight with In-Memory Machine Learning and Advanced Analytics”. It combines a shirt-image classification using the TensorFlow integration with a personalized product offer recommendation based on SAP HANA PAL and spatial processing, using the classification result as input.
We will share a more elaborate example of the SAP HANA TensorFlow integration in a dedicated blog to follow soon. In the meantime, the SAP HANA Academy channel on YouTube provides a new video series on the topic, put together by Philip Mugglestone: the SAP HANA External Machine Learning Library playlist. To learn more about the technical details, see the SAP HANA External Machine Learning Library documentation.
Where you can learn more on the recent enhancements
We presented an overview of all the predictive and machine learning enhancements in the SAP HANA 2.0 SPS02 enablement call for the Advanced Analyst. To register for the call on July 31st or watch the recording, see the details on the session “What’s New in SAP HANA 2.0 SPS 02 – Advanced Analyst” within the central announcement blog here.
Of course, you can also explore the new capabilities hands-on with the new SAP HANA, express edition here (once available).
Finally, I’d like to invite you to chat with us and our experts in person if you plan to attend SAP TechEd in Las Vegas, Bangalore, or Barcelona this year. Please join us for the following sessions, where we will practice and demonstrate the use of the new capabilities:
| Session ID | Format | Title |
|---|---|---|
| HBD620 | CodeJam | Develop Smart Applications Leveraging Predictive Capabilities in SAP HANA |
| HBD117 | Lecture Session | Deep Learning Using SAP HANA and TensorFlow |
| HBD831 | Road Map Session | Machine Learning in SAP HANA |
| HBD862 | Road Map Session | R and Predictive Analysis Library in SAP HANA |
| HBD173 | Hands-On Workshop | Processing Services and Special Data Types in SAP HANA |
Explore these exciting new enhancements for predictive and machine learning in SAP HANA 2.0 SPS02, and learn how to get more out of your business application data and external data using SAP HANA’s advanced analytic processing capabilities to deliver unprecedented insights.
Share your feedback on the new capabilities with us! Can’t wait to see you all at SAP TechEd!
This is actually great. Integration with Tensorflow will really help businesses leverage the true power of Machine / Deep learning. Just hoping that the serving part gets easier as it looks slightly complex right now.
My previous blog on SAP space: https://blogs.sap.com/2017/06/18/image-recognition-in-sap-ui5-with-web-cam-using-tensorflow-over-google-ml-engine-for-deep-learning actually shows a use case built with Tensorflow for Image recognition.
Any reason why APL cannot be used the same way as PAL?
Hi Clemens, thanks for your comment. APL will likely adopt the new simplified procedure interface. best regards,
Here is the link to a more detailed blog on the SAP HANA and Google TensorFlow integration: https://blogs.sap.com/2017/08/01/implementing-the-mnist-classification-problem-the-hello-world-of-ml-with-sap-hana-and-the-afl-eml-using-googles-tensorflow/
Actually, APL does provide simplified procedures: no need to define table types for input data, no need to define a procedure wrapper. This is called the "procedure technique", and it has been available for several releases of APL.
Very excited to see how the Google TensorFlow integration works together with SAP HANA.
Thanks a lot for this great blog post. It contains a lot of useful information.
Thank you for writing such a great blog. Just a few questions. Consider a business environment where data flows to SAP HANA via interfaces such as SLT or DRFOUT. Can we create models in HANA itself, or do we need Google TensorFlow to create models and run ML algorithms in HANA?
For image and text processing, are there any other systems that could be integrated with SAP HANA apart from Google TensorFlow?
Thanks in advance
For text data, we can certainly also use the HANA Text Analysis and Text Mining capabilities, depending on the use case. Similarly, if you extract features from an image outside of HANA, the analysis of the extracted features could happen in HANA again.
The R integration could also help, as could other external ML applications. For example, SAS integrates with SAP HANA as well.
Best regards, Christoph
Thank you Christoph..