The recent release of SAP HANA 2.0 SPS 02 introduces a major innovation in the area of machine learning and predictive analytics.
HANA already has the Predictive Analysis Library (PAL), which provides HANA-optimized in-database training and scoring of predictive models (around 80 machine learning and statistical functions). There are also other HANA components, such as the Automated Predictive Library (APL), licensed as part of SAP BusinessObjects Predictive Analytics, which also enables in-database predictive processing in HANA.
For scenarios where neither of these approaches provides the desired function or algorithm, there’s always been R integration with SAP HANA, which has also been around for some time. Effectively you create a stored procedure in HANA that contains embedded R script, and this script, along with any input (e.g. data, parameters), is transferred to a registered R server for remote processing. Once processing is complete, any output (e.g. results) is returned and loaded into HANA tables. This means that thousands of functions, including custom functions, are available to the HANA user via open-source R.
But the world doesn’t stand still. In the meantime Google introduced TensorFlow – an extremely popular and quite fashionable open-source Machine Learning library based on connected data-flow graphs that makes heavy use of Python as a scripting language. Learn more about TensorFlow here: https://www.tensorflow.org/
Wouldn’t it be cool if you could access TensorFlow models from HANA?
HANA 2.0 SPS 02 includes the External Machine Learning library, which makes this possible. The External Machine Learning library (or EML) is packaged as an Application Function Library (AFL) component, so if you’ve used PAL in the past you’ll be pretty familiar with how to work with it. Models served by TensorFlow are registered in HANA via remote sources and then accessed through SQLScript.
At this stage it’s only possible to perform scoring on models that have already been created in TensorFlow and are being served by TensorFlow Serving. Training a TensorFlow model from HANA isn’t currently possible, but we can always dream.
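If the difference between scoring and training is new to you, here’s a minimal sketch in plain Python. To be clear, it uses neither TensorFlow nor TensorFlow Serving nor the EML: the function names and the tiny hand-picked weights are purely hypothetical stand-ins for a model that has already been trained elsewhere. The point is only that scoring is a forward pass over fixed parameters, with nothing being learned.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score(features, weights, biases):
    """Forward pass only: parameters are read, never updated (scoring, not training)."""
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probabilities = softmax(logits)
    # The predicted class is the index of the highest probability.
    predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    return predicted_class, probabilities


# Hypothetical, hand-picked parameters standing in for an already-trained model.
WEIGHTS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
BIASES = [0.0, 0.0, 0.0]

predicted, probs = score([0.1, 0.2, 0.9], WEIGHTS, BIASES)
print(predicted, [round(p, 3) for p in probs])
```

With these made-up weights the third feature dominates, so the predicted class index is 2. In the real setup the parameters live inside the served TensorFlow model and HANA simply sends input and receives the prediction.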
So now you’re raring to get started with HANA and TensorFlow?
The SAP HANA Academy has produced a short series of hands-on video tutorials to show you the ropes:
UPDATE 20th Sept. 2017: Please see my more recent blog for the latest video tutorials.
There are 5 videos covering the following topics with more to be added in the near future:
- Getting started
- Build, train and serve a model in TensorFlow
- Create Remote Source and register model
- Make predictions
As always all code snippets are available on GitHub.
The tutorials focus on the “MNIST” example, which is the “Hello World” of TensorFlow. It’s always good to show integration working with the standard example, so anyone who already knows TensorFlow will be on familiar ground. However, this does mean that the HANA SQLScript is not quite as elegant as one might like. If you want to make the HANA code cleaner (this does involve tweaking the model in TensorFlow somewhat), you’ll find detailed instructions here. Thanks Frank!
The SAP HANA Academy provides free online video tutorials for the developers, consultants, partners and customers of SAP HANA.
Topics range from practical how-to instructions on administration, data loading and modeling, and integration with other SAP solutions, to more conceptual projects to help build out new solutions using mobile applications or predictive analysis.
For the full library, see SAP HANA Academy Library – by the SAP HANA Academy
For the full list of blogs, see Blog Posts – by the SAP HANA Academy