Even though predictive analytics has been around for quite some time, interest in the topic has increased over the last couple of years. It is no longer enough for a company to accurately record what has happened. Today, an organization’s success depends on its ability to reliably predict what will happen – be it what a customer is likely to buy next, which asset could require maintenance, or the best action to take next in a business process.
Predictive analytics uses (big) data, statistical algorithms, and machine learning techniques to estimate the likelihood of future outcomes based on historical data, enabling both optimization and innovation. Existing processes can be improved, for example by forecasting sales and spikes in demand so that production planning can be adjusted in time. New insights can also uncover business opportunities or even make new products and services possible: just think of cross-selling and upselling or new as-a-service models.
One great example of this is Kaeser Kompressoren, one of the largest suppliers of air systems. The company began putting sensors on those systems and examining the data collected, thus finding a new way to generate revenue: With a better understanding of its machines and the ability to analyze them continuously, Kaeser now sells compressed air by the cubic meter, charging its customers only for the air that is used – air-as-a-service, so to speak, made possible by predictive maintenance, powered by SAP HANA.
Indeed, SAP has a strong predictive analytics and machine learning foundation with SAP Predictive Analytics and SAP HANA. In this blog post, I’d like to focus on the predictive capabilities of SAP HANA. Because it can host execution engines that operate on data in memory, SAP HANA can offer multiple predictive capabilities and run them concurrently.
Unleashing the power of predictive with SAP HANA
Not only in the Kaeser example above, but across the Internet of Things (IoT) as a whole, the capability to ingest, process, and analyze streaming data (structured and unstructured) from multiple sources in real time is key. In the Kaeser example, reacting immediately to a compressor downtime that was predicted from real-time performance data can save both time and money.
To process and analyze different formats of data in real time and with the required scalability and fault tolerance, companies require a combination of online transaction processing (OLTP) and online analytical processing (OLAP). On top of this, they need integration and analytics. SAP HANA includes both database and machine learning tools and as such can help unlock business value from IoT initiatives.
One way in which SAP HANA increases performance is by executing complex calculations – for example predictive analyses – in the database instead of on the application server. SAP HANA offers several possibilities to move application logic into the database, one of the most important being SAP HANA Application Functions.
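The idea of moving a calculation into the database can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for a database, not SAP HANA or its Application Functions, and the table `readings`, its columns, and the sample values are hypothetical. Both functions compute the same per-compressor average, but the second one lets the database do the work, so only one row per compressor is returned to the application.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with hypothetical sensor readings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (compressor_id TEXT, temperature REAL)")
    rows = [("C1", 70.0), ("C1", 74.0), ("C2", 90.0), ("C2", 94.0), ("C2", 98.0)]
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn


def mean_on_app_server(conn):
    """Pull every row to the client, then aggregate in application code."""
    totals = {}
    query = "SELECT compressor_id, temperature FROM readings"
    for compressor_id, temperature in conn.execute(query):
        running_sum, count = totals.get(compressor_id, (0.0, 0))
        totals[compressor_id] = (running_sum + temperature, count + 1)
    return {cid: s / n for cid, (s, n) in totals.items()}


def mean_in_database(conn):
    """Let the database aggregate; only one row per compressor leaves it."""
    query = (
        "SELECT compressor_id, AVG(temperature) "
        "FROM readings GROUP BY compressor_id"
    )
    return dict(conn.execute(query))


conn = build_demo_db()
app_result = mean_on_app_server(conn)
db_result = mean_in_database(conn)
```

With five rows the difference is invisible, but the amount of data transferred in the first variant grows with the table, while in the second it grows only with the number of compressors.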
“Algorithmic” predictive capabilities of SAP HANA – PAL
Within SAP HANA, such Application Functions for a particular functional area are grouped into an Application Function Library (AFL). An AFL contains pre-delivered business, predictive, and other types of algorithms written in C++ that are commonly used in projects or solutions running on SAP HANA. The predictive engines of SAP HANA are implemented on the AFL layer, which enables almost any other process to use or embed these predictive capabilities with minimal effort, because complex custom algorithms do not have to be written from scratch.
Predictive functions have been grouped together in the SAP HANA Predictive Analysis Library (PAL), which is one of the SAP HANA AFLs. PAL takes advantage of the ability of SAP HANA to host execution engines and perform local calculations in memory and in parallel. Users can perform in-database data mining and statistical calculations with excellent performance on large data sets.
PAL contains pre-built, parameter-driven algorithms primarily related to predictive analysis and data mining in many different categories like clustering, classification, and regression. From a predictive maintenance perspective, specific algorithms are included for survival analysis, probability distributions, classification and regression, cluster analysis, and time-series analysis. The library defines more than 90 functions that can be called within SAP HANA SQLScript procedures to execute analytic algorithms. In SAP HANA 2.0 SPS02, users can use SAP Web IDE for SAP HANA to model SAP HANA predictive workflows using PAL.
PAL also includes several incremental data science and machine-learning algorithms that learn and update continuously and instantaneously to enable dynamic predictions. Companies can incorporate current data into algorithms rather than periodically polling an external data source. This way, they can instantly adapt to changing conditions and behaviors.
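To make "learning and updating continuously" concrete, here is a minimal sketch of an incremental algorithm. It is not one of the PAL algorithms; it uses Welford's well-known online method for a running mean and variance, and the class name `RunningStats`, the threshold `k`, and the sensor values are assumptions made for illustration. Each new reading updates the statistics in constant time, so no history has to be stored or re-read.

```python
import math


class RunningStats:
    """Welford's online algorithm: update mean and variance one value at a time."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold a single new observation into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        """Sample standard deviation of the values seen so far."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0

    def is_outlier(self, x, k=3.0):
        """Flag x if it lies more than k standard deviations from the running mean."""
        return self.n > 1 and self.std > 0 and abs(x - self.mean) > k * self.std


stats = RunningStats()
for reading in [70.0, 71.0, 69.0, 70.5, 69.5]:  # hypothetical temperatures
    stats.update(reading)

suspicious = stats.is_outlier(95.0)
normal = stats.is_outlier(70.2)
```

Because the state is just three numbers, this style of algorithm also fits devices with limited memory, which is the same property that makes incremental methods attractive for sensor data.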
“The number of algorithms supported in PAL has been growing with every service pack of SAP HANA. In SAP HANA 2.0 SPS02, we provide state-enabled real-time prediction for many classification and regression algorithms”, says Xingtian Shi, who manages the development team for PAL. “Instead of the normal process that requires reading and parsing the model every time the prediction procedure is called, the feature in PAL allows the user to import the trained model into SAP HANA first with the prediction request directly following so it doesn’t need to repeat the model reading and parsing. This is a very beneficial feature for IoT use cases where the reaction/prediction ideally happens in real time. Also, the incremental algorithms developed in the SAP HANA smart data streaming engine are well-suited for IoT use cases. They are designed and implemented for devices with limited calculation and memory resources and run on real-time data, for example from sensors.”
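The difference between re-parsing a model on every call and keeping it loaded can be sketched in a few lines. This is a conceptual illustration only, not the PAL interface: the JSON model format, the linear scoring function, and the names `predict_stateless` and `ModelState` are all hypothetical.

```python
import json

# Hypothetical serialized model: a linear model with an intercept and two weights.
MODEL_JSON = json.dumps({"intercept": 1.0, "weights": [2.0, 0.5]})


def parse_model(serialized):
    """Turn the stored representation into an in-memory structure."""
    return json.loads(serialized)


def score(model, features):
    """Apply the linear model to one feature vector."""
    weighted = sum(w * x for w, x in zip(model["weights"], features))
    return model["intercept"] + weighted


def predict_stateless(serialized, features):
    """Read and parse the model again on every single prediction call."""
    return score(parse_model(serialized), features)


class ModelState:
    """Parse the model once, keep it in memory, and serve many predictions."""

    def __init__(self, serialized):
        self.model = parse_model(serialized)

    def predict(self, features):
        return score(self.model, features)


requests = [[1.0, 2.0], [3.0, 4.0]]
state = ModelState(MODEL_JSON)  # parsing happens here, exactly once
stateful = [state.predict(f) for f in requests]
stateless = [predict_stateless(MODEL_JSON, f) for f in requests]
```

Both variants return the same predictions; the stateful one simply pays the parsing cost once instead of per request, which matters most when models are large and requests are frequent.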
Unlocking business value from IoT initiatives
For completeness’s sake, let’s also quickly touch on another AFL – the SAP HANA Automated Predictive Library (APL), which allows the use of the data mining capabilities of the SAP Predictive Analytics automated analytics engine on customer datasets stored in SAP HANA. Other machine learning tools in SAP HANA include SAP HANA R integration and the SAP HANA External Machine Learning Library (EML), another AFL that allows integration with TensorFlow. Both show the openness of the SAP HANA platform for machine learning and IoT.
The choice between algorithmic (PAL) and automated (APL) predictive capabilities depends largely on the target users and their needs. Both libraries run natively in the AFL layer of SAP HANA and have direct access to the data. Calculations are performed within SAP HANA, and therefore no data is extracted, no external I/O load is created, and no external systems are required. APL automates the predictive analytics workflow, so users do not need to know how to build complex data models from scratch. PAL typically requires a user to write SQLScript manually for each stage of the predictive modeling workflow, which in return offers a high degree of control and precision in the modeling process.
IDC predicts that by 2019, all IoT efforts on the application layer will merge streaming analytics with AI and machine learning to increase the agility and robustness of IoT investment. By choosing SAP HANA as their predictive analytics platform, companies can unlock the business value of their IoT initiatives today while preparing for future requirements.