This was an ASUG webcast last month provided by the BITI Dev/Tech Special Interest Group.  What is machine learning, and what is SAP HANA Machine Learning?

Figure 1: Source: SAP

Computers learn from data without being explicitly programmed

Today, you have to recode the application

In ML, the algorithm encompasses decision making and prediction – decoupling the logic from the algorithm

Increase the flexibility of applications

Computers learn from data

Input data, train model, test that model is valid

As new data comes in, retrain the model to take advantage of it

Model becomes more robust

Three phases – input, machine learning, output

Input could be text or images; cleanse the data, then build the model, using a subset of the data to train it

Embed model in applications
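To make the input / train / test / retrain cycle concrete, here is a minimal sketch in plain Python. It is not HANA code: the "model" is just a threshold learned from labeled numbers, and the data and function names are made up for illustration.

```python
import random

def train(rows):
    """Learn a threshold: the midpoint between the mean of each class."""
    pos = [x for x, label in rows if label == 1]
    neg = [x for x, label in rows if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    return 1 if x >= threshold else 0

def accuracy(threshold, rows):
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)

def make_rows(rng, count):
    """Illustrative data: (value, label) pairs, label is 1 when the value is above 50."""
    return [(x, 1 if x > 50 else 0) for x in (rng.uniform(0, 100) for _ in range(count))]

rng = random.Random(0)
data = make_rows(rng, 200)

# Input -> split -> train -> test.
rng.shuffle(data)
train_rows, test_rows = data[:150], data[150:]
model = train(train_rows)
print("threshold:", round(model, 1), "test accuracy:", accuracy(model, test_rows))

# As new data comes in, retrain on the combined set.
new_rows = make_rows(rng, 100)
model = train(train_rows + new_rows)
print("retrained threshold:", round(model, 1))
```

The application only calls `predict`; when the data changes you retrain instead of recoding.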

Figure 2: Source: SAP

Why now?  Explosion of data, coming from a variety of sources; companies collect data for competitive advantage

Look at past transactions to see what people have been buying

Increase in processing power (such as SAP HANA, PAL)

Building and deploying applications has become easier due to a large set of algorithms for numerical data, non-numerical data, and deep learning

Integrate with applications

Model management capabilities

Figure 3: Source: SAP

Figure 3 shows the end-to-end machine learning process in HANA

Start with data ingestion – HANA can load streaming data

Next is data exploration – how does the data look?  Is any data missing?  There are tools to help

Feature engineering – transform input

Once the data is ready, split it into two groups – one for training, the other for testing and validating the ML algorithm

Store model in HANA

Models need to be deployed in a variety of different ways – SQL or as a service

Scoring or prediction – real time, execute models quickly

Model management – ensure you can retrain models on an event-driven or periodic basis

Figure 4: Source: SAP

HANA offers comprehensive machine learning capabilities

Integrate with 3rd party libraries (R)

High performance in scoring

Ready for developers using HANA Express

Figure 5: Source: SAP

Figure 5 shows the scenarios addressed by PAL

Which applicants are suitable for a credit card is a classification scenario – inputs are credit score, income level, geography; use a decision tree to predict which ones will default

Build the model on historic data, then apply it to new applicants to predict whether they will default and decide whether to extend credit
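As a rough illustration of the credit scenario, the sketch below trains a one-level decision tree (a "stump") on made-up history and scores two new applicants. It is plain Python, not PAL; the data, feature choice, and function names are assumptions for the example, and a real decision tree would split recursively.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split(rows, feature, threshold):
    """Labels of rows below the threshold, and labels of rows at or above it."""
    left = [label for features, label in rows if features[feature] < threshold]
    right = [label for features, label in rows if features[feature] >= threshold]
    return left, right

def majority(labels):
    return 1 if sum(labels) * 2 >= len(labels) else 0

def train_stump(rows, n_features):
    """Pick the (feature, threshold) with the lowest weighted Gini impurity."""
    best = None
    for f in range(n_features):
        for threshold in sorted({features[f] for features, _ in rows}):
            left, right = split(rows, f, threshold)
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    _, f, threshold = best
    left, right = split(rows, f, threshold)
    return {"feature": f, "threshold": threshold,
            "left": majority(left), "right": majority(right)}

def predict(stump, features):
    side = "left" if features[stump["feature"]] < stump["threshold"] else "right"
    return stump[side]

# Made-up history: (credit_score, income_in_thousands) -> 1 = defaulted, 0 = repaid.
history = [
    ((580, 30), 1), ((600, 45), 1), ((620, 28), 1), ((640, 60), 1),
    ((690, 35), 0), ((710, 52), 0), ((740, 80), 0), ((780, 40), 0),
]
stump = train_stump(history, n_features=2)
print(stump)
print(predict(stump, (610, 70)), predict(stump, (750, 30)))
```

On this toy data the stump splits on credit score, so the first applicant is predicted to default and the second is not.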

Regression model for predicting house prices: build the model, then retrain it periodically or on a trigger so that its predictions stay accurate
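The house price scenario could look something like this in plain Python: fit a least-squares line, then retrain when the error on newly arrived sales gets too large. The data, the single "area" feature, and the error limit are all assumptions for the sketch; this is not how PAL is called.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Made-up data: floor area in square metres, price in thousands.
areas = [50, 70, 90, 110, 130]
prices = [150, 200, 260, 310, 360]
model = fit_line(areas, prices)

# New sales arrive; retrain only if the error on them exceeds a chosen limit.
new_areas, new_prices = [60, 100, 140], [190, 300, 410]
ERROR_LIMIT = 10  # arbitrary limit for this sketch
if mean_abs_error(model, new_areas, new_prices) > ERROR_LIMIT:
    model = fit_line(areas + new_areas, prices + new_prices)
print(model)
```

The retrain check is the "trigger" part; a periodic retrain would simply run `fit_line` on a schedule instead.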

Group customers into logical segments so you can run marketing programs against each group – clustering, for example k-means
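For the customer grouping scenario, here is a bare-bones k-means in plain Python. The customer attributes are invented, and starting the centroids at the first k points is a simplification chosen to keep the sketch deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; centroids start at the first k points."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up customers: (orders per year, average basket value).
customers = [(2, 20), (40, 150), (3, 25), (38, 160), (1, 18), (42, 155)]
centroids, clusters = kmeans(customers, k=2)
print(centroids)
print(clusters)
```

Each resulting cluster is a candidate segment for its own marketing program.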

Analyze customer transaction data – for customers who bought milk, what else did they buy? – using sequential pattern mining
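A much simplified stand-in for sequential pattern mining is sketched below: it only looks for ordered pairs ("bought A, later bought B") that occur for enough customers. Real algorithms find longer patterns; the purchase histories and the `min_support` value are made up.

```python
from collections import Counter

def frequent_pairs(sequences, min_support):
    """Count ordered pairs (a, then later b) once per customer; keep the frequent ones."""
    counts = Counter()
    for seq in sequences:
        seen_pairs = set()
        for i, first in enumerate(seq):
            for later in seq[i + 1:]:
                if first != later:
                    seen_pairs.add((first, later))
        counts.update(seen_pairs)
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Made-up purchase histories, each in time order.
histories = [
    ["milk", "bread", "butter"],
    ["milk", "butter"],
    ["bread", "milk", "cereal"],
    ["milk", "cereal", "butter"],
]
patterns = frequent_pairs(histories, min_support=3)
print(patterns)
```

Here the only pair that reaches the support threshold is milk followed by butter.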

Over 90 algorithms in PAL (shown on the right of Figure 5)

Figure 6: Source: SAP

 

The algorithms are typically used by data scientists, who get fine-grained control

Each release of SAP HANA continues to add new algorithms

Figure 7: Source: SAP

APL (Automated Predictive Library) provides a higher level of abstraction

Can be used by business analysts

Library embedded in SAP HANA

The user doesn’t have to select the inputs or the classification algorithm

Takes a set of inputs, derives variables from them, and tunes the forecast for accuracy

Complementary to PAL

You could use both PAL and APL

Figure 8: Source: SAP

Figure 8 shows that you can classify documents: load them into HANA, run text mining to build a term-document matrix, then, given a new document, run the classification algorithm
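The document classification flow can be sketched in a few lines of plain Python: build term counts per document, sum them into one profile per class, and assign a new document to the most similar profile. Cosine similarity against class profiles is a technique swapped in for illustration; the webcast did not say which classifier HANA uses, and the documents and labels are invented.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts for one document (one row of a term-document matrix)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(labeled_docs):
    """Sum the term vectors of each class into one profile per label."""
    profiles = {}
    for text, label in labeled_docs:
        profiles.setdefault(label, Counter()).update(term_vector(text))
    return profiles

def classify(profiles, text):
    vec = term_vector(text)
    return max(profiles, key=lambda label: cosine(profiles[label], vec))

docs = [
    ("invoice payment overdue amount", "finance"),
    ("payment received invoice closed", "finance"),
    ("server outage network down", "it"),
    ("network latency server restart", "it"),
]
profiles = train(docs)
print(classify(profiles, "overdue invoice reminder"))
```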

Figure 9: Source: SAP

Figure 9 shows business forecasting: predicting future inventory levels, sales, and consumption

HANA can store and process time series

Provide algorithms such as exponential smoothing
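For reference, single exponential smoothing is simple enough to write out. This is a plain Python sketch of the textbook formula, not the HANA implementation; the sales figures and `alpha` are made up.

```python
def exponential_smoothing(series, alpha):
    """Single exponential smoothing: s[0] = x[0], s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

monthly_sales = [100, 120, 110, 130]
smoothed = exponential_smoothing(monthly_sales, alpha=0.5)
forecast = smoothed[-1]  # the last smoothed value serves as the one-step-ahead forecast
print(smoothed, forecast)
```

A higher `alpha` reacts faster to recent values; a lower one smooths more.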

Business function library contains algorithms for business processing

Figure 10: Source: SAP

There is a growing need to deal with event streaming; machines have sensors, and you want to analyze the information as it comes in

High likelihood of failure?  Take corrective action

Smart data streaming is a component in SAP HANA

Streaming engine – define queries that run continuously

SAP HANA supports incremental machine learning as data comes in, and prediction on event streams, such as scoring decision trees in real time
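To give a feel for incremental learning on an event stream, the sketch below keeps a running mean and variance of sensor readings (Welford's online algorithm) and flags readings that are far from what it has seen so far. This is a generic technique picked for illustration, not smart data streaming code; the readings, `z_limit`, and `warmup` are assumptions.

```python
import math

class RunningStats:
    """Welford's online algorithm: update mean and variance one reading at a time."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

def monitor(readings, z_limit=3.0, warmup=5):
    """Score each reading against the model so far; learn only from normal readings."""
    stats, alerts = RunningStats(), []
    for i, x in enumerate(readings):
        std = stats.std()
        if stats.n >= warmup and std > 0 and abs(x - stats.mean) > z_limit * std:
            alerts.append(i)  # candidate for corrective action
        else:
            stats.update(x)
    return alerts

# Made-up sensor temperatures with one spike.
temperatures = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 95.0, 70.1]
print(monitor(temperatures))
```

The model is never retrained from scratch; every event both gets scored and, if normal, updates the statistics.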

Figure 11: Source: SAP

R is a popular 3rd party language, offering a variety of different packages

A stored procedure can contain R code

The R server runs on a separate node and processes the code

Results are returned to the stored procedure

Figure 12: Source: SAP

High performance – quickly take model to production to be competitive

Iterations – need to be done quickly

Once the model is built, execute it quickly

Use models for predicting in real time

The HANA approach is threefold

Push processing logic down to the database

Training algorithms take advantage of multiple cores

Focus on multi-node architecture for parallelization

Figure 13: Source: SAP

You can use SAP Predictive Analytics

Data scientists can use the expert mode and store models in Predictive Factory, where models can be retrained

Business analysts use the automated predictive modeler

Figure 14: Source: SAP

On the HANA side you have PAL and R

HANA studio supports

You can write SQLScript to invoke PAL for training and scoring

Figure 15: Source: SAP

Figure 15 talks about embedding the predictive models in applications

Figure 16: Source: SAP

The webcast ended with SAP saying you can start developing with this today using HANA Express

Recording link

Related

Video from SAP about machine learning and why you should come to SAP TechEd

 

5 Comments

  1. Prasenjit Singh Bist

    Simply putting up a blog and a TechEd session won’t help. If SAP wants to empower us, the real people who would work with these technologies, then put it on openSAP with proper SAP dev tools

    1. Tammy Powlas Post author

      Hi – I don’t work for SAP – and many tools are available now at the Cloud Appliance Library – cal.sap.com for you to try yourself

    1. Tammy Powlas Post author

      I’m sorry for the delay – I updated the blog for the link – you will need to register to view it.  It is after Figure 16.
