
In 2017, the very first “Machine Learning”-oriented content based on the SAP Predictive Service was rolled out on the SAP Developer Center, with a dedicated series of tutorials and a brand-new CodeJam topic that was delivered at more than 12 locations.

At TechEd, I wasn’t expecting the “Machine Learning” AppSpace track at the Developer Garage to be such a success.

And the feedback I got was consistent and simple: we want more!

And because with every new year comes new resolutions, new projects, new content, new ways to engage with the developer community (from the SAP ecosystem and beyond), we decided that 2018 is in no way different. So, let’s start something different!


Welcome to the first in a series of blogs about:  “Project: Machine Learning in a Box”!


The goal will be to let you open the “black box” that a lot of people think Machine Learning is, and to help you find out what is inside. Let’s see if we can transform that “black box” into a transparent and versatile box for you to use in the future.

The other intent is to help you build your own Machine Learning “box” to run your experiments and projects.

How Do You Start?

Before we get you started on this journey by installing products and tools, downloading data and doing some coding, we need to set the scene and define what Machine Learning is.

Over the next few weeks, we will discuss:

  • Project Methodology

Running a Machine Learning project requires a methodology, just like any other project. And just like with any other project, the coding/modeling represents only a small portion of the overall project duration and effort.

  • Understand the different types and families of algorithms

Understand the difference between supervised and unsupervised machine learning, and the associated families of algorithms (association, classification, clustering, regression, time series).

  • The “Lingo” & Terminology

One of the biggest challenges when getting on board with Machine Learning is actually understanding the lingo. Very intelligent folks with PhDs tend to use a very obscure language that only they understand, so we will try to clarify some of the common terms and concepts.

  • Platform and Environment

Details about the platform that will be required to run this series and the associated examples. This environment will evolve based on our needs; for example, at some point we will be using some R scripts or TensorFlow, which will require additional software to be installed. And if there are some requests, we might even look at SAP Predictive Analytics at some point.

Spoiler alert: We will be using SAP HANA, express edition. I know that I’m lucky to have a machine with 64 GB of RAM, and that it’s not the case for everyone. But I’ll always keep this constraint in mind with the examples and datasets that I choose. It will show you that you don’t need that much to run SAP HANA, express edition, and that you can still do and learn some cool stuff.

And after?

After that, we will dive into some cool datasets, and look at some of the algorithms available out of the box within the SAP HANA libraries to address the use cases:

  • SAP HANA Automated Predictive Library (APL) provides a single powerful algorithm for each family of algorithms, leveraging the SAP HANA platform resources
  • SAP HANA Predictive Analysis Library (PAL) provides over 90 industry-standard algorithms implemented and optimized for the SAP HANA platform

We won’t limit ourselves to just the SAP HANA libraries, as we will next look at the open-source R integration along with the External Machine Learning (EML) library with TensorFlow.

Is that going to be all?

Of course not, I can be really creative! No kidding!

We will then start looking at strategies for going live with a model, and build a few apps or extensions to demonstrate how to leverage our model results or capabilities.

But of course, based on your feedback, this can change.

The goal: Get Hands-On Experience, Share Feedback and Knowledge!

This series won’t be just about getting you familiar with the terminology, understanding some of the concepts and theories, or understanding how, why, when and where to use Machine Learning.

I mean, it will be great if you can make sense of all the lingo depicted below, but your goal will be to get practical experience with examples and use cases for you to try. It will also be about sharing!

During this series, I will be promoting existing and new tutorials, and also referencing some existing SAP content, including openSAP courses (don’t be scared, not the whole course, just particular units), as well as interesting external content.

What skills do I need?

I don’t expect everyone to be proficient in everything. I’ll just assume that you have some basic knowledge about statistics and mathematics, in addition to some basic SQL or programming skills.

And if you feel you are missing some of these skills or want to go deeper, no worries, just ask: there is plenty of valuable content out there.

When does it start?

It has already started and here is the first piece:


The Difference Between Data Science, Machine Learning, and AI by David Robinson on DZone

There is still a lot of confusion about the difference between Data Science, Machine Learning, Deep Learning, AI, etc. This article tries to describe the main differences:

Data science produces insights.
Machine learning produces predictions.
Artificial intelligence produces actions.

Difference between Machine Learning, Data Science, AI, Deep Learning, and Statistics by Vincent Granville on Data Science Central

Another great article, which also aims to describe the different types of data scientists.

Good reading!


How often do you plan to publish?

My target will be to publish a new piece of content on a weekly basis, and the best way to track the series is to follow me on either:

– Twitter @adadouche

– SAP Community: https://people.sap.com/abdel.dadouche

Anyway, after publishing a new piece, I will also update the previous publication with the link to the next one. So, if you are using the Follow feature with this one, you should get notified about any updates or comments.

Now, I expect you to share your thoughts,
contribute, engage with others
and share with your friends and colleagues!!


(Remember sharing && giving feedback is caring!)

UPDATE: Here are the links to all the Machine Learning in a Box weekly blogs:


9 Comments


  1. Lakshmi Sankaran

    What does your data set look like, and which domain is it from? This is where I feel we need more toolkits for our customers to gain confidence in sharing data, so we can (machine) learn from it.

     

    1. Abdel DADOUCHE Post author

      Hi Lakshmi Sankaran

      There are plenty of data set repositories available online that we will be leveraging to understand interesting concepts around data, but also to experiment with algorithms.

      The intent of this blog series is not to dive deeply into a specific domain use case or data set. The main reason is that, for every use case, the data you get will differ from one customer to another, as will the definition of their problem.

      To answer your comment regarding building confidence with a customer, I can tell you that with every customer we engaged with while at KXEN, we had to demonstrate that we were working in a collaborative and iterative mode, applying a robust project methodology.

      This allowed us to build a trusted relationship and become trusted advisers on the way to implementing a “Predictive Factory”, while the customer was and remained the domain/business subject matter expert and drove the business needs.

      What I mean here is that you have to make a clear distinction between data science expertise and business domain expertise, and I know that in some (many) situations the border is thin.

      Once this trusted relationship is established, the customer becomes your best sponsor and gives you even more opportunities.

      And last but not least, when we joined SAP (from KXEN) in 2013, we identified multiple use cases for many industries and lines of business where SAP is present.
      As you are part of the SAP family, I can connect you internally with the relevant people from the Global Predictive CoE to discuss industry use cases.

      Regards

    1. Abdel DADOUCHE Post author

      Hi Former Member

      My intent with this blog series is to share with the community how to get started with Machine Learning, by first understanding some of the concepts behind it, how to run a Machine Learning project, and how to get your environment ready.

      Then we will start analyzing and playing with data and algorithms, leveraging many of the SAP HANA, express edition capabilities. This will allow us to produce insights and predictions, which we will, for example, move into the cloud, or for which we will look at how best to deploy them.

      It won’t follow the format of a formal training course like the “Getting Started with Data Science” openSAP course, which I encourage everyone to take.

      What I really expect from you is to share your thoughts, contribute and engage with others, and not just for me to have plenty of page views in the end 😉

