Introducing “Project: Machine Learning in a Box”
In 2017, the very first “Machine Learning” oriented content based on the SAP Predictive Service was rolled out on the SAP Developer Center, with a dedicated series of tutorials and a brand new CodeJam topic that was delivered in more than 12 locations.
At TechEd, I wasn’t expecting the “Machine Learning” AppSpace track at the Developer Garage to be such a success.
And the feedback I got was consistent and simple: we want more!
And because every new year comes with new resolutions, new projects, new content and new ways to engage with the developer community (from the SAP ecosystem and beyond), we decided that 2018 would be no exception. So, let’s start something new!
Welcome to the first in a series of blogs about: “Project: Machine Learning in a Box”!
The goal will be to let you open the “black box” that a lot of people think Machine Learning is and help you find out what is inside. Let’s see if we can transform that “black box” into a transparent and versatile box for you to use in the future.
The other intent is to help you build your own Machine Learning “box” and run your own experiments and projects.
How Do You Start?
Before we get you started on this journey by installing products and tools, downloading data and doing some coding, we need to set the scene and define what Machine Learning is.
Over the next few weeks, we will discuss:
- Project Methodology
Running a Machine Learning project requires a methodology just like any other project. And just like with any other project, the coding/modeling represents only a small portion of the overall project duration and effort.
- Understand the different types and families of algorithms
Understand the difference between supervised and unsupervised machine learning and the associated families of algorithms (association, classification, clustering, regression, time series).
- The “Lingo” & Terminology
One of the biggest challenges when getting on board with Machine Learning is to actually understand the lingo. Very intelligent folks with PhDs tend to use a very obscure language that only they understand, so we will try to clarify some of the common terms and concepts.
- Platform and Environment
Details about the platform that will be required to run this series and the associated examples. This environment will evolve based on the needs. For example, at some point we will be using some R scripts or TensorFlow, which will require additional software to be installed. And if there are some requests, we might even look at SAP Predictive Analytics at some point.
Spoiler alert: We will be using SAP HANA, express edition. I know that I’m lucky to have a machine with 64 GB of RAM, and that it’s not the case for everyone.
But I’ll always keep this constraint in mind when choosing the examples and datasets. It will show you that you don’t need that much to run SAP HANA, express edition, and that you can still do and learn some cool stuff.
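To make the supervised versus unsupervised distinction from the list above a bit more concrete before we get to any SAP HANA library, here is a minimal sketch in plain Python. It is not SAP HANA code and not one of the APL or PAL algorithms. The function names `fit_line` and `two_means` and the tiny datasets are made up for illustration. The first function learns from labeled examples (a simple regression), while the second one groups unlabeled values (a very naive two-cluster clustering).

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised: learn y = a * x + b from labeled examples (least squares)."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b = my - a * mx
    return a, b


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled 1D values into two groups (k-means, k=2)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical, so there is only one group to find.
        return sorted(values), []
    low, high = [], []
    for _ in range(iterations):
        # Assign each value to the closest of the two current centers.
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Move each center to the mean of the values assigned to it.
        lo, hi = mean(low), mean(high)
    return sorted(low), sorted(high)


if __name__ == "__main__":
    # Supervised: we know the answer (ys) for each input (xs).
    a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print("slope:", a, "intercept:", b)

    # Unsupervised: no answers given, the algorithm finds the groups itself.
    print(two_means([1.0, 1.2, 0.8, 9.0, 9.5, 10.0]))
```

The only point here is the shape of the inputs: the supervised function needs the known answers (`ys`), the unsupervised one gets only the raw values and has to find the structure on its own.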
After covering these topics, we will dive into some cool datasets and look at some of the algorithms available out of the box within the SAP HANA libraries to address the use case:
- SAP HANA Automated Predictive Library (APL) provides a single powerful algorithm for each family of algorithms, leveraging the SAP HANA platform resources
- SAP HANA Predictive Analytics Library (PAL) provides over 90 industry standard algorithms implemented and optimized for the SAP HANA platform
We won’t limit ourselves to just the SAP HANA libraries, as we will next look at the Open Source R integration along with the External Machine Learning (EML) with TensorFlow.
Is that going to be all?
Of course not, I can be really creative! No kidding!
We will then start looking at the strategies to go live with a model, and build a few apps or extensions to demonstrate how to leverage our model results or capabilities.
But of course, based on your feedback, this can change.
The goal: Get Hands-On Experience, Share Feedback and Knowledge!
This series won’t be just about getting you familiar with the terminology, understanding some of the concepts and theories, or understanding how, why, when and where to use Machine Learning.
I mean, it will be great if you can make sense of all the lingo depicted below, but the real goal is for you to get practical experience with examples and use cases that you can try yourself. It will also be about sharing!
During this series, I will be promoting existing or new tutorials, and also referencing some existing SAP content, including openSAP courses (don’t be scared, not the whole course, just particular units), as well as interesting external content.
What skills do I need?
I don’t expect everyone to be proficient in everything. I’ll just assume that you have some basic knowledge about statistics and mathematics, in addition to some basic SQL or programming skills.
And if you feel you are missing some of these skills or want to go deeper, no worries, just ask: there is plenty of valuable content out there.
When does it start?
It has already started and here is the first piece:
There is still a lot of confusion around the difference between Data Science, Machine Learning, Deep Learning, AI, etc. This article tries to describe the main differences:
Data science produces insights.
Machine learning produces predictions.
Artificial intelligence produces actions.
Another great article also aims to describe the different types of data scientists.
How often do you plan to publish?
My target will be to publish a new piece of content on a weekly basis, and the best way to track the series is to follow me on either:
– Twitter @adadouche
– SAP Community: https://people.sap.com/abdel.dadouche
Anyway, after publishing a new piece, I will also update the previous publication with the link to the next one. So, if you are using the Follow feature with this one, you should get notified about any updates or comments.
Now, I expect you to share your thoughts,
contribute, engage with others
and share with your friends and colleagues!!
(Remember sharing && giving feedback is caring!)
UPDATE: Here are the links to all the Machine Learning in a Box weekly blogs:
- Introducing “Project: Machine Learning in a Box”
- Machine Learning in a Box (part 2) : Project Methodologies
- Recap Machine Learning in a Box (part 2) : Project Methodologies
- Machine Learning in a Box (part 3) : Algorithms Learning Styles
- Machine Learning in a Box (part 4) : Get your environment up and running
- Machine Learning in a Box (part 5) : Upload Machine Learning Datasets
- Machine Learning in a Box (part 6) : SAP HANA R Integration
- Machine Learning in a Box (part 7) : Jupyter Notebook
- Machine Learning in a Box (part 8) : SAP HANA EML and TensorFlow Integration
- Machine Learning in a Box (part 9) : Build your first Machine Learning application
- Machine Learning in a Box (part 10) : JupyterLab