Machine learning with SAP ECC 6.0 and SAP BW 7.50
What if you currently work with SAP ECC 6.0 and you want to build a proof of concept with machine learning and be prepared to migrate your ideas to SAP HANA Predictive Analysis Library?
I started to work with SAP about 20 years ago and stepped through different stages of understanding business data in various implementation projects: master data, material management, sales and distribution, plant maintenance, demand planning, business warehouse, integrated planning, and interfaces with other satellite systems. A few years ago, machine learning started to flood the media. This caught my attention because I had seen so much data, and I was curious about the revamped age of artificial intelligence I knew from science-fiction books and movies, up to the point of singularity in Ex Machina; in reality, though, all that had worked in the past were basic statistical models. When setting expectations, take into account the divergence of goals and degrees of domain understanding, as in The Expert (Short Comedy Sketch). 🙂 In a higher dimensionality you can draw a kitten, and then with dimensionality reduction you can get a line, so that is possible, but does that abstraction really make sense? 🙂
People Tend To Overestimate What Can Be Done In One Year And To Underestimate What Can Be Done In Five Or Ten Years.
I completed various AI courses from https://open.sap.com/ and, at the same time, started to look for a second opinion on the core solutions, to break the concept in my mind into basic elements. I found out that most data science environments rely on Python and that the giants have democratized the machine learning libraries. The most notable neural-network library is TensorFlow, with Keras on top, provided by Google.
Once you understand the basics, you want to understand the scale, and because of that you will inevitably land on a Kaggle competition. I took my chances with learning from and adapting the code of the Earthquake Kaggle competition.
A Kaggle competition is a fast start and gives you the real flavor of machine learning. The Kaggle playground works out of the box, and from the community-shared exploratory data analysis you can puzzle together your own ideas. After you are convinced that machine learning has a real flavor, you start asking whether it works because of the data, or whether it might work with any data.
Supervised learning is a more commonly used form of machine learning than reinforcement learning, in part because it is faster and cheaper. With labeled data sets, a supervised learning model maps inputs to outputs to solve a regression problem or a classification problem such as image recognition, machine translation, or another category-allocation model. In supervised learning, you have input variables (x) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output. You can't create this function Y=f(X) in a deterministic way, but you can create a stochastic approximation of it with machine learning.
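As a minimal sketch of this idea, a model can learn an approximate mapping Y=f(X) from noisy examples; the linear relation and the data below are invented purely for illustration:

```python
# Learn an approximate mapping Y = f(X) from noisy synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))                  # input variable (x)
Y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 0.5, 200)      # output (Y) with noise

model = LinearRegression().fit(X, Y)                   # learn the mapping f
prediction = model.predict([[4.0]])[0]                 # stochastic estimate of f(4.0)
```

The learned function is only an estimate: with noisy data the prediction hovers near the true value (here, 3.0 * 4.0 + 5.0 = 17) without matching it exactly.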
Probably one of the most common and important areas in a company to try machine learning on is price prediction. Sales processes are probably the most detailed, with many attributes in sales orders and sales invoices. Sales reporting relies heavily on standard reports, logical databases, and profitability reports, for real-time data from ECC 6.0 or for aggregated data from SAP BW 7.50.
I liked the course SAP Getting Started with Data Science (Edition 2021) because it discloses so many of the public e-learning resources machine learning relies on, with a community you can learn from and contribute to.
The hardest part is to set the roadmap: there are so many libraries that you face the problem of over-choice. You know the quotation, "I can only show you the door, you have to walk through it." 🙂
The connect-the-dots solution to the problem of over-choice:
- The source data comes from reporting and contains categorical and numerical fields.
- The data format can be TXT, CSV, or XLS. I used XLS BEx workbooks from BW 7.50 because of the convenient data update with the refresh functionality. If you intend to update the data continually in a pipeline, you have to post the source data through an API.
- The environment for exploratory data analysis and training is anaconda.com with Jupyter Notebook.
- The main library for data manipulation is pandas.
- The library for data preprocessing is sklearn. Save the label encodings with pickle.
- The library for training is XGBoost. I like the built-in feature-importance function. Fit the model with XGBRegressor() and save the best model with pickle.
- The library for graphing is plotly. I like the intuitive construction of chart objects from dictionaries.
- The user interface is built with ipywidgets and Voila. Voila is slow and sometimes refuses to start, but the advantage is that you don't need to adapt the code. Otherwise, I would use Streamlit; it is faster.
- The library for the API that serves predictions to SAP RFC calls is FastAPI. Load the label encodings and the best model with pickle.
- The class for ABAP requests is SE24: IF_HTTP_CLIENT.
- The class for ABAP JSON for data-interchange format with FastAPI is SE24: /UI2/CL_JSON. The code I used is similar to the one from my article Application to query split VAT status.
- You can use the RFC for prediction in SAP ECC for real-time execution in custom-developed reports.
- You can use the same RFC for prediction in an SAP BW process chain, in a routine of the data transfer process.
What's the point of all these steps, apart from building your knowledge with a proof of concept? Probably going deeper into the basic elements gives you more flexibility to fill the gaps between different environments and to use various libraries and external API services, or even OpenAI Generative Pre-trained Transformer models like GPT-3. Certainly, it is helpful to have a Davinci-codex-like engine to explain ABAP code in natural language and to translate natural language into ABAP code. After all, a programming language is only a tool we use to communicate with a computer to solve a human problem, and we would like it to be as close as possible to our natural language.
How did you solve the problem of over-choice with machine learning to check out your ideas with proofs of concept?
Update on 05.11.2021
When you have a lot of incoming orders with many characteristics, you have to analyze them thoroughly and assign them to specialized internal customer representatives. Traditionally, service agents do this manually. When you have a lot of data, you can try to automate it with a machine learning classifier as a first layer of allocation, in order to increase productivity and accuracy. You can extract all your data from the SAP system, train a model with XGBClassifier(), build an API, and integrate it into the application for order allocation.
Update on 12.11.2021
Expectations of AI are important decisions on important issues. However, every important decision is the result of many micro-decisions. Micro-decisions in SAP mean the ability to incorporate machine learning into all the small functionalities: search helps, filling proposals, matching proposals, substitution proposals, validation proposals, BAPIs, and reporting.
Update on 24.11.2021
String matching can be useful in a variety of situations and can save you ample amounts of time. For instance, material master data contains different external classifications, and one has to choose, by description, the best match from thousands of rows. Simplicity is true elegance. It is worth exploring the simple-to-use libraries fuzzy-wuzzy and gensim, with a focus on unsupervised NLP models, to solve semantic text-matching requirements. In S/4HANA there are already processes leveraging machine learning, such as Goods and Invoice Receipt Reconciliation and Predictive Accounting.
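As a quick illustration of the idea, here is a best-match search over hypothetical material descriptions using Python's standard difflib; fuzzy-wuzzy's ratio() follows the same spirit with more robust token handling:

```python
# Pick the best-matching material description for a free-text query.
# The catalog entries are invented for illustration.
from difflib import SequenceMatcher

catalog = [
    "Hex bolt M8 x 40 stainless steel",
    "Washer M8 zinc plated",
    "Hex nut M8 stainless steel",
]
query = "bolt m8 40 stainless"

def score(a: str, b: str) -> float:
    """Similarity ratio in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

best = max(catalog, key=lambda desc: score(query, desc))
```

Over thousands of rows, the same max-over-scores pattern applies; a dedicated library mainly adds smarter tokenization and speed.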
Update on 06.02.2023
I came in second place 🏆 in the #SapHanaCloud #MachineLearning #Challenge. 🤖
I hope you enjoy the blog I wrote about the challenge and the solution. 🙂
Click the link to the blog https://lnkd.in/d7PfxKMH!