Machine Learning and the Need for Scalability

Philipp_Z · 09-28-2020

Welcome to the second post in our Data Intelligence introduction series. You can find part 1 here.

In our first Data Intelligence post, we talked about why we need to pair AI and machine learning with human intelligence. In this post, we’ll take a closer look at machine learning and how Data Intelligence Cloud uses it to pull value from data.

Artificial intelligence and how machines learn

Let’s begin with the difference between AI and machine learning. AI is artificial intelligence in the broad sense: it aims to simulate human thinking and behavior, and to operate semi-independently or even completely independently.

Machine learning, meanwhile, is a subset of AI where machines continuously learn from data without the need for re-programming.

If that seems a bit blurry, think about it this way: a machine that takes data and makes independent decisions based on that data is using AI. However, if it’s not learning from its decisions, then it is not using machine learning.

If the machine is learning from its decisions, and factoring that into future decisions, then it’s utilizing machine learning as part of its AI.
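To make the distinction concrete, here is a minimal Python sketch. The class names, the threshold rule, and the step size are all made up for illustration; they are not part of any SAP product. The first class decides from data but never changes, while the second adjusts itself based on feedback about its past decisions.

```python
class FixedRuleDecider:
    """Makes decisions from data but never changes: no learning involved."""

    def __init__(self, threshold: float):
        self.threshold = threshold

    def decide(self, value: float) -> bool:
        # Hypothetical rule: answer "yes" when the value exceeds the threshold.
        return value > self.threshold


class LearningDecider(FixedRuleDecider):
    """Same rule, but the threshold moves whenever a past decision was wrong."""

    def __init__(self, threshold: float, step: float = 0.5):
        super().__init__(threshold)
        self.step = step  # assumed adjustment size per mistake

    def feedback(self, value: float, correct_answer: bool) -> None:
        decision = self.decide(value)
        if decision and not correct_answer:
            # False positive: become stricter.
            self.threshold += self.step
        elif not decision and correct_answer:
            # False negative: become more lenient.
            self.threshold -= self.step
```

If both deciders start with a threshold of 5.0 and the learning one is repeatedly told that a value of 4.0 should have been a "yes", its threshold drifts downward until it starts answering "yes" for 4.0. The fixed one keeps answering "no" forever.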

Many companies see machine learning as a tool to optimize their business. Predicting customer churn, making intelligent product recommendations, forecasting product quality and completing visual product recognition are just a few use cases for enterprise machine learning.

On the other hand, many enterprises also struggle with machine learning and its potential pitfalls. According to analysts, it takes roughly 2 months just to get a single predictive model from the research stage to production.

Why is it so hard to initiate machine learning across an organization? This usually depends on three capabilities:

Managing the data

Managing the design

Managing the deployment

Let’s focus on the second and third points while we take a closer look at how machine learning models are typically developed.

Designing and deploying machine learning

Crafting machine learning models is a highly iterative process, meaning that you have to redo small parts of it over and over again. You prepare your data, experiment on it, and then come up with an initial model version, which you then refine in further rounds.

By the time you arrive at your ideal model, you’ll most likely have generated many versions of data, experiments, and models. Keeping track of all this is a challenge. But it’s important, because transparency (which models exist in my organization?) and auditability (which models and data led to a prediction?) are key requirements for enterprise machine learning.
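As a rough illustration of what such tracking involves, here is a small Python sketch of an in-memory registry. Everything in it (the `ModelRegistry` and `ModelVersion` names, the fields, the fingerprinting scheme) is a hypothetical example and not how Data Intelligence Cloud implements it. It answers the two questions above: which models exist, and which data and parameters produced a given version.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ModelVersion:
    name: str
    version: int
    data_fingerprint: str  # short hash of the training data used
    params: dict           # experiment settings that produced this version


class ModelRegistry:
    """Toy registry: records every model version together with its lineage."""

    def __init__(self):
        self._versions = []

    def register(self, name: str, training_data: bytes, params: dict) -> ModelVersion:
        # Version numbers count up per model name, starting at 1.
        version = 1 + sum(1 for v in self._versions if v.name == name)
        fingerprint = hashlib.sha256(training_data).hexdigest()[:12]
        entry = ModelVersion(name, version, fingerprint, dict(params))
        self._versions.append(entry)
        return entry

    def list_models(self) -> list:
        # Transparency: which models exist?
        return sorted({v.name for v in self._versions})

    def lineage(self, name: str, version: int) -> ModelVersion:
        # Auditability: which data and parameters led to this version?
        for v in self._versions:
            if v.name == name and v.version == version:
                return v
        raise KeyError(f"{name} v{version} is not registered")
```

Registering the same model name twice with different data yields versions 1 and 2, each with its own data fingerprint, so a prediction made by version 2 can be traced back to the exact inputs that produced it.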

Deploying and maintaining machine learning models is just as difficult. How does one provide a model so that it can be used by applications in a scalable and secure manner? How does one monitor the quality of models and trigger a re-training if needed? How can one manage the hand-over from data scientists to IT operations and back without friction?
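To give a feel for the monitoring question, here is a minimal Python sketch of one possible approach: track accuracy over the most recent predictions and flag the model for re-training when it drops below a limit. The `QualityMonitor` class, the window size, and the accuracy limit are assumptions chosen for illustration, not an SAP API.

```python
from collections import deque


class QualityMonitor:
    """Tracks accuracy over the last `window` predictions (hypothetical sketch)."""

    def __init__(self, window: int = 100, min_accuracy: float = 0.8):
        self.outcomes = deque(maxlen=window)  # old results fall out automatically
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual) -> None:
        # Store whether the model's prediction matched what really happened.
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        if not self.outcomes:
            return None  # nothing observed yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self) -> bool:
        # Only judge once the window is full, to avoid reacting to a few samples.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy
```

In a real deployment the signal from `needs_retraining()` would kick off a training pipeline rather than just return a flag, and accuracy is only one of several quality measures a team might watch.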

SAP Data Intelligence Cloud aims to answer these questions. It’s a unified and integrated tool that covers the entire process: from connecting, understanding, and preparing data, to creating, deploying, and operating machine learning models. And best of all, it’s built from the ground up to be cloud-native, so users don’t have to worry about costly on-premises setup, storage, or maintenance.

SAP Data Intelligence Cloud covers the entire machine learning cycle end to end

Users also benefit from a deep integration into the existing SAP landscape. Systems such as SAP Business Warehouse or S/4HANA can easily be connected and embedded into data flows. This is crucial, since those often represent the IT backbone which powers the core business processes of many organizations.

Data Intelligence Cloud makes data governance easy. It also provides specific support for data engineers, data architects, data scientists and IT operations, as noted below:

Data engineers and architects can connect, discover, and process data of any type (e.g. structured, unstructured, streaming) and any volume. Furthermore, they can build visual data flows and data pipelines to orchestrate distributed data landscapes.

Data scientists can use a machine learning language or framework of their choice (e.g. Python, TensorFlow, R or HANA ML) and design models with interactive Jupyter Notebooks. They can manage all the different machine learning elements, such as data sets, pipelines, and model versions, in one central place.

IT operations can easily deploy, monitor, and re-train all models from within a unified control center at scale.

SAP Data Intelligence provides a comprehensive view over all your machine learning-related assets, such as data sets, notebooks, and models.

Enabling transparency and collaboration among these roles is one of the essential ingredients for scaling machine learning from research to production. Think of it as an assembly line: not for industrial production, but for AI and machine learning.

By supporting each role’s specialized AI-related work and at the same time linking all of it together in an integrated manner, Data Intelligence Cloud allows organizations to move from a fractured approach, often dominated by data sprawl and disparate tools, to a unified operation: one where machine learning models can be continuously repurposed and reused, which makes the whole process more efficient.

Want to read more? You can find the third post in our Data Intelligence introduction series here.