Machine Learning Thursdays: Predictive Models as a Service Versus Training as a Service Part One
Machine Learning is on every CXO's mind right now. We've all heard about and tested many use cases for machine learning, spanning various domains. In this blog I would like to focus on one point: we are seeing the emergence of two distinct categories of machine learning solutions, depending on the type of problem that needs to be solved.
This week and next, I will describe these two approaches in detail. I call them "Macro Modeling/Predictive Models as a Service" and "Micro Modeling/Training as a Service." In addition, I'll highlight some of the challenges that technology executives need to be aware of when making investment decisions. (Today I'll cover the first of these approaches, and next week I'll dig deeper into the second.)
Predictive Models as a Service—Macro Modeling
When talking about machine learning, some obvious use cases that come to mind are autonomous vehicles and machine learning-powered translation systems. These could be described as general-purpose systems powered by predictive or machine learning techniques. Let's see how these are generated and consumed.
Challenge Number 1
First, we're talking about one system that has been trained on a very large corpus of data in order to produce the proper results in many different situations. For example, Toyota has said that it will need 8.8 billion test miles to achieve a safe autonomous car. The same is true for image recognition, where public image data sets contain more than 100 million images (and the true internal datasets used by Yahoo and the like are much bigger than that).
Building a general-purpose predictive model thus requires a gigantic amount of data, and data that you have the right to use for this purpose.
Challenge Number 2
How many intelligent general-purpose systems do we need? Today, many teams are focused on autonomous vehicles, image classification, or translation systems, but how many of these systems do we need on the planet? If a system for driving autonomous vehicles is good enough to beat the competition, you can expect roughly ten such systems to equip all the cars on the planet. We expect these systems to work well in cities, in the countryside, during day and night, and so on. Producers will compete on the price and reliability of the sensors.
The same is true for translation systems, and even image recognition systems. This is a winner-take-all market. If we push it to the limit: how many true artificial intelligence systems do we need on Earth?
Challenge Number 3
As always with predictive and machine learning systems, this is almost never a "fire and forget" activity. Your systems need to be continuously updated as new data comes in or specific rare situations occur, which means you need to connect them to continuous feeds of data collection for continuous updates and monitoring.
This continuous improvement has cost impacts, of course, which will push the need for continuous or incremental learning. This, in turn, can also be used to go from general-purpose predictive models to specific models for specific contexts.
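As a sketch of what incremental learning looks like in practice, the snippet below updates a model batch by batch with scikit-learn's `partial_fit` instead of retraining from scratch on the full corpus. The data feed and its labeling rule are hypothetical stand-ins for a real data collection pipeline.

```python
# Minimal sketch of incremental (online) learning, assuming mini-batches
# arrive from a continuous data feed. The labeling rule is hypothetical.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def data_batches(n_batches=5, batch_size=200, n_features=10):
    """Simulate a stream of labeled mini-batches."""
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, n_features))
        # Toy ground truth: the class depends on the first two features.
        y = (X[:, 0] + X[:, 1] > 0).astype(int)
        yield X, y

model = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # all classes must be declared up front

for X, y in data_batches():
    # Update the existing model with the new batch rather than
    # refitting on everything seen so far.
    model.partial_fit(X, y, classes=classes)

# Evaluate on fresh, unseen data from the same toy distribution.
X_test = rng.normal(size=(500, 10))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)
acc = model.score(X_test, y_test)
print(round(acc, 2))
```

The key design point is that `partial_fit` keeps the model's learned weights between calls, so each update costs only one batch of computation rather than a full retraining pass.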
Of course, the portability of these systems is also important. It's nice to have them available as REST APIs in the cloud, but this means they will be available only in connected environments, which covers a lot of use cases but not all of them. Typically, an autonomous car must be able to run even if there is no connection.
For models shared through services, speed and concurrency are very important, as are the security and privacy of the data exchanged. These are technical challenges that have been solved in the SAP Leonardo machine learning foundation, based on SAP Cloud Platform with Cloud Foundry.
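To illustrate the concurrency point, here is a minimal, self-contained sketch: a toy prediction endpoint served with Python's standard library and queried by several clients in parallel. The endpoint path, port, and sign-of-the-sum "model" are all hypothetical stand-ins, not the SAP Leonardo API.

```python
# Sketch of a shared model behind a REST endpoint, queried concurrently.
# Everything here (endpoint, "model") is a hypothetical illustration.
import json
import threading
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        # Toy "model": classify by the sign of the sum of the features.
        label = int(sum(payload["features"]) > 0)
        body = json.dumps({"label": label}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

# ThreadingHTTPServer handles each request in its own thread,
# so concurrent clients do not block one another.
server = ThreadingHTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/predict"

def predict(features):
    req = Request(url, data=json.dumps({"features": features}).encode(),
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.loads(resp.read())["label"]

# Fire several requests in parallel, as many independent clients would.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(predict, [[1.0, 2.0], [-3.0, 1.0], [0.5, 0.5]]))

print(results)
server.shutdown()
```

In a production setting the security and privacy concerns mentioned above would add TLS, authentication, and request validation on top of this skeleton; the sketch only shows the serve-and-query-concurrently shape.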
Finally, we're talking about very large data volumes and very large computing power, which directly impact operating costs. Consider, too, that we can expect further improvement in this financial equation in the future (such as the introduction of ASICs), and even in pure electrical consumption, not to mention the amount of data traffic.
Stay tuned here. Next week, I’ll discuss “Training as a Service – Micro Modeling.”