The big data industry 4.0 prediction miracle – and a few painful realizations
Let’s start off without any pretensions: I am not a data scientist, and my engineering math skills are also quite rusty. This blog contains my takeaways from a number of discussions with mill products customers who embarked on the journey to industry 4.0, focusing on big data use cases at the intersection of manufacturing, quality and plant maintenance, and from discussions with some of the SAP experts on data science and prediction.
“Take a bundle of raw sensor data and overlay it with maintenance notifications and customer complaints from the last 5 years, put into a prediction engine, and – miracle – you get valuable insights.”
I like this claim. Why?
First, I like the term “valuable insights” because the whole story of industry 4.0, big data and predictions gets “handfest” (German for “solid and tangible”) when it comes to productivity, profitability, and a business case.
Secondly, the “miracle” triggers my curiosity. What is supposed to happen in this “black box”, and what are the assumptions, shortcomings and boundary conditions within?
Thirdly, it sums up a number of use cases that frequently come up in our industry:
- Predictive maintenance – find rules based on sensor readings, e.g. vibration analysis, that give an early warning about asset health issues. Further in the future, I would like to see an automated near time interval control, where machine-to-machine communication enables a more resilient asset/production environment which can self-diagnose an emerging issue and trigger countermeasures.
- Inline and predictive quality – find rules to identify a quality defect, e.g. from image analysis, or detect indicators of deteriorating quality while you can still “save” the product. Bosch’s Karl Tragl describes this nicely in his blog 5 things we can do without in manufacturing in 2025. The faster we learn about a quality problem, the less waste we produce and the quicker we can adjust manufacturing parameters or trigger a maintenance task to solve it.
- Identifying patterns and root causes for customer complaints, e.g. product colour deviations or surface defects, and their relation to process parameters, maintenance and raw material. Or using a deeper understanding of product failure probability to plan smarter warranty & service models, saving on the bottom line and making the customer happier.
So, how can you make sure that your industry 4.0 big data prediction case is viable and feasible?
My first and foremost recommendation is to read a book: Data Science for Business. What you (as a business leader, project lead, IT lead) need to know about data mining and data-analytic thinking. At least read the first chapter, which is freely available here.
Foster Provost and Tom Fawcett’s book is not about algorithms, but gives you a good foundation to ask the right questions about your own big data projects.
For example, they explain the most common methods and techniques using easy-to-understand, real-life examples:
- Who are my most profitable customers?
In many cases this is simply a database query. The result may be a list of customer records e.g. sorted by profitability.
- Is there really a difference in value to the company between the profitable and the average customers?
Answering this is a statistical analysis task. The result would be a probability that the assumed value difference is real.
- Can I describe a profitable customer by common characteristics?
Can I differentiate a profitable from an unprofitable customer, and how? This is about pattern finding and classification, and a “real” data mining task. Data scientists would e.g. determine which combination of characteristics gives the strongest indication for “profitable”, and would also create a model to describe this.
- Will a particular new customer xyz be profitable – and what value can I expect from this customer?
This question can be answered by data mining from historical data by creating predictive models.
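To make the difference between the first two questions more tangible, here is a minimal sketch in Python. It is not taken from the book: the customer records, the segment labels and the function names are invented for illustration, and the permutation test is just one simple way to approach the statistical question. The first function is a plain query (a sort); the second estimates how often a difference in average profit at least as large as the observed one would show up if the segment labels were assigned at random.

```python
import random
from statistics import mean

# Hypothetical records: (customer, annual profit in kEUR, segment).
# All values are invented for illustration.
CUSTOMERS = [
    ("C01", 120, "key"), ("C02", 110, "key"), ("C03", 130, "key"),
    ("C04", 125, "key"), ("C05", 115, "key"),
    ("C06", 40, "standard"), ("C07", 50, "standard"), ("C08", 45, "standard"),
    ("C09", 55, "standard"), ("C10", 35, "standard"),
]

def top_customers(records, n):
    """Question 1: a plain query, here a sort by profit."""
    return sorted(records, key=lambda r: r[1], reverse=True)[:n]

def permutation_p_value(group_a, group_b, rounds=2000, seed=0):
    """Question 2: share of random regroupings whose difference in means
    is at least as large as the observed one (a permutation test)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    return hits / rounds

key_profits = [p for _, p, seg in CUSTOMERS if seg == "key"]
standard_profits = [p for _, p, seg in CUSTOMERS if seg == "standard"]
p_value = permutation_p_value(key_profits, standard_profits)
print(top_customers(CUSTOMERS, 3))
print(p_value)
```

A small value from the second function suggests the difference is unlikely to be pure chance. The third and fourth questions need real modelling work, which is where the data scientists come in.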
Probing questions for your project
What exactly is the problem you want to solve?
This may sound easy, but often is not clear at the start of a project. What do we want to achieve? How? And what is the expected business value behind this?
Often this is an iterative task. Data scientists, developers and domain experts from the business need to work together to sharpen the business problem.
How good is your data?
Historical data is often collected for purposes that are not related to the current business problem, or just for no purpose at all.
During the data preparation and evaluation phase, you will realize that sensor data vary greatly in reliability.
Assume you want to detect & predict quality problems. In the historical data, when did you learn that a quality defect had occurred? How precisely, and how promptly?
Also in this stage, it is essential that data scientists and domain experts on the shop floor work closely together.
What is the context of sensor readings?
Let’s look at an example from the airline industry: If you want to analyse the vibration or heat sensors of an airplane, you need to know what you are measuring. Is the sensor reading from the take-off, climb or cruise phase? You may even need to consider outside temperature, length of airfield etc. to make sure you build the right model.
Identifying the influencing parameters therefore always requires the domain expertise from your business people.
In our industry, we need to understand which steel grade and thickness were being processed on a cold rolling mill for a given sensor reading.
Or which type of feed material was being ground to which fineness in a grinding mill of a cement plant.
Just looking at the raw data without the context will hardly provide the right conclusions.
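As a rough illustration of what “adding context” can mean in practice, the sketch below pairs each sensor reading with the coil that was on the mill at that moment and then averages the readings per steel grade and thickness. Everything here is an assumption for the sake of the example: the `Coil` structure, the time stamps in seconds, the grades and the readings are made up, and a real project would pull this information from the production and sensor systems.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

@dataclass
class Coil:
    """Hypothetical production record for one coil on a cold rolling mill."""
    start: int           # processing start, seconds since shift start
    end: int             # processing end (exclusive)
    steel_grade: str
    thickness_mm: float

def attach_context(readings, coils):
    """Pair each (timestamp, value) reading with the coil in process at that time.
    Readings that fall outside every coil are dropped."""
    labelled = []
    for ts, value in readings:
        for coil in coils:
            if coil.start <= ts < coil.end:
                labelled.append((coil.steel_grade, coil.thickness_mm, value))
                break
    return labelled

def mean_by_context(labelled):
    """Average the readings separately for each (grade, thickness) combination."""
    groups = defaultdict(list)
    for grade, thickness, value in labelled:
        groups[(grade, thickness)].append(value)
    return {key: mean(values) for key, values in groups.items()}

# Invented example data: vibration readings and two coils.
coils = [Coil(0, 20, "DC01", 0.8), Coil(20, 40, "DC04", 2.0)]
readings = [(5, 1.0), (15, 1.2), (25, 3.0), (35, 3.2), (100, 9.9)]
per_context = mean_by_context(attach_context(readings, coils))
print(per_context)
```

A reading that looks alarming across all data may be perfectly normal for a thicker or harder material, which is why the grouping step matters before any model is built.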
In several customer examples, data quality was also a major challenge. Customer complaints on product quality may not be specific enough, and difficult to link to a specific production lot. Similarly, manually created maintenance notifications may not have the correct classifications or lack important details. Or simply: some sensors consistently fire false alerts.
How good and viable is the result of your model?
For example, your quality error detection solution may have a very high detection rate (and thus may find close to all defects). But how many false alarms would it trigger, and how much would it cost to deal with them?
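A small back-of-the-envelope calculation shows why this question matters. The sketch below is only an illustration with invented numbers: 10,000 parts of which 100 are defective, a detector that finds 98 of them but also flags 5% of the good parts, and assumed handling costs per false alarm and per missed defect. The function name and parameters are hypothetical.

```python
def evaluate_detector(true_pos, false_neg, false_pos,
                      cost_false_alarm, cost_missed_defect):
    """Return (detection rate, share of alarms that are real, total error cost)."""
    detection_rate = true_pos / (true_pos + false_neg)   # share of defects found
    alarm_precision = true_pos / (true_pos + false_pos)  # share of alarms that are real
    total_cost = false_pos * cost_false_alarm + false_neg * cost_missed_defect
    return detection_rate, alarm_precision, total_cost

# Invented scenario: 100 defective parts among 10,000.
good_parts = 9_900
false_alarms = round(0.05 * good_parts)  # 5% of good parts flagged by mistake
rate, precision, cost = evaluate_detector(
    true_pos=98, false_neg=2, false_pos=false_alarms,
    cost_false_alarm=20.0, cost_missed_defect=500.0,
)
print(rate, precision, cost)
```

With these made-up numbers the detector finds 98 of 100 defects, yet only about one in six alarms points to a real defect, and most of the error cost comes from handling false alarms rather than from missed defects.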
There are promising proof points in our industry that quality detection is feasible and viable, e.g. based on high-speed image processing for thermal metal part scans, or in loom quality inspection.
But finally, it is your business experts who will need to evaluate the results against the initial value expectations.
Is the “miracle” feasible?
And you do not need to start from scratch. We have already done this jointly with many customers, also in our industry.
But, as mentioned above, there may be a few more steps than just throwing it into a predictive miracle black box.
If you start your journey to predictive maintenance or quality, we can bring our predictive experts and data scientists together with your domain experts around your very specific use case. Not to mention our leading predictive & big data & IoT enabler, the SAP HANA platform.
We are very interested in entering an exploration & co-innovation mode with your company. These are exciting times with exciting new tools, and the possibility to solve problems and get insights that were never feasible before.
You may also be interested in: Considering industry 4.0? Need some best practices how to lay the analytics foundation?
I really like your 4 examples to explain methods and techniques.
Very comprehensive article - and the idea arrived already at the shop-floor of Mill Products companies! If you can spare the time have a look at this session at Sapphire Now / ASUG conference:
Mohawk, the world's leading flooring manufacturer, uses SAP HANA and predictive analytics in the "Predictive Maintenance" program to increase product quality by improving its manufacturing process parameters, including many millions of sensor data readings
(Session ID BI747, scheduled for May 5th, 3:00 p.m. - 4:00 p.m. EST)