
What is a data scientist and who needs them?

The term ‘Internet of Things’ was first documented by Kevin Ashton in 1999, but the concept of networked devices predates it, going back to the early 1980s. Some people may argue that you can go as far back as Charles Babbage in the 1830s! We are all familiar with RFID tagging of items (first patent granted to Charles Walton in 1983), which promised to revolutionize processes like inventory control and goods movements, but this seems almost primitive compared to the explosion of networked ‘smart devices’ envisioned now.


Technology is enabling smaller, lower-cost, pervasive devices, as well as databases and software capable of handling the huge increase in data those integrated devices generate. SAP’s HANA database with S/4 HANA applications is a good example of how big data can be used to help businesses run more efficiently and profitably. We now see the term ‘data mining’, which refers to the automatic or semiautomatic analysis of large quantities of data to extract previously unknown, interesting patterns, using techniques such as cluster analysis, anomaly detection and the discovery of dependencies.
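To make one of these data mining tasks concrete, here is a minimal sketch of anomaly detection using a simple z-score rule. The sensor readings, the function name `find_anomalies` and the threshold of 2.0 are all assumptions made up for illustration; real projects would typically use more robust methods and far more data.

```python
from statistics import mean, stdev

def find_anomalies(readings, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Hypothetical helper for illustration only; the threshold is an assumption.
    """
    mu = mean(readings)
    sigma = stdev(readings)  # sample standard deviation
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [
        (i, x) for i, x in enumerate(readings)
        if abs(x - mu) / sigma > threshold
    ]

# Made-up temperature readings from a single networked sensor.
readings = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 85.4, 70.1, 69.7, 70.0]
anomalies = find_anomalies(readings)
print(anomalies)  # the 85.4 reading at index 6 is flagged
```

A rule this simple is easy to explain to business users, but a single extreme value also inflates the standard deviation, which is one reason practitioners often prefer median-based measures.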


Predictive analytics is one area with great potential to use Big Data, and it makes me wonder what kind of person and skill set Mill Products companies need in order to benefit from analyzing large volumes of data. I recently read a Chicago Tribune article by Alexia Elejalde-Ruiz, “‘Data Scientist’ is fastest growing Chicago job search”, which reported: “In Chicago, searches for ‘data scientist’ positions grew fastest last year, up 82% from the year before….”


What the heck is a data scientist? In the early 1980s, the forest products company I worked for hired an operations research person with strong math skills, so maybe that is the same thing? Closer examination shows that operations research (OR) uses quantitative mathematical techniques for decision making. A problem is modeled as a set of mathematical equations and solved with techniques such as linear programming, which handles complex constraints when allocating resources. That is not exactly the same thing as looking for unknown interesting patterns.
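As a rough illustration of the kind of model an OR analyst builds, consider a hypothetical mill deciding how many units of lumber ($x$) and plywood ($y$) to produce. All of the numbers below (profit per unit, saw hours, log supply) are invented for the example and do not come from any real company.

```latex
\begin{aligned}
\text{maximize}\quad   & 40x + 30y          && \text{(total profit)} \\
\text{subject to}\quad & 2x + y \le 100     && \text{(available saw hours)} \\
                       & x + y \le 80       && \text{(available log supply)} \\
                       & x \ge 0,\; y \ge 0 &&
\end{aligned}
```

Under these assumed numbers, both constraints are binding at the optimum $x = 20$, $y = 60$, which gives a profit of $40 \cdot 20 + 30 \cdot 60 = 2600$. The point is that the analyst specifies the objective and constraints up front, whereas data mining starts from the data and looks for structure nobody specified.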


In his Forbes article ‘A Very Short History of Data Science’, Gil Press says: “The term ‘Data Science’ has emerged only recently to specifically designate a new profession that is expected to make sense of the vast stores of big data”. He also describes it as ‘the coupling of the mature discipline of statistics with a very young one – computer science.’


Daniel Gutierrez, Managing Editor, insideBIGDATA says: “First of all, any true data scientist requires a firm foundation on mathematical statistics, probability theory, computer science and machine learning. To fully understand the latter, you need to comprehend the math – calculus, linear algebra and PDEs at the minimum ... but only years of experience applying the methods of data science can lead to a secure place in this field. The moral of this story is an advanced degree, Masters or Ph.D. is certainly a noble way to go about becoming a data scientist. Is such a degree mandatory? Probably not. A mixture of contemporary education resources (traditional degree in data science or MOOCs) coupled with years of practical experience is a good equation for success in this field.”


The demand for data scientists has created a shortage of qualified people and pushed up salaries, with the top-paying jobs at Facebook and LinkedIn.

Barb Darrow’s recent Fortune article, ‘Data Science is white hot, but nothing lasts forever’, offers a word of caution: “Enjoy your fat salaries while you can data scientists, because the rising tide of new talent and -gasp-automation take their toll.”


Do Mill Products companies need in-house data scientists, or can those skills be provided by consultants or third-party vendors? Or will software do more and more of what data scientists do today? I am sure some companies have people who fit the data scientist profile, but it is more likely that a team will do the data mining and analysis, with some members being data scientists and others business experts.


IT vendors are also developing pre-defined solutions that business people or data scientists can supplement with company-specific analysis. SAP InfiniteInsight offers automated data preparation and modeling for both business users and data scientists. In addition, SAP has data scientists on staff who can help on customer projects when needed.


I am interested in your opinion: do companies need in-house data scientists, or can they rely on external tools and personnel for advanced analytics of the Big Data generated by a connected network within a manufacturing facility?



More than 30 manufacturing companies will gather to talk about these issues and more at the SAP Manufacturing Industries Forum, June 22-24 in Lombard, IL.

Companies speaking include ArcelorMittal, Johns Manville, Chicago Faucets, EMC, Caterpillar and more.

Use #SAPMIF15 to find out more on Twitter.

      Ritesh Dube

      Nice article, Brian Dickinson, and a very interesting heading too. 🙂

      Keep Sharing.


      Alfred Becker

      The need for a data scientist may depend on the subject. I'm convinced there are many use cases where an inferior correlation is better than none. Such an inferior model may get refined over time.

      In other cases the "structure" of an issue may be hidden so well within huge amounts of data that no mathematical operation is available to uncover the essence of the problem. Graphical correlation of apparently orthogonal data may point to, e.g., the failure of a device. But this knowledge can be the result of years and years of experience, perhaps as an engineer rather than as a data scientist.