At recent conferences the topic of BIG DATA seems to be on everyone’s agenda: What are you doing with all of this data? What is your
plan for merging machine-to-machine data? How are you visualizing the data? As usual, more questions than answers followed.
What I’ve noticed is that while these are the first-level questions, the more “Insightful” question seems to revolve around what “Insight”
you are gaining from the big data you are using, how you defined the problem, and how you benefited.
First, the Basics –
Circling around any “Insightful” data collection are the following criteria and considerations: How much data are we dealing with, daily and over time? How fast does the data arrive? How diverse is the data? How trustworthy is the data we are considering?
Beyond those criteria we have to consider the structures needed to house the data, the methods of connectivity employed, the
delay in arriving at our data store, and finally how to visualize the data to address our needs.
Much of the data we may need has existed for quite some time – in historians, disk packs, tapes, traces, and logs. So what can we do to gain insight from
historical data that will deliver insightful benefits today?
Speed to process – is the answer to the question of “Insight” driven solely by feeds and speeds? I would say not. Case in point: the real-time processing takes place in the OT realm, where the issues of latency, security, and control reside (and have been addressed). Analytical solutions that provide “Insight” are not moving to take over the real-time domain and supplant the traditional suppliers. So is it enough to be able to crank through billions of records daily to give the “Insight” we are all looking for? At some level, yes – but not in totality. It certainly helps if you are able to run the calculations and reports in sub-second time, or evaluate the entire fleet of assets several times in a quarter – but if the results of the data run don’t provide the benefits, then we are still at the starting gate. Hold that thought.
Besides the historical data, we have new data being created by a myriad of devices that squawk either event or stream data into the stores
constantly. How do we marry these together into a single database that can scale and perform – not just one record deep or one attribute wide, but able to successfully execute the square query: lots of records (deep) and across the entire population (wide)?
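To make the “square query” idea concrete, here is a minimal sketch using synthetic data (the asset names, reading counts, and outlier threshold are all illustrative assumptions, not from any real system). It aggregates every record for every asset (deep) and then evaluates the whole fleet at once (wide):

```python
import random
from collections import defaultdict

# Hypothetical synthetic store: 1,000 readings for each of 50 assets.
random.seed(42)
readings = [(f"asset-{a:02d}", random.gauss(100, 15))
            for a in range(50)
            for _ in range(1000)]

# Deep: roll up every record into a per-asset mean.
totals = defaultdict(lambda: [0.0, 0])
for asset, value in readings:
    totals[asset][0] += value
    totals[asset][1] += 1
per_asset_mean = {a: s / n for a, (s, n) in totals.items()}

# Wide: evaluate the entire population, not a single asset or attribute.
fleet_mean = sum(per_asset_mean.values()) / len(per_asset_mean)
outliers = [a for a, m in per_asset_mean.items()
            if abs(m - fleet_mean) > 1.0]  # illustrative threshold

print(f"{len(readings)} records across {len(per_asset_mean)} assets; "
      f"fleet mean {fleet_mean:.1f}; {len(outliers)} outlier assets")
```

The point of the sketch is the shape of the work, not the math: the query must touch the full depth of history and the full width of the fleet in one pass, which is exactly what stresses a store built for single-record or single-attribute access.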
What we can realize as a group is that the answer to “Insight” is more than speeds and feeds, connectivity, latency, structures, and visualization
– “Insight” involves defining the problem that needs a solution and writing it down in the form of a use case.
By defining the problem and the use case, the classes of assets to poll or listen to for maximum value become evident, as do the best data source (veracity), how much data to collect or read, how to store it, and how to present it in an understandable way – all of which lead us to the challenge of providing “Insight”.
So the next time you are called into a meeting to define the best way to use the shiny new massive data set, drive the conversation toward the definition of use cases that can be prioritized by value. That list will become your roadmap to design, realization, and newly found “Insight”. Your approach and effort will contribute to the movement away from being Data Rich and Insight Poor.
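The prioritized list described above can be sketched as a simple value ranking (the use-case names, value scores, and effort scores here are hypothetical, invented purely for illustration):

```python
# Hypothetical candidate use cases with illustrative value/effort scores.
use_cases = [
    {"name": "Predict pump failures from vibration history", "value": 9, "effort": 6},
    {"name": "Fleet-wide energy efficiency benchmark",       "value": 7, "effort": 5},
    {"name": "Dashboard of daily event counts",              "value": 2, "effort": 2},
]

# Rank by value per unit of effort: the highest-payoff work comes first,
# and the ordered list becomes the roadmap.
roadmap = sorted(use_cases, key=lambda u: u["value"] / u["effort"], reverse=True)

for rank, uc in enumerate(roadmap, start=1):
    print(f"{rank}. {uc['name']} (value {uc['value']}, effort {uc['effort']})")
```

However you score the candidates, the mechanism is the same: make the value judgment explicit, sort on it, and let the ordering drive the design work.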
Please follow this story as it unfolds in our second installment of “Data Rich and Insight Poor”.