Focus on unstructured data: is it nothing but a glorified needle-in-a-haystack story?
Everywhere you turn in the IT world, the buzzwords are Big Data, lightning-speed reports, unstructured data and so on. One example that is always cited is the treasure trove of information available as unstructured data (e-mails being the obvious case), along with the claim that proper text mining and pattern identification can extract business insights about the future from it. I have a few doubts about this claim and about how readily it is accepted.
By design, unstructured data is generated in a particular context, and the validity of that information fades quickly as time goes on. To construct a historical perspective on the topic at hand, analyzing all the past information around it, whether structured or unstructured, is definitely the way forward. But to extrapolate that text mining and analysis of unstructured data can give business insights about the future is a stretch that has all the marks of a marketing gimmick.
This clearly looks like a case of software product companies trying to create a fad. Enormous advances in hardware capabilities, coupled with the fall in hardware costs, have created a paradigm shift in how customers evaluate standard data costs. For software companies in the data industry, archiving projects were one sure source of revenue because of the enormous build-up of data across the ERP customer base. But with the fall in hardware costs and the rising popularity of cloud-based data centers, the cost benefit that normally accrues to customers from data archiving has lost steam. In the recent past I have seen companies decide against funding data archiving projects because the cost savings were not attractive enough. My take is that the data industry, sensing the loss of revenue streams around archiving, data storage and data management caused by this steep fall in server costs, is building a huge case with customers to acquire analytical capabilities around unstructured data, aka Big Data, as a new revenue stream.
For organizations, spending huge sums on building analysis capabilities for unstructured data would be equivalent to shooting in the dark and expecting to hit the target by chance rather than by deliberate action. These analysis capabilities, which are expensive to acquire, cannot replace the sound and rational judgment of an experienced professional. To me, the search for business insights that provide clues to the future in the unstructured data of the past is nothing but a glorified version of the well-known needle-in-a-haystack story.