Capital markets firms are increasingly motivated to mine more and more data. So a question about the spread of data caught my attention a couple of weeks ago: If different groups within an organization each collected as much data as they could for data mining, would that result in multiple copies of the same data?
I had just presented at the Waters Technology “Big Data Webcast,” which investigated challenges of handling and using Big Data, when a listener made this inquiry during the Q&A.
As the question implies, Big Data should be a company-wide initiative whenever possible. But one group within the company must be responsible for collecting the data. That way, other localized expert groups can draw on that central data source to identify patterns, and the organization avoids spawning multiple copies of the data.
Challenges brought up during the webcast, whose panel included Intel and Platform Computing, focused initially on technology issues such as volume, velocity, variety and value. I talked about technologies that address these issues: Sybase ESP (complex event processing), SAP HANA (in-memory database) and Sybase IQ (massive disk-based historical data store).
But the use and impact of Big Data in financial services firms got a lot of attention during the Q&A. Streams of Big Data will only continue to grow. And our capacity to mine, sort and analyze Big Data will have to grow alongside it.
Another concern was that Big Data might result in job losses as technology continues to automate tasks once performed by people. Most of the panel and I agreed that, to the contrary, Big Data is likely to create jobs: firms will need more data scientists to sort through more information, resulting in an enlarged IT footprint.