I was speaking with a colleague who is presenting at StrataConf + HadoopWorld next week in New York. His session, “Tying the Knot Between Hadoop and EDW,” explores the roles both play in big data projects and various approaches for bringing the two together. I was pleased when he told me about the content of his session because there is so much hoopla about Hadoop (Hadoopla) when it comes to big data. Don’t get me wrong; I think Hadoop has an important role to play, but it’s not the only technology you need to get value from big data.
Most industry analysts define big data in terms of volume, velocity, and variety of data. If you just have big volumes of structured data, a database is going to work just fine to support your analytic needs, especially if you’re using a high-performing analytic database, such as a columnar data warehouse like SAP Sybase IQ or an in-memory platform like SAP HANA. If you need real-time monitoring and analysis, such as risk analysis, fraud detection, and algorithmic trading in the Financial Services industry, event stream or complex event processing technology is well suited to high data velocity. The sweet spot for Hadoop is collecting and storing unstructured data to address data variety needs and acting as a pre-processing/staging area for analysis. In fact, the most common deployment of Hadoop so far is sessionization of weblog files to understand user behavior on websites.
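For readers who haven’t run into the term, sessionization just means grouping a visitor’s raw weblog hits into visits. Here is a minimal single-machine sketch in Python of the idea, not how any particular Hadoop job does it. The 30-minute inactivity timeout, the `(user_id, timestamp, url)` record shape, and the `sessionize` function name are all my own assumptions for illustration; on Hadoop the same grouping would typically be spread across many nodes.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed rule: a gap of more than 30 minutes between hits starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(events, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, url) hits into per-user sessions.

    Returns a list of (user_id, hits) tuples, where hits is a
    time-ordered list of (timestamp, url) pairs for one session.
    """
    # Collect every hit under the user who made it.
    by_user = defaultdict(list)
    for user_id, ts, url in events:
        by_user[user_id].append((ts, url))

    sessions = []
    for user_id, hits in by_user.items():
        hits.sort(key=lambda h: h[0])  # order each user's hits by time
        current = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - current[-1][0] > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((user_id, current))
                current = [hit]
            else:
                current.append(hit)
        sessions.append((user_id, current))
    return sessions


# Hypothetical weblog records, already parsed from raw log lines.
sample_events = [
    ("a", datetime(2012, 10, 1, 10, 0), "/home"),
    ("b", datetime(2012, 10, 1, 10, 5), "/home"),
    ("a", datetime(2012, 10, 1, 10, 10), "/products"),
    ("a", datetime(2012, 10, 1, 11, 0), "/home"),
]
sample_sessions = sessionize(sample_events)
```

With the sample records above, visitor "a" ends up with two sessions (the 50-minute gap splits them) and visitor "b" with one. The per-session page sequences are the kind of output you would then load into an analytic database for the actual behavior analysis.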
From my perspective, there is probably a little too much Hadoopla because, to get the most business value from big data, you need a variety of technologies working together. Mitsui Knowledge Industry (MKI) is an example of an organization realizing the value of big data by utilizing Hadoop and an analytic database. Reducing the time to perform genome analysis helps them deliver personalized medical treatment optimized for an individual patient’s gene mutations. Hadoop is used to collect and store a large library of DNA data from multiple patients and to pre-process a patient’s digital genome sequence for analysis using an analytic database. By comparing an individual’s information to the data in the library, doctors can recommend better-targeted treatments. You can learn more about the application in this video about MKI.
Do you think there is too much Hadoopla?
Or am I just an old dog that can’t learn new tricks?
If you agree with me, why do you think there is so much Hadoopla?
One final thought: if you’re attending StrataConf + HadoopWorld next week, you might be interested in “From Traditional Database to Big Data Platform” in addition to the session I mentioned above.