Gartner BI Summit 2015: Big Data = Predictive Analytics
If there is one cross-vendor analytics event that opens your eyes to “what’s out there”, it is likely the Gartner BI and Analytics Summit that just wrapped up in Las Vegas this week. Many customers come to this conference to understand their strategic options with their existing vendor, as well as what the new “up and comers” are up to. As an employee of one of the vendors, I also see it as a chance to check whether the industry is going in the same direction we are.
So what are the “Big Takeaways” that are guiding the industry in 2015 and beyond?
Francis Ford Coppola Is One Cool Guy
Okay, this one isn’t guiding the industry, but it definitely had an impact on the audience: We heard the legendary film director, producer, screenwriter, and wine maker Francis Ford Coppola give his humorous and extremely insightful views on almost any topic the audience could come up with – from his experiences in film making and wine production to predictive technologies, Big Data, and even when to consider instinct and passion versus algorithms and raw data. This was clearly the highlight of the conference.
Big Data Is Watching Us
If you have a pulse, you likely have heard about this thing called “Big Data” and it knows the answer to everything you could ask (hint: the answer isn’t “42”). As an abstract concept, the vast Volume of data from a Variety of places at high Velocity with varying levels of Veracity (quality of data) sounds great, but few organizations really know how to get there or what to do when they reach “Big Data Nirvana”.
The attendees of the Gartner conference may have been unwittingly part of a Big Data project themselves. A colleague of mine was surprised to find that the Gartner website knew which sessions he attended – and which he didn’t. It turns out that each attendee’s name badge had a tiny RFID tag that was only visible if you held it up to the light.
To track each person’s movement, RFID readers sat above the door of every session room and were even scattered through the hallways, placed like ordinary furniture.
YES, Gartner knows when I reached a session, when I left, whether I attended the keynote, how long I was on the exhibit floor, even if I attended the evening reception. Add to that the “booth scans” when I talk to someone or whenever they give me a cute toy, and Gartner knows which booth I went to, when, and has a pretty good idea of the percentage of my total time spent at that booth.
They can also quite easily cluster people based on the common sessions each goes to and determine trends based on actual intent and action rather than those sometimes annoying surveys. They can fine-tune the conference in close to real-time and even market my interest patterns back to the vendors of the sessions I attended.
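To make the clustering idea concrete, here is a minimal sketch of how attendees could be grouped by the sessions they have in common. Everything in it is an assumption for illustration: the attendee names, the session IDs, the Jaccard overlap measure, and the 0.5 threshold are all hypothetical, and nothing here reflects how Gartner actually processes its badge data.

```python
from itertools import combinations

# Hypothetical attendance data: attendee -> set of session IDs seen by the readers.
attendance = {
    "alice": {"keynote", "hadoop-101", "predictive-intro"},
    "bob": {"keynote", "hadoop-101", "spark-deep-dive"},
    "carol": {"wine-tasting", "coppola-qa"},
    "dave": {"coppola-qa", "wine-tasting", "keynote"},
}


def jaccard(a, b):
    """Share of sessions two attendees have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(attendance, threshold=0.5):
    """Group attendees whose session overlap meets the threshold (single-link)."""
    # Start with every attendee in their own group, then merge similar pairs.
    group_of = {name: {name} for name in attendance}
    for a, b in combinations(sorted(attendance), 2):
        if jaccard(attendance[a], attendance[b]) >= threshold:
            merged = group_of[a] | group_of[b]
            for name in merged:
                group_of[name] = merged
    # Deduplicate the groups and return them in a stable order.
    unique = {frozenset(g) for g in group_of.values()}
    return sorted(sorted(g) for g in unique)


clusters = cluster(attendance)
print(clusters)
```

With real badge data the attendee count would be far larger and a proper clustering library would be the better choice, but the principle is the same: similarity comes from what people actually did, not from what they said in a survey.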
Big Data’s “Veracity” Is a Killer
The fatal flaw in this story is that while the Gartner website accurately placed me at the show floor (I was booth staff) and even caught the one session I was pulled into that I didn’t register for, it also listed me in the keynote and a couple of random sessions in one afternoon – I call it a 3/6 score. Helpfully, the Gartner session review site allows me to add sessions I actually went to and flag the ones I was erroneously placed at – and if I fill out the reviews, I might even win a prize! Oh wait, did I just help them improve the data quality for information they can use for better analysis on me? Yup. That was sneaky – but also very smart.
As any data scientist will tell you, preparing and cleansing the data is definitely the largest burden on finding any useful information in Big Data – “Veracity” issues included. But let’s say I play along and “Enter the draw for a prize” by filling out my surveys to help Gartner cleanse their data on me: Gartner still has a huge job to do – the raw data is not enough. For example:
- How many sessions did I walk out early from?
- Is there a pattern in the topics I dropped out of?
- Do my booth visits correlate with the types of sessions I went to?
- What are the “surprise hits” in terms of sessions they did not think were that busy?
- Which “up and comer” companies should Gartner reach out to in order to create an analyst relationship?
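The first two questions in that list can be sketched in a few lines once the data is clean. The records below are made up, and so is the rule that staying for less than 75% of a session counts as walking out early; treat this as an illustration of the kind of analysis, not as anyone’s actual method.

```python
from collections import Counter

# Hypothetical cleansed records for one attendee:
# (session, topic, scheduled length in minutes, minutes actually present)
visits = [
    ("keynote", "strategy", 60, 58),
    ("hadoop-101", "infrastructure", 45, 15),
    ("spark-deep-dive", "infrastructure", 45, 20),
    ("predictive-intro", "predictive", 45, 45),
]


def walked_out_early(visits, min_share=0.75):
    """Return visits where the attendee stayed for less than min_share of the slot."""
    return [v for v in visits if v[3] / v[2] < min_share]


early = walked_out_early(visits)

# Which sessions were abandoned, and do the topics show a pattern?
early_sessions = [session for session, _, _, _ in early]
topics_dropped = Counter(topic for _, topic, _, _ in early)

print(early_sessions)
print(topics_dropped.most_common())
```

Even this toy version shows why the raw reader logs are not enough: someone first has to decide what “early” means and tag each session with a topic before any pattern can appear.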
Predictive Analytics Is the Core of Any Big Data Strategy
Data Scientists are in such high demand because as the data has more “Velocity”, “Volume”, “Variety”, and “Veracity” issues, a human simply cannot glean insights using visualization and instinct alone. We turn to algorithmic analysis which has proven to be far better at finding patterns than any size army of humans with the latest BI tools. These highly trained people are also highly paid because they can easily justify their value: The ROI of even a 10% improvement could be worth millions of dollars.
Every vendor at the Gartner conference had a big data story and no vendor can do a thing without some form of “predictive” or “advanced” analytics – after all, isn’t the point of collecting all this data to discover insights that can be used to improve outcomes (by “predicting a better outcome if certain actions are taken”)?
One of the key takeaways from the Gartner conference is that Big Data is only as good as the insights you can gain from it. Big Data hasn’t quite exhausted its “buzzword value” yet, but it is clear more vendors and customers are realizing it’s the analytics on top of it that really rings the bells. It’s really too bad that those million-dollar insights require such specialized knowledge!
The Future of Predictive Is Invisible
There are some universal truths that exist in this world: It is unrealistic to train a majority of the population in Data Science. Data Science cannot be “dumbed down” too far without losing its intrinsic analytical value. And most importantly: Trying to fundamentally change how a person works has a lower chance of success than learning ancient Latin overnight – from a book – in the dark.
From walking the show floor at Gartner, it’s clear that momentum is shifting from:
“Use Hadoop, connect via Spark, and fire up analytic solution X, using predictive technology Y, and deploy (magically) in your environment”
to:
“Here’s an application that helps the business analyst understand their world in an environment they are used to and either automates or abstracts the nitty gritty out – with an escape hatch so that more proficient users and Data Scientists don’t feel boxed in”.
(The second one is way better right?) 😉
The net effect is that “predictive technologies” become more embeddable, there is more logic at the application layer, and the demands on the Data Scientist are reduced. It means that cool technology like “R”, “InfiniteInsight”, and “APL” become more embeddable and consumable in *other* solutions rather than being the basis for a standalone tool. The solutions that invisibly embed predictive technologies into their applications will have a distinct edge, and it is all of us non-Data Scientists that benefit.
My View On How SAP Is Positioned
So what is my take on SAP’s position?
In some areas we are light years ahead – the KXEN libraries required to add automated predictive analysis to any application are around 15 MB and are already embedded in many applications inside and outside SAP. We recently announced the Automated Predictive Library (APL) for SAP HANA that brings those same capabilities natively inside HANA. In Predictive Analytics 2.0, you can automatically create predictive models and export them to database vendor-specific SQL, C++ or Java code, or even an “awk” script if that’s your thing. (Try SAP PA 2.0 HERE)
There are other areas that we’re still working on but will put us ahead when we release our next generation “SAP Predictive Analytics 3.0” product. Automated model comparison, consumption and embedding of complex “R”-based models, and a new Fiori-inspired user experience that is specifically designed to meet the needs of data scientists while still being accessible by sophisticated business analysts are just a few things that we’re working on right now.
I’m pretty excited about the future of SAP Predictive Analytics because we’ve got patented, class-leading technology for embedding into all of those applications that need predictive capabilities, while we continue to develop and improve our product for data scientists and data analysts and keep it ahead of the competition.
One look at the Gartner BI Summit and it is obvious that as the “Big Data Hype” calms down a bit and more people start asking “Well, how can I *really* use Big Data in real life?”, the true darlings of the tech world may not be those who can store the most data, but those who can bring the best Big Data insights to the most people.
For SAP and users of SAP Predictive Analytics, the future does indeed look bright. 😎
Disclaimer: This article is expressing my own personal views and may not reflect the views of SAP, its products, its partners, or its competitors (but I’m sure all of them would agree the future is bright too).