
Whilst currently on business in Singapore, I have been fortunate enough to attend the SAP Analytics Tour. Timo Elliott, as always, gave a great presentation. Whilst covering the full range of SAP analytic offerings, he opened the attendees’ eyes to the possibilities of real-time analytics by showcasing new SAP products through some awesome demos. The inspiration for this blog came from an offline discussion Timo and I had before his presentation, around the importance of information, data governance, management & quality, and just how vital these are in order for consumers of that information to benefit from their data.

Roll on two nights later, and I’m having dinner with fellow SAP Mentor and educator Paul Hawking, also in Singapore. After the pleasantries were exchanged, the conversation quickly turned to this topic. What was meant to be a quick bite out turned into a deep, five-hour conversation on how customers are going to deal with real-time analytics and dirty data on the fly, regardless of which vendor they run their systems on.

Back at my hotel, I spent a restless night thinking about this post (I’ve been here over a week now, so it’s not jet lag). I thought it best to get my thoughts down, so that I could get some sleep ;-D

Let’s take a step back. For those who don’t know where I fit in, I am an Enterprise Data Warehouse (EDW) geek who has been building EDWs for the last 15 years. I am evangelistic about getting things right the first time, rather than the panel-beating approach that you typically see these days. When I engage with customers for the first time, I always ask them about their Master Data Management (MDM) and/or Data Quality (DQ) strategy, and the usual response is “just build the BI reports first and we will get to the data quality later”. We all know that doesn’t work, but I see it happen time and time again. Just ask an IT or business department for the budget to run an MDM initiative within your organization, and you will soon find out what I mean. Yes, there are the handful of customers who do have this right, and kudos to them, but they are, in my experience, rare.

I know there has been talk about the “death of the Data Warehouse”, and that all the “layers” of complexity are no longer needed. I agree that, in certain circumstances, EDWs are over-engineered and complicated, but when one takes a look under the hood, there is often a good reason for that. In many large corporates today, there are source systems, extracts, files and mainframes that no one is willing to touch or change in case they break. In most cases, the person who wrote such a system (which is no longer supported by the vendor) has left, and only vague maintenance instructions have been passed down over the years.

I am the first person to say “fix the problem at source”, as this is the only way to ensure that you have quality information coming into your EDW. We all know that “garbage in equals garbage out”, but fixing things at source is often more challenging than it first appears.

If you find yourself in a situation like the one above, where you are just letting your legacy source systems tick over till end-of-life, then typically all new business rules and data anomalies are handled in the extract-transform-load (ETL) layer of your architecture. Even if you still have an active team developing and maintaining your source system, and need to cater for new variations in it, change is often slow and tedious, locked down by strict prioritization and change control processes.

In the textbook world, you would wait for the changes needed to take place at source. In the real world, however, I want the consumers of my information to be able to trust the data and use it daily. If your end users lose faith in the functionality of your system and the information within it, then you are fighting a losing battle.

So how do we ensure that end users can answer business questions and keep up with change? In practice, the answer is to write complex ETL jobs and processes that handle all the possible variations, so that your users can keep answering business questions while the source systems slowly catch up with the demand for change. And, as mentioned above, there are cases where the source systems are never going to change, as they are end of life.
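To make that concrete, here is a minimal sketch of the kind of defensive ETL transform that absorbs source anomalies so the load keeps running. The field names (“order_date”, “amount”) and the specific anomalies are invented for illustration, not taken from any real source system:

```python
# A minimal sketch of a defensive ETL transform, with invented field
# names and anomalies; not from any real system.
from datetime import datetime

def transform_row(raw):
    """Apply business rules and absorb known source anomalies before loading."""
    row = dict(raw)

    # Anomaly: legacy extracts send dates as either YYYY-MM-DD or DD/MM/YYYY.
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            row["order_date"] = datetime.strptime(raw["order_date"], fmt).date()
            break
        except ValueError:
            continue
    else:
        # Unparseable date: route the row to an error table rather than
        # failing the whole load.
        return None

    # Business rule: amounts arrive as strings, sometimes with a trailing
    # "-" (a common mainframe convention for negative values).
    amount = raw["amount"].strip()
    if amount.endswith("-"):
        amount = "-" + amount[:-1]
    row["amount"] = float(amount)
    return row

clean = transform_row({"order_date": "31/01/2012", "amount": "120.50-"})
print(clean)  # the date is normalised and the amount becomes -120.5
```

Multiply that by every anomaly in every feed and you can see where the “complexity” of an EDW actually comes from.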

Another situation to consider, one I have assisted many companies through, is when companies go through an acquisition or merger. Let’s take SAP as an example. Over the last few years, in order to have a single version of the truth for a customer, they would have needed to merge, match and enrich customer data from SAP, BusinessObjects (assuming BusinessObjects had done a good job with the Crystal acquisition), Sybase, SuccessFactors, Syclo etc., to name but a few. Any takers for that job?
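As a toy illustration of why that match-and-merge job is so thankless, here is a standard-library-only sketch. Real MDM tools use far richer matching and survivorship rules; the record fields and the 0.85 threshold here are my own invented assumptions:

```python
# A toy customer match check using only the standard library. The fields
# and the 0.85 threshold are invented for illustration.
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_same_customer(rec_a, rec_b, threshold=0.85):
    """Exact email match wins; otherwise fall back to fuzzy name similarity."""
    email_a, email_b = rec_a.get("email", ""), rec_b.get("email", "")
    if email_a and email_a.lower() == email_b.lower():
        return True
    return similarity(rec_a["name"], rec_b["name"]) >= threshold

erp = {"name": "ACME Industries Ltd", "email": "info@acme.example"}
crm = {"name": "Acme Industries Limited", "email": ""}
print(is_same_customer(erp, crm))  # True: the names are close enough
```

Now imagine tuning that threshold across millions of records from five acquired systems, and deciding which record “survives” when two match.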

Up till now, I have consciously left tools out of this post for a reason. SAP, as well as its competitors, has some great solutions that can enable a customer to drive and succeed at an MDM initiative within their organization. I used the word “enable” on purpose, as in almost all aspects of my job I see and treat technology as an enabler. This is even truer when you are looking at MDM within your organization. As a side note, there is a great blog post covering a Data Governance webinar held by SAP, showing Information Steward and one of my old-time favourites, PowerDesigner, working together.

I don’t want to paint a picture of doom and gloom, but what I’m trying to illustrate is that managing data, especially within a corporate domain, is complicated. My former boss and mentor, Bryn Davies from InfoBlueprint, has always said that customers need to treat their data as if it were the only non-depreciating asset on their balance sheet. Give it the love and respect it deserves and it will empower you to make trusted decisions for years to come. Very wise words from a wise man. The only way I have seen this work is the top-down approach, where board or C-level executives actually base some of their staff’s KPIs on the quality of information within their department/cost center.

Do I think that customers will be running all their transactions and analytics off in-memory solutions in years to come? Absolutely! But, for me, the key is how we get there.

As I alluded to above, if your end users do not trust your system and the data in it, whether a month old or real-time, then you have lost the battle. The key for people advising customers who start looking at the new in-memory solutions is to stress the importance of an enterprise information strategy within the organization. It should be an almost non-negotiable dependency: a phase in the project plan and an ongoing initiative within the organization. At first glance, customers are often horrified at the quality of the information within their organization, which leads to another one of my favourite sayings, “How deep does the rabbit hole really go?”, but that is another blog entirely.
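That first horrified glance usually comes from a simple profiling pass. A back-of-the-envelope sketch, with column names and rules that are purely illustrative:

```python
# A back-of-the-envelope profiling pass; the column names and the two
# rules (duplicate keys, missing required fields) are illustrative only.
def profile(rows, key, required):
    """Count duplicate business keys and empty required fields."""
    seen, dupes = set(), 0
    missing = {field: 0 for field in required}
    for row in rows:
        k = row.get(key)
        if k in seen:
            dupes += 1
        seen.add(k)
        for field in required:
            if not row.get(field):
                missing[field] += 1
    return {"rows": len(rows), "duplicate_keys": dupes, "missing": missing}

customers = [
    {"id": 1, "name": "Acme", "email": "info@acme.example"},
    {"id": 1, "name": "Acme Ltd", "email": ""},         # duplicate key, no email
    {"id": 2, "name": "", "email": "ops@example.org"},  # missing name
]
report = profile(customers, key="id", required=["name", "email"])
print(report)
```

Even two crude checks like these, run against a real customer master, are usually enough to start the rabbit-hole conversation.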

I understand, from a vendor’s point of view, the appeal of selling real-time analytics, but as a man who has gained many battle scars trying to rectify failed EDW projects, I know, first hand, the importance of getting your data management done right the first time.

One thing I do know is that it is going to be an exciting journey, managing the ever-exploding data landscape around us alongside customers’ need for instant, quality information at their fingertips in real time. End users are going to be drinking from the fire hose of information shortly, in real time, and we need to ensure that the water is not toxic.


I am a firm believer that the end user should not have to know or care about the legacy issues within an organization, or when the “batch window” will finish. They all know that they can type any question they want into Google, using their iPhones & iPads, and get an answer immediately – so why can’t they have that at work?

So, I leave you with this thought – once end users get access to real-time data within their organizations, without all the layers of ETL and Data Quality, will they get the fright of their lives, and will this lead to MDM initiatives being kicked off at a rapid rate?

I personally hope this is the case and look forward to seeing how it all pans out…

Would love to hear your thoughts on this …


4 Comments


  1. Former Member

Great thoughts Clint. I’ve had several experiences with customers, when you first show them their data in a tool like Explorer, where they realise how poor the quality of their information is.

What will be important is giving users visibility of the data quality, because, like you say, if they are drinking from that firehose, how well are they going to be able to see the quality of the data?

    Josh

    1. Former Member Post author

Thanks for the comments Josh. Good point about giving visibility to the data quality issues companies are dealing with. As you know, tools like Information Steward are great for this, as you can set KPIs based on the quality of the information entering your system and see how you progress along the way. As we all know… you can only manage what you measure!

    1. Former Member Post author

I could not agree with you more on this point, James Oswald! The day we get C-level execs basing their staff’s KPIs and bonuses on the quality of the information they enter into the system, rather than the quantity, will be a fine day… one can only dream… 😛

