Simple Business Questions Are Not That Simple – Part 2

Author: Yannick Cras, Chief Development Architect, Real-Time Intelligent Enterprise, Advanced Development, SAP Labs Paris

In a previous column, we introduced a central issue of business intelligence: beyond answering a business question right, we also need to answer the right business question.

Let’s review some of the typical issues users (and designers) can face from this perspective.

First, try not to help users fail.

Excel – still by far the most common BI front-end – has very precise and explicit calculation capabilities; there’s not much you can’t do with it. But it just doesn’t care about the nature, the meaning (let’s use the word: the semantics) of the data that it processes. A cell is a cell is a cell. So Excel will trust the user, always, even when it should be quite obvious that they are choosing a wrong mental path.

Worse yet, Excel will happily help you fail. How difficult is it to compute an average of growth ratios? That’s fairly easy; indeed the magic AutoSum button makes it dreamlike. The fact that the number I get is completely irrelevant and misleading doesn’t seem to matter when it’s so easy to format it into a neat, impressive, professional-looking report. OK, I’m somewhat unfair here, because the Pivot Table does a better job; however, it is nowhere near as mainstream as this nice button on the Home tab.
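To see why an average of growth ratios misleads, here is a minimal sketch in Python. The product names and revenue figures are made up for illustration; the point is only that averaging the ratio column ignores how much each line weighs in the total.

```python
# Hypothetical figures: revenue for two product lines in two consecutive years.
revenue = {
    "Product A": (100.0, 150.0),   # (last year, this year): +50%
    "Product B": (1000.0, 900.0),  # (last year, this year): -10%
}

def growth(previous, current):
    """Relative growth between two periods, e.g. 0.5 for +50%."""
    return (current - previous) / previous

# What you get by averaging the column of growth ratios.
ratios = [growth(prev, cur) for prev, cur in revenue.values()]
naive_average = sum(ratios) / len(ratios)

# What the business actually experienced: the growth of the totals.
total_previous = sum(prev for prev, _ in revenue.values())
total_current = sum(cur for _, cur in revenue.values())
overall_growth = growth(total_previous, total_current)

print(f"Average of growth ratios: {naive_average:+.1%}")
print(f"Growth of total revenue:  {overall_growth:+.1%}")
```

With these numbers the averaged ratio is positive while total revenue has actually shrunk, because the small product line counts as much as the large one in the naive average.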

The answer to this problem is: Know Thy Data. We at SAP are in a unique position for this. Our suite of applications has extremely rich business models and collective knowledge of what they mean. We know which calculations make sense or not. We know that inventory should not be summed up over time; we know that temperature should be aggregated as an average on geographies; we know that GNP figures are meaningful for countries, not cities or ZIP codes; we know that amounts should not be summed up blindly across currencies or units. Is this knowledge completely surfaced in our analytical tools today? Not always. But this intimacy with the semantics of business data is a key differentiator.
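One way to picture what "knowing the data" could mean in practice is to declare, per measure, which aggregations make sense. The sketch below is illustrative only: the rule table, the function names, and the sample values are assumptions, not a description of any SAP product.

```python
from collections import defaultdict

# Hypothetical rule table: how each measure may be aggregated over time.
# "sum" suits flows such as sales; "last" suits stocks such as inventory.
TIME_AGGREGATION = {"sales": "sum", "inventory": "last"}

def aggregate_over_time(measure, monthly_values):
    """Aggregate month-ordered values using the rule declared for the measure."""
    rule = TIME_AGGREGATION.get(measure)
    if rule == "sum":
        return sum(monthly_values)
    if rule == "last":
        return monthly_values[-1]
    raise ValueError(f"No time aggregation rule declared for {measure!r}")

def total_by_currency(amounts):
    """Sum (value, currency) pairs per currency, never across currencies."""
    totals = defaultdict(float)
    for value, currency in amounts:
        totals[currency] += value
    return dict(totals)

print(aggregate_over_time("sales", [10, 12, 15]))
print(aggregate_over_time("inventory", [200, 180, 210]))
print(total_by_currency([(100.0, "EUR"), (50.0, "USD"), (25.0, "EUR")]))
```

Here sales are summed over the months, inventory reports the closing level instead of a meaningless sum, and amounts stay separated by currency until someone explicitly asks for a conversion.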

42 is the answer, but what is the question?

Another issue with Excel – one that hurts IT departments badly – is that while it retains the answers perfectly, it tends to forget the questions. Somebody in your finance or HR department creates a perfectly designed formula that computes, say, headcount projections. This active cell depends on a named range or query, and is of course sensitive to time, scenario and a number of variables. But here comes a killer whale called Ctrl-C Ctrl-V. Somebody with the best intentions will copy and paste the number – not the formula – into a PowerPoint presentation circulated to senior management. The context is lost; the actual meaning of the initial business question is lost. Only the number remains, now carved in stone and open to any interpretation one likes. Have you ever been in a meeting where three execs were arguing, each relying on their own (and different) source of truth prepared by their respective teams? This is how messy it gets, and it is a source of entropy that can consume a significant part of the IT budget to fix. Preserving the data is not enough; you want to preserve the business questions that they answer. This is what Enterprise BI is all about. This is what declarative and re-usable analytic models and formulas, stored in a corporate repository, are all about.

Hide the boring details… or not.

Many “simple” BI tools take the opposite approach to Excel’s. Rather than trusting you, they’ll make all decisions on your behalf. Every painful detail is hidden; just enter minimal information and the system will deduce the rest. No modeling needed, no sweat, plug and play your data source, one click and you get a chart. Isn’t that the coolest thing ever? Well, no, not if you care about your numbers. The point is, sometimes the system can’t infer what you want unless you tell it more. It just can’t invent information. It can make assumptions, but often they’re wrong. Consider an example: as a purchasing manager, I need the average number of hours billed by third-party consultants. What matters to me is the set of people or businesses with whom I’m doing actual business, not the entire set of consultancies that may be on my list of registered suppliers. Now, as an associate in a consulting firm, I need the average number of billed hours per head. I need to take into account the whole team, including consultants “on the beach”. Those are similar but not identical business questions, and only I, the user, can specify precisely on which group I want to compute the average. Chances are that typing “Average Billable Hours” in a search field is not going to be precise enough. The system needs to guide the user in specifying over which group of people the average should be computed; while for us developers it’s merely a choice between an inner or outer join from a dimension table to a fact table, business users won’t see it that way. This is where usability studies and field experience come into play.
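The difference between the two questions can be sketched in a few lines of Python. The consultant names and hours are invented, and plain lists and dictionaries stand in for the dimension and fact tables; the two functions mimic inner-join and outer-join semantics respectively.

```python
# Hypothetical data: registered consultants and the hours they billed.
consultants = ["Alice", "Bob", "Chloe", "David"]  # stands in for the dimension table
billed_hours = {"Alice": 120.0, "Bob": 80.0}      # stands in for the fact table

def average_over_active(consultants, billed_hours):
    """Inner-join semantics: only consultants with billed hours count."""
    hours = [billed_hours[c] for c in consultants if c in billed_hours]
    return sum(hours) / len(hours)

def average_over_everyone(consultants, billed_hours):
    """Outer-join semantics: consultants with no billed hours count as zero."""
    hours = [billed_hours.get(c, 0.0) for c in consultants]
    return sum(hours) / len(hours)

print(average_over_active(consultants, billed_hours))    # the purchasing manager's question
print(average_over_everyone(consultants, billed_hours))  # the consulting associate's question
```

Same data, same label "Average Billable Hours", and yet the second figure is half the first, simply because the two consultants on the beach are included in the denominator.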

Think incremental, code recursive.

Sometimes, users do have to create quite complicated business questions. Customer segmentation is an example. As a marketing manager I need to focus my campaigns on segments that are both promising and of reasonable size. I’ll filter my customer base using various criteria, from social or marital background to geography and purchase history. People do it by incremental steps. Nobody who’s not a geek will think of “the set of Customers aged 50 or more in California or Florida, who’ve purchased more than the average sales of my three top-selling wines over the last 12 months, or whose birthday is within a month”. This is, though, exactly what an efficient database query should compute, because the more complete the query the more food you give to your optimizer. But no user is going to provide us with such a formula. They need to be given means of incrementally building and amending their customer set, through trial and error, monitoring its size as they go. The developer’s business is to keep track of all these changes and to generate one global, complete, efficient query.
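As a rough sketch of that division of labor, the Python class below records each user step and folds them all into a single query on demand. The class, the method names, the table and the column names are all hypothetical, and conditions are kept as raw strings for brevity; real code would use parameterized queries rather than string concatenation.

```python
class Segment:
    """Records the user's incremental steps and compiles them into one query."""

    def __init__(self, table):
        self.table = table
        self.steps = []  # (operator, condition) pairs, in the order applied

    def narrow(self, condition):
        """Keep only customers that also satisfy the condition (AND)."""
        self.steps.append(("AND", condition))
        return self

    def widen(self, condition):
        """Additionally include customers that satisfy the condition (OR)."""
        self.steps.append(("OR", condition))
        return self

    def undo(self):
        """Drop the most recent step, to support trial and error."""
        if self.steps:
            self.steps.pop()
        return self

    def to_sql(self):
        """Fold all steps, left to right, into a single WHERE clause."""
        if not self.steps:
            return f"SELECT * FROM {self.table}"
        # The operator of the first step is ignored: there is nothing to combine with yet.
        predicate = f"({self.steps[0][1]})"
        for operator, condition in self.steps[1:]:
            predicate = f"({predicate} {operator} ({condition}))"
        return f"SELECT * FROM {self.table} WHERE {predicate}"


# The user builds the segment step by step, changes their mind once, and carries on.
segment = (
    Segment("customers")
    .narrow("age >= 50")
    .narrow("state IN ('CA', 'FL')")
    .narrow("lifetime_spend > 500")
    .undo()
    .widen("birthday_within_days <= 30")
)
print(segment.to_sql())
```

The user only ever sees small, reversible steps; the database only ever sees one complete statement that its optimizer can work on as a whole. Monitoring the segment size as one goes could reuse the same predicate with a count instead of a full select.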

Our job is about questions as much as about answers.

Those are but a few of the complexities that even fairly simple business scenarios have to face. The blood of your organization is data, but its intelligence and memory are made of business models and questions. Much care has to be taken to preserve and nurture them. This is a large part of what analytics are about.

To learn more, visit SAP Labs Paris or follow @SAPLabsParis on Twitter.
