By now, you may have seen the “What’s New” post for the SAP Lumira 1.27 release. One of the exciting new features of this release is data blending. This blog post gives a more detailed overview of the feature and explains how it works. More posts will follow to cover the intricacies of blending, especially dataset links, filtering, multiple matching members, duplicates, and deciding between joining and blending.

Imagine that you have a dataset that shows population values by country and city:

/wp-content/uploads/2015/06/population_730696.jpg

If I apply a filter for 2014 (i.e. exclude 2013) and display Country by Population, I will see this in SAP Lumira:

CountryXPopulation.JPG
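
To make the filter-then-aggregate step concrete, here is a minimal Python sketch of what the visualization computes. The rows, the column order and the function name `population_by_country` are assumptions made up for illustration; they are not the values from the screenshot, and this is not how SAP Lumira is implemented.

```python
from collections import defaultdict

# Hypothetical rows: (country, city, year, population). Values are illustrative only.
population_rows = [
    ("Canada", "Vancouver", 2013, 600),
    ("Canada", "Vancouver", 2014, 610),
    ("Canada", "Toronto", 2014, 2800),
    ("France", "Paris", 2013, 2200),
    ("France", "Paris", 2014, 2250),
]


def population_by_country(rows, year):
    """Keep only the rows for one year, then sum population per country."""
    totals = defaultdict(int)
    for country, _city, row_year, population in rows:
        if row_year == year:
            totals[country] += population
    return dict(totals)


# Filter for 2014 (2013 rows are excluded), then aggregate by Country.
result = population_by_country(population_rows, 2014)
print(result)
```

The city-level rows collapse into one value per country, which is the aggregate table the chart displays.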

Suppose I also have a dataset of average net income values, broken down by country:

/wp-content/uploads/2015/06/netincome_730725.jpg

I can acquire this dataset into SAP Lumira and build a separate visualization of Countries by Avg Net Income:

CountryXNetIncome.JPG

How do I build a single visualization that shows populations and net incomes together? I can’t do a traditional merge / join on these two datasets because there is no way to determine a unique key value on which to merge. This is where data blending comes in handy.
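
One way to see the problem with a row-level join is the small Python sketch below. It assumes the Population dataset has several city rows per country, so Country is not a unique key, and it assumes the income measure would be summed after the join. All rows and names are hypothetical.

```python
# Hypothetical rows, for illustration only.
population_rows = [
    ("Canada", "Vancouver", 610),
    ("Canada", "Toronto", 2800),
    ("France", "Paris", 2250),
]
net_income_rows = [("Canada", 40000), ("France", 38000)]

income_by_country = dict(net_income_rows)

# Row-level join on Country: every city row receives its country's income value.
joined = [
    (country, city, population, income_by_country[country])
    for country, city, population in population_rows
    if country in income_by_country
]


def sum_income(rows, country):
    """Sum the income column of the joined rows for one country."""
    return sum(income for c, _city, _pop, income in rows if c == country)


# Canada has two city rows, so its single income value is counted twice.
canada_income_after_join = sum_income(joined, "Canada")
print(canada_income_after_join)
```

In this sketch Canada's income is duplicated across its two city rows, so a sum over the joined rows overstates it. Aggregating each dataset first and matching the aggregates afterwards avoids that duplication.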

I know that the Country dimensions of the two datasets contain common values. This gives me a condition on which to define a link between these two dimensions. SAP Lumira doesn’t know about this relationship yet, but I can:

  1. switch back to the visualization on the Population dataset (Country by Population values)
  2. change the dataset to the Net Income dataset; my object picker will change to show its respective objects, but the visualization is left unchanged. Notice that in this pre-blended state, the Chart Builder UI now shows Datasets In Use to indicate that the loaded visualization was built using a different dataset.
  3. add the Net Income measure from my Net Income dataset into my visualization to initiate a blend

See the animation of these 3 steps below:

BlendAnimation.gif

This deliberate action of bringing in an object from my Net Income dataset into my visualization based on the Population dataset initiates the workflow of constructing a blended visualization. Because SAP Lumira does not know that the two Country dimensions can be related to one another, a dialog will appear to request that I create, establish and confirm these dataset links.

step5-createDSLinks.JPG

One very important point here is that dataset links are defined at the document level, not per visualization. This means that the link created here between the Country dimensions of the Population and Net Income datasets will be reused if another blended visualization is created from the same two datasets. Each visualization, however, only uses the dataset link relationships that are present in the viz itself.
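
The document-level scope of links can be pictured with the sketch below. It is a hypothetical data model, not SAP Lumira's internal one: the class names `DatasetLink` and `Document` and the method `links_for` are invented here, and the rule "a link is used only when its linked dimension appears in the visualization" is my reading of the behaviour described above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetLink:
    """A link between one dimension in each of two datasets (hypothetical model)."""
    primary_dataset: str
    primary_dimension: str
    secondary_dataset: str
    secondary_dimension: str


@dataclass
class Document:
    """Links live on the document, so every visualization can reuse them."""
    links: list = field(default_factory=list)

    def add_link(self, link):
        if link not in self.links:
            self.links.append(link)

    def links_for(self, primary, secondary, dimensions_in_viz):
        """Return the stored links that apply to one visualization."""
        return [
            link for link in self.links
            if link.primary_dataset == primary
            and link.secondary_dataset == secondary
            and link.primary_dimension in dimensions_in_viz
        ]


doc = Document()
doc.add_link(DatasetLink("Population", "Country", "Net Income", "Country"))

# A second visualization on the same two datasets finds the link already defined.
by_country = doc.links_for("Population", "Net Income", {"Country"})
# A visualization that does not use Country has no applicable link in this model.
by_city = doc.links_for("Population", "Net Income", {"City"})
```

The link is created once and stored on the document; each visualization then picks out only the links relevant to the dimensions it actually shows.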

Once the dataset links are defined, SAP Lumira is capable of joining (via left outer join) the two aggregate tables into one blended visualization. The result is a single visualization that displays Population values from one dataset, matched with Net Income values from another dataset, based on a common Country dimension.

blend-populationCity.JPG

Notice that in this blend, the Country values are driven by the primary dataset: Population. This is why there is no value for US: US only exists in the secondary Net Income dataset, so its Country and Net Income information cannot blend into the primary Population dataset.
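
The blend described above (aggregate each dataset separately, then left outer join the aggregates on the linked dimension) can be sketched in a few lines of Python. The rows and the helper names `aggregate` and `blend` are assumptions for illustration, and the Germany row is added only to show an unmatched primary member; none of this is SAP Lumira's actual code.

```python
from collections import defaultdict

# Hypothetical rows, for illustration only.
population_rows = [          # primary dataset: (country, city, population)
    ("Canada", "Vancouver", 610),
    ("Canada", "Toronto", 2800),
    ("France", "Paris", 2250),
    ("Germany", "Berlin", 3500),
]
net_income_rows = [          # secondary dataset: (country, net income)
    ("Canada", 40000),
    ("France", 38000),
    ("US", 45000),
]


def aggregate(rows, key_index, value_index):
    """Sum one measure per value of the linked dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return dict(totals)


def blend(primary, secondary):
    """Left outer join: keep every primary key, attach the secondary value if any."""
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


population = aggregate(population_rows, 0, 2)
net_income = aggregate(net_income_rows, 0, 1)
blended = blend(population, net_income)
print(blended)
```

Because the primary dataset drives the result, US never appears in `blended`, while Germany, which has no income row in this made-up data, appears with an empty Net Income value.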

More to come! Expect to see posts diving into how values change depending on which dimensions are linked, how filtering is applied to a blend, and how we handle one-to-many or many-to-many relationships between primary and secondary dimension values. If you have any particular requests for detailed explanations, feel free to comment below!

Some key facts on Data Blending in SAP Lumira 1.27:
– supports offline data sources only
– blending is limited to one secondary dataset
– filters are still applied at a dataset level, and not yet applied post-blend (1.28)
– ranking is supported on primary measures and dimension contexts only

8 Comments

  1. Naveen Kumar Ketha

    In a future release, do you have plans to link datasets using other joins and multiple secondary datasets? It would be helpful to provide a feature to visualize the data model that shows links between multiple datasets.

    1. Kenneth Li Post author

      Absolutely! Multiple secondary datasets is definitely on our radar, as is blending on HANA views in online mode. Also we know that fellow “blenders” will want to have capabilities to understand how the data is coming together into a blend, so that feature set will also be a priority for us to conceive and deliver.

    1. Ashutosh Rastogi

      Join – When one of the columns in dataset1 is a foreign key to a candidate key in dataset2.

      Blend – You don’t have this constraint with Blend.

      Ashutosh
