
The SAP Analytics team once again set up a cool Viz-a-Thon event on the sidelines of SAP TechEd 2015 in Las Vegas. This was a great opportunity to learn or apply Lumira skills for a good cause. This year, Made in a Free World was the non-profit organization willing to open its data books and allow the 50-some volunteers to visualize the risk score in two datasets: jeans and tablet computers. Within the (very short) 90-minute timeframe, our “Alexa’s Super Awesome Team” was able to put together a compelling story (more about the name here).

We received lots of comments on our use of the Sankey Diagram custom extension, so I’ll detail here how we built it in such a short timeframe so that you, too, can leverage it within your organization.

For your convenience, you can find the source CSV data file, the codebook, the extension and the final Lumira file here.

Sankey Diagrams


First of all, let’s give credit where credit is due: Dong Pan was one of the Data Geniuses supporting the teams during the event. We reused his extension, which is already well documented here: Lumira Visualization Extension – Sankey Diagram

According to Wikipedia, “Sankey diagrams are a specific type of flow diagram, in which the width of the arrows is shown proportionally to the flow quantity. Sankey diagrams are typically used to visualize energy or material or cost transfers between processes.”

http://ouseful.files.wordpress.com/2012/05/sankeydiagram-d3js.png?w=830

Example from Ouseful.Info

Activate the Sankey Diagram Extension


This step used to be very complicated. From Lumira 1.28 on, extensions can be installed directly from within the application. From GitHub or from the link above, download the Sankey Diagram Extension for SAP Lumira (pan.viz.ext.sankey.zip).

Photo_13.png

Open SAP Lumira. Under File -> Extensions, click on Manual Installation. Locate your file. Restart Lumira. Et voilà!

Prepare: the Made in a Free World dataset

In order to reduce the risk of employing forced labor in an organization’s supply chain, Made in a Free World has developed a risk score.

Download the attached sample dataset for tablet computers. In Lumira, click on File -> New, select Text (csv) and locate your CSV file. Take a look at the data. You may also want to take a look at the well-documented codebook. In particular, notice how the “path” column, combined with the “depth” column, in fact describes a tree. The “score” column indicates the risk and is a number between 0 and 1.


Photo_1.png


For the Sankey Diagram, we need to decompose the path and only keep the last component as the “Target” and the second-to-last as the “Source”. Ideally, this would be available in separate columns, but it’s not the case in this particular dataset. Therefore, we’ll have to extract it.
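Outside of Lumira, the same decomposition can be sketched in a few lines of Python. This is only an illustration of the idea, not what we did during the event: the function name and the sample rows are made up, and the ":#:" separator is the one used in the split step of this post.

```python
SEPARATOR = ":#:"  # separator found in the "path" column of this dataset


def source_and_target(path):
    """Return (second-to-last, last) component of a path.

    Returns None when the path has fewer than two components,
    because such a row cannot form a Sankey link.
    """
    parts = path.split(SEPARATOR)
    if len(parts) < 2:
        return None
    return parts[-2], parts[-1]


# Hypothetical rows for illustration only; not taken from the real dataset.
rows = [
    "Tablet:#:Battery",
    "Tablet:#:Battery:#:Lithium",
    "Tablet",
]
pairs = [source_and_target(p) for p in rows]
print(pairs)
```

Taking the last two components directly avoids the intermediate columns, which is why a script can be shorter than the point-and-click steps.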

To make our life easier, we’ll focus on levels 2 and 3 of the hierarchy. Still in the “prepare” tab, click on the gear next to the “depth” column and select Filter. Make sure only levels 2 and 3 are selected.

Photo_2.png

Photo_3.png

Select the “path” column. At the bottom right, click on “Split” and enter “:#:”. Notice that Lumira will create additional columns “path(2)” and “path(3)”. The path(2) column now contains the first level of components. The path(3) column still contains multiple levels. Repeat the split on the remaining column as often as needed.

Photo_4.png

Since the network doesn’t always have the same number of levels, you will end up with half-filled columns.

Photo_5.png

In order to calculate the Source and Target, we only need to set a priority in the columns. This is easily achieved with calculated dimensions. Click on the gear next to any column and select “Create Calculated Dimension”.

Photo_6.png

Give it a name (e.g. Source) and enter a formula like this one:

if (IsNotNull({path (5)})) then {path (4)} else {path (2)}

Photo_7.png

Reproduce the same for the Target calculated dimension.

if IsNotNull({path (5)}) then {path (5)} else {path (4)}

Photo_8.png
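To double-check the logic of the two formulas, here is a rough Python equivalent. It rests on assumptions that are not spelled out above: that each split cuts at the first separator only, that the second split (on “path (3)”) produces the “path (4)” and “path (5)” columns, and that the filtered rows have two or three components. The helper names are hypothetical.

```python
SEPARATOR = ":#:"


def split_once(value):
    """Split at the first separator and return (head, rest).

    rest is None when no separator is left, which mimics the
    half-filled (null) columns seen after the split.
    """
    if value is None:
        return None, None
    head, sep, rest = value.partition(SEPARATOR)
    return head, (rest if sep else None)


def source_target(path):
    """Apply the same priority rule as the two calculated dimensions."""
    path_2, path_3 = split_once(path)    # first split, on "path"
    path_4, path_5 = split_once(path_3)  # second split, on "path (3)" (assumed naming)
    # Source: if IsNotNull(path 5) then path 4 else path 2
    source = path_4 if path_5 is not None else path_2
    # Target: if IsNotNull(path 5) then path 5 else path 4
    target = path_5 if path_5 is not None else path_4
    return source, target


# Made-up paths: one with two components, one with three.
print(source_target("A:#:B"))
print(source_target("A:#:B:#:C"))
```

In both cases the Target is the deepest filled column and the Source is the one just before it, which is exactly the pair the Sankey Diagram needs.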

Now that we’re done with the Prepare step, let’s Visualize!

Visualize the Sankey Diagram


In the Visualize window, click on the “Chart Extensions” button and choose “Sankey Diagram”.

Photo_9.png


In the “Measures” section, choose “Score” as a value. In the “Dimensions” section, select “Source” and “Target”.


Photo_10.png
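Conceptually, the chart needs one value per Source/Target pair to set the width of each link. A minimal sketch of that aggregation follows, assuming the scores of rows sharing the same pair are summed; whether the extension sums or aggregates differently depends on how the measure is configured, so treat this as an assumption. The rows are invented for illustration.

```python
from collections import defaultdict


def link_weights(rows):
    """Sum the score per (source, target) pair.

    rows is an iterable of (source, target, score) tuples.
    """
    totals = defaultdict(float)
    for source, target, score in rows:
        totals[(source, target)] += score
    return dict(totals)


# Hypothetical rows; scores are between 0 and 1 as in the dataset.
rows = [
    ("Tablet", "Battery", 0.25),
    ("Tablet", "Battery", 0.5),
    ("Battery", "Lithium", 0.75),
]
weights = link_weights(rows)
print(weights)
```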


The result should look something like this:


Photo_11.png


Compose and Conclusion

In the Compose section, you can combine your visualization with images and text. Our result looked like this.

Photo_12.png

What is interesting in this particular dataset / visualization is that some sub-components are shared between the main components. This is the case for “Integrated Circuits”, “Resistors” and “Capacitors”. A more traditional drill-down or tree option may not have been able to show this particular relationship.
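If you want to find such shared sub-components without eyeballing the chart, one possible approach is to look for Targets that appear under more than one Source. The sketch below uses invented links (only the name “Capacitors” comes from the dataset) and a hypothetical helper name.

```python
from collections import defaultdict


def shared_targets(links):
    """Return the targets fed by more than one distinct source.

    links is an iterable of (source, target) pairs; the result maps
    each shared target to its sorted list of sources.
    """
    sources_by_target = defaultdict(set)
    for source, target in links:
        sources_by_target[target].add(source)
    return {
        target: sorted(sources)
        for target, sources in sources_by_target.items()
        if len(sources) > 1
    }


# Hypothetical links for illustration only.
links = [
    ("Display", "Capacitors"),
    ("Motherboard", "Capacitors"),
    ("Motherboard", "Processor"),
]
shared = shared_targets(links)
print(shared)
```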

Your turn, now. Go ahead and give it a try. Use the comments section to share how you did with the provided dataset or with your own.


4 Comments


  1. Robert Montage

    Great work but the following link doesn’t work:

    For your convenience, you can find the source CSV data file, the codebook, the extension and the final Lumira file here.

