Former Member

The Ultimate Data Geek Challenge – or Discover what’s great in Visual Intelligence

Nick Smith (@Nicfish) set out an Ultimate Data Geek Challenge at the beginning of September to “Let the rest of the world know your data skills”: http://scn.sap.com/docs/DOC-31008

Well, my data skills aren’t the greatest mathematically, so I wanted to use this challenge to push my understanding of SAP Visual Intelligence a few stages on... It sure did that.

I decided to use the USDA Nutrient Data set which, after a bit of googling, turned out to be the United States Department of Agriculture nutrient database. It holds the nutritional information of numerous food and beverage products.

Step 1 – Where do I start?

Well, with Visual Intelligence (VISI) it’s extremely simple to acquire data for analysis and chop out the columns you are not interested in.

Another great feature is the inbuilt enrichment of data. In this data set it is only used to create “Measures”, but it is also very useful for geographical and date enrichment: it will build out a Geography or Time hierarchy from a City or Date dimension field.

Step 2 – Have a look around the dataset

Looking around the data set, it was easy to notice that a lot of product hierarchy information is held in the Product Description field, separated by commas. Traditionally this would be a nightmare for a Web Intelligence report developer to work with, and would often need a service request to the DBA to split the field into multiple fields.

In Visual Intelligence this is an easy task for users and enables them to really take hold of their data: Split column by <Comma>.

With a few moments of renaming, the Description can be split out into multiple columns to aid analysis.
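For readers without the tool to hand, the same kind of split can be sketched in a few lines of Python. This is only an illustration of the idea, not how Visual Intelligence does it internally. The function name `split_description`, the choice of three levels, and the sample descriptions are all assumptions made up for the example.

```python
def split_description(description, levels=3):
    """Split a comma-separated description into a fixed number of columns."""
    parts = [p.strip() for p in description.split(",")]
    # Fold any surplus parts into the last column so nothing is lost
    if len(parts) > levels:
        parts = parts[:levels - 1] + [", ".join(parts[levels - 1:])]
    # Pad short descriptions with empty strings so every row has the same width
    parts += [""] * (levels - len(parts))
    return parts


# Illustrative description strings only (hypothetical, not copied from the data set)
rows = [
    "Cereals ready-to-eat, GENERAL MILLS, CHEERIOS",
    "Tea, instant, unsweetened, powder",
    "Butter, salted",
]

table = [split_description(r) for r in rows]
for row in table:
    print(row)
```

Note that the rows do not all have the same number of comma-separated parts, so a given column position will not necessarily mean the same thing for every product.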

Step 3 – Visualising to aid analysis

There is only so much analysis you can do by eyeballing a spreadsheet of many thousands of rows, so this is where visual analysis really aids understanding. If this Data Geek Challenge experience is anything to go by, one question leads to another and takes you to places in the data you never thought you’d end up.

Question 1 – What food group is “Bad for me”?

Question 2 – Just how bad?

Spotting the outliers in a bubble chart really helps to understand the exceptions.

But spin the axes around and a different picture forms.

Question 3 – What should I really not eat?

We all know sugar is not great for you, and at first look it seems to be intrinsic to what I consider to be “Bad for me”. But where should I try to lower my intake in my regular diet?

I’m not a big candy (sweets) eater but, as any parent knows, there are loads of different boxes of breakfast cereal in the house.

Should I be worried that my 4-year-old son has Cheerios every day for breakfast? It looks like yes.

Question 4 – What is the worst thing to eat for breakfast?

Well, how about Cheerios with instant tea instead of milk?

Question 5 – What should I eat for breakfast?

What’s high in calories but low in sugar that I should really consider eating for breakfast?

Wheat with corn beverage?

Appetising? ... Maybe not.

Summary

Well, did I think I’d end up banning my youngest child from eating Cheerios as a result of this challenge? Absolutely not. But this challenge has certainly enabled me to get close to SAP Visual Intelligence and to appreciate that getting closer to data, and putting the analysis in the hands of (dare I say) data scientists, or at least analysts, can only be a good thing.

      1 Comment
      Former Member
      Blog Post Author

      Part 2 – The Reprise

      I owe a lot to John Appleby @applebyj and @Ayooshha at Bluefin Solutions for getting me started in blogging. Their encouragement and, dare I say, persistence changed my attitude to social media. I have mentioned before in my blog that John’s 10 tips for getting started in blogging inspired me.

      This week I had to get to grips with number 11, quoted by @BoobBoo:

      11. Do not be afraid to be wrong, people will challenge you but if you have passion, good grace and knowledge the conversation will most likely be rewarding and informative.

      Yet again, great advice, as the conclusion in my blog is fundamentally wrong. This was kindly pointed out to me by Ethan Jewett @esjewett:

      Do you realize what you did in this exercise? You added up sugar for several different types of cheerios.  Regular cheerios only has about 4g of sugar. You added regular, banana nut, yogurt burst, and chocolate varieties together. Did the same for several other types of food as well, and for the food categories in the bubble charts….

      And yep, Ethan was bang on right.

      I made at least three fundamental mistakes:

      1. I assumed Cheerios was a product, not a brand. In my house my youngest son eats Cheerios; I had no idea that various products were made, including Banana Nut, Chocolate and Yoghurt.
      2. I didn’t drill down to the lowest level of granularity. If I had, I would have seen each individual product’s data values and not the summed amount at brand level (diagram below).
      3. I didn’t validate my conclusion. In haste I didn’t stop, think and validate. A good lesson learnt.
      4. I didn’t get the source data right. Breaking out the label field the way I did doesn’t give a consistent hierarchy: Level 3 for one product may be Level 4 or Level 2 for another.
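The first two mistakes can be reproduced in a small Python sketch. The figures below are placeholders, not real USDA values: only the 4 for the regular variety echoes Ethan's "about 4g", and the other three are made up purely to show the effect. The variable names are also hypothetical.

```python
from collections import defaultdict

# (brand, variety, sugar in grams): placeholder numbers, NOT real USDA data
products = [
    ("Cheerios", "Regular", 4.0),
    ("Cheerios", "Banana Nut", 10.0),
    ("Cheerios", "Yogurt Burst", 20.0),
    ("Cheerios", "Chocolate", 30.0),
]

brand_total = defaultdict(float)
brand_count = defaultdict(int)
for brand, variety, sugar_g in products:
    brand_total[brand] += sugar_g
    brand_count[brand] += 1

# What the chart showed: sugar summed across every variety of the brand
summed = brand_total["Cheerios"]

# What should have been checked: the value for each individual product
per_product = {variety: sugar_g for _, variety, sugar_g in products}

# If a brand-level figure is wanted at all, an average is less misleading than a sum
average = summed / brand_count["Cheerios"]

print("Summed at brand level:", summed)
print("Regular on its own:", per_product["Regular"])
print("Average per variety:", average)
```

With a sum, a brand looks worse simply because it has more varieties in the data set, which is exactly the trap I fell into.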

      So hopefully I have used the SAP Data Geek Challenge not only to deepen my understanding of SAP Visual Intelligence, but also to see a new side to the value of blogging, open conversation and the benefits of peer review.

      And one more thing: have a play with the data set yourself and let me know what I shouldn’t be eating for breakfast!