# DataGeek III Challenge: Major League Baseball (MLB) Analytics

I am writing this blog post to showcase the power of SAP Lumira for analyzing data and generating meaningful insights. I have used a Major League Baseball (MLB) dataset for this purpose. From this data, I have developed an SAP Lumira storyboard containing different visualizations that enable the user to analyze the performance of his/her favorite team(s) playing in the Major League.

Let me walk you through the story:

1) MLB Team Winning Statistics

First of all, any user who is interested in baseball would like to know the winning and losing statistics of his/her favorite teams. Using SAP Lumira, we can present this data clearly:

For instance, I’ve picked Baltimore and Boston to compare the total wins and losses of each team.

As we observe here, Baltimore has higher ‘Total wins’, ‘Home win percentage’ and ‘Road Win Percentage’, but its ‘On base percentage’ is almost equal to Boston’s.
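For readers who want to reproduce these measures outside SAP Lumira, here is a minimal Python sketch of how total wins and home/road win percentages can be derived from win-loss records. The team names, the records, and the `win_percentage` helper are all hypothetical and are not taken from the dataset behind the storyboard.

```python
def win_percentage(wins, losses):
    """Share of games won; returns 0.0 when no games were played."""
    games = wins + losses
    if games == 0:
        return 0.0
    return wins / games


# Hypothetical (wins, losses) records, split by home and road games.
teams = {
    "Team A": {"home": (50, 31), "road": (46, 35)},
    "Team B": {"home": (34, 47), "road": (37, 44)},
}

summary = {}
for name, record in teams.items():
    home_wins, home_losses = record["home"]
    road_wins, road_losses = record["road"]
    summary[name] = {
        "total_wins": home_wins + road_wins,
        "home_win_pct": win_percentage(home_wins, home_losses),
        "road_win_pct": win_percentage(road_wins, road_losses),
    }

for name, stats in summary.items():
    print(name, stats["total_wins"],
          round(stats["home_win_pct"], 3), round(stats["road_win_pct"], 3))
```

Splitting the record into home and road games is what makes the two percentages comparable side by side, which is the same comparison the visualization makes.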

2) MLB Batting Statistics

Next, I would like to know how all the teams are performing in terms of batting, so I have created a story page, as you can see below:

Let’s take a detailed look at one of the visualizations, comparing Baltimore with Boston, to get insight into why Baltimore performed well.

As is clearly visible in the graph above, Baltimore had more ‘Total Bases’ and ‘Hits’, whereas Boston did better in ‘Doubles’ and ‘Triples’.

The other visualizations show the same pattern: Baltimore’s Runs Scored, Runs Against and Runs Differential are all higher than Boston’s, and so is its batting average.
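The batting measures mentioned above follow standard baseball definitions, which can be sketched in a few lines of Python. The numbers below are made up purely for illustration and do not come from the MLB dataset used in the storyboard; the function names are my own.

```python
def total_bases(singles, doubles, triples, home_runs):
    """Each hit counts for the number of bases it is worth."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def batting_average(hits, at_bats):
    """Hits divided by at-bats; returns 0.0 when there are no at-bats."""
    return hits / at_bats if at_bats else 0.0


def run_differential(runs_scored, runs_against):
    """Positive when a team scores more runs than it allows."""
    return runs_scored - runs_against


# Hypothetical season line for one team.
singles, doubles, triples, home_runs = 900, 280, 25, 200
hits = singles + doubles + triples + home_runs
at_bats = 5500

print(total_bases(singles, doubles, triples, home_runs))
print(round(batting_average(hits, at_bats), 3))
print(run_differential(runs_scored=750, runs_against=680))
```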

3) MLB Pitching Statistics

Now, let’s compare the pitching performance of both teams using an area chart.

As I observe here, Boston allowed more ‘Hits’, ‘Doubles’ and ‘Home runs’ than Baltimore did. At the same time, it is interesting to note that both teams conceded about the same number of ‘Triples’ and ‘Intentional walks’.

4) MLB Park Factor Statistics

To get more insight into what influenced Baltimore’s high win percentage, I also want to look at the Park Factor Statistics.

As we can see, the above visualizations give us the real picture of how Park Factor has played an important role in team wins across the various parameters. A park with a ‘Park Factor’ greater than 1 is a hitter’s park, and one with a ‘Park Factor’ less than 1 is a pitcher’s park. When I compared the park factors for Hits, Doubles and Triples, I found the reason why Baltimore’s home runs and hits are higher in its home park. Yes, you are right to infer that Baltimore’s park is a hitter’s park and that the home team had the full advantage of that 🙂

Just as these visualizations helped us analyze Baltimore’s winning percentage against Boston’s, we can further use the filter capability of SAP Lumira to analyze many more scenarios, depending on what we want to know about the teams we are interested in.
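To make the "greater than 1 / less than 1" rule concrete, here is a small Python sketch. It assumes one commonly used runs-based definition of park factor (runs per game at home divided by runs per game on the road, counting both teams' runs); the dataset behind the storyboard may compute its park factors differently, and the numbers are hypothetical.

```python
def park_factor(home_scored, home_allowed, home_games,
                road_scored, road_allowed, road_games):
    """Runs per game in home games divided by runs per game in road games.

    Assumed definition for illustration only; both teams' runs are counted.
    """
    home_rate = (home_scored + home_allowed) / home_games
    road_rate = (road_scored + road_allowed) / road_games
    return home_rate / road_rate


def classify_park(factor):
    """Apply the rule from the text: above 1 favors hitters, below 1 pitchers."""
    if factor > 1:
        return "hitter's park"
    if factor < 1:
        return "pitcher's park"
    return "neutral"


# Hypothetical season totals for one team.
pf = park_factor(home_scored=400, home_allowed=380, home_games=81,
                 road_scored=350, road_allowed=340, road_games=81)
print(round(pf, 3), classify_park(pf))
```

The same classification can be applied per statistic (Hits, Doubles, Triples, Home Runs) by swapping runs for that statistic's home and road totals.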

