So you may have been reading the SAP HANA Academy’s Cricket series over the past few weeks and wondered how we process the match data into SAP HANA in order to create our informative visualizations about the action on the pitch. In the video below, Tahir Hussain Babar (a.k.a. Bob) shows how we use SAP Data Services to extract, transform, and load the ball-by-ball Indian Premier Twenty20 Cricket match data we receive in XML format from Opta Sports into SAP HANA. This hands-on tutorial video walks through how SAP Data Services converts XML files into SAP HANA columnar tables by using loop and conditional workflows to carry out different development tasks.
The SAP Data Services process starts by checking a folder, which receives new files from Opta every minute while matches are in progress, for a specific file from the 15th over of the second innings. Once that file is in the folder, the job waits 30 minutes before it starts to process the files.
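For readers who think in code, the wait-then-process step can be sketched outside of Data Services. This is a minimal Python sketch, not the job Bob builds in the video: the function name `wait_for_trigger`, the trigger file name, and the `max_polls` and `sleep` parameters are all assumptions made for illustration. Only the one-minute polling interval and the 30-minute delay come from the description above.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def wait_for_trigger(
    folder: Path,
    trigger_name: str,                 # hypothetical file name; the real one is not given
    poll_seconds: float = 60.0,        # new files arrive every minute
    settle_seconds: float = 30 * 60,   # wait 30 minutes once the trigger is seen
    max_polls: Optional[int] = None,   # assumption: optional cap so the loop can stop
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `folder` until `trigger_name` exists, then wait before processing.

    Returns True once the trigger file has been seen and the settle delay
    has elapsed, or False if `max_polls` checks pass without seeing it.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if (folder / trigger_name).exists():
            sleep(settle_seconds)  # give the remaining files time to arrive
            return True
        polls += 1
        sleep(poll_seconds)
    return False
```

Passing `sleep` in as a parameter is only there so the logic can be tried out without actually waiting half an hour.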
First, we use an MS-DOS command to create a directory listing of all of the files for this specific game before we output the files into an individual folder. A loop then processes the files from each match in chronological order, based on their ID numbers, and unnests the XML. Any logic we want to add, such as transforming team IDs into team names, is applied in the query transform before the data is finally loaded into SAP HANA.
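The same three ideas (order the files by ID, unnest the XML, and swap team IDs for names) can be sketched in a few lines of Python. Treat everything specific here as an assumption: the file naming pattern, the `Innings`/`Over`/`Ball` element names, the attribute names, and the `TEAM_NAMES` lookup are invented for illustration and are not the actual Opta feed layout or the Data Services transforms.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical lookup; the real team IDs in the feed are not given in this post.
TEAM_NAMES = {"1": "Team A", "2": "Team B"}


def file_id(path: Path) -> int:
    """Extract the numeric ID from a name such as 'feed_12.xml' (assumed pattern)."""
    match = re.search(r"(\d+)", path.stem)
    if match is None:
        raise ValueError(f"no numeric ID in {path.name}")
    return int(match.group(1))


def ordered_files(folder: Path) -> list[Path]:
    """List the XML files sorted numerically by ID, so 2 comes before 10."""
    return sorted(folder.glob("*.xml"), key=file_id)


def unnest(xml_text: str) -> list[dict]:
    """Flatten nested <Over>/<Ball> elements into one flat row per ball."""
    root = ET.fromstring(xml_text)
    team_id = root.get("batting_team_id", "")
    rows = []
    for over in root.iter("Over"):
        for ball in over.iter("Ball"):
            rows.append({
                # Team ID to team name, falling back to the raw ID if unknown.
                "team": TEAM_NAMES.get(team_id, team_id),
                "over": int(over.get("number")),
                "ball": int(ball.get("number")),
                "runs": int(ball.get("runs")),
            })
    return rows
```

Sorting on the parsed integer rather than the file name matters: a plain alphabetical sort would put file 10 ahead of file 2 and break the chronological order.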
Also, in the video Bob introduces try-catch blocks, which catch errors in any part of the job and then use an SMTP function to send an email to the appropriate person with a notification that an error has occurred. Additionally, Bob covers if statements and shows how Data Services can create dimensions, output files as CSV, and generate email notifications when the job is finished.
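The try-catch-and-notify pattern looks much the same in any language. Here is a rough Python equivalent using the standard library's `smtplib` and `email.message`. It is a sketch under assumptions, not the Data Services job itself: the function names, the sender address, and the `localhost` mail server are placeholders.

```python
import smtplib
from email.message import EmailMessage
from typing import Callable


def build_alert(job: str, error: Exception, to_addr: str) -> EmailMessage:
    """Compose the notification email for a failed job."""
    msg = EmailMessage()
    msg["Subject"] = f"[{job}] failed: {type(error).__name__}"
    msg["From"] = "etl@example.com"  # placeholder sender address
    msg["To"] = to_addr
    msg.set_content(f"Job {job} raised an error:\n{error}")
    return msg


def send_via_smtp(msg: EmailMessage, host: str = "localhost", port: int = 25) -> None:
    """Send the message through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


def run_with_alert(
    job: str,
    step: Callable[[], None],
    to_addr: str,
    send: Callable[[EmailMessage], None] = send_via_smtp,
) -> bool:
    """Run `step`; on any error, email `to_addr` and report failure."""
    try:
        step()
        return True
    except Exception as error:
        send(build_alert(job, error, to_addr))
        return False
```

Wrapping the whole step in one handler mirrors the idea of catching an error anywhere in the job, at the cost of treating every failure the same way.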
Check out the video for even more insights on how we use Data Services in our cricket demo.
Match 46 Preview
With data loaded into SAP HANA, we can create insightful visualizations with SAP Lumira and SAP Crystal Reports to preview this afternoon’s clash between two teams desperate for a win to preserve their postseason chances. Having dropped their last three matches and fallen out of the top half of the table, the Sunrisers Hyderabad seek to get back on track when they host the suddenly streaking Royal Challengers Bangalore in the 46th match of the 2014 Indian Premier Twenty20 tournament. See how the Royal Challengers successfully overhauled the Sunrisers’ middling total a few weeks ago and find out how the teams fared against each other last season.
With SAP Lumira we can create a pair of pie charts that show the percentage of team runs each Sunriser and Royal Challenger batsman has scored this season. The Sunrisers will look towards their terrific troika of Shikhar Dhawan, Aaron Finch, and David Warner to power their innings. Dhawan’s 234 runs, Finch’s 291 runs, and Warner’s 375 runs are in teal, light green, and purple below, respectively. AB de Villiers and Yuvraj Singh, in the blue and pale green slices below, respectively, have scored 343 and 308 runs for the Royal Challengers. Bangalore’s lineup contains two potent yet currently underperforming batsmen in Chris Gayle and Virat Kohli.
By making a few simple modifications in SAP Lumira’s intuitive interface, we can show the seasonal bowling efficiency of each team’s bowlers by showing the percentage of runs they have conceded and adding depth to the pie chart for the number of balls, wides, and no balls they have delivered. In magenta below is the man currently wearing the Purple Cap as the tournament’s top wicket-taker, the Sunrisers’ Bhuvneshwar Kumar. Kumar leads the tournament with 18 wickets, and his 6.28 economy is the fourth-best rate among bowlers who have bowled more than 12 overs. Yuzvendra Chahal and Mitchell Starc, in pale yellow and light blue below, respectively, have been RCB’s most consistently efficient bowlers, but neither cracks the top 10 of the tournament’s leaderboard for economy rate, as both have been plagued by some troubling overs.
With SAP Crystal Reports we can create a graphic that compares the economy of the Sunrisers’ best bowler, Bhuvneshwar Kumar, to the strike rate of the Royal Challengers’ best batsman, AB de Villiers, broken down by the line and length of the deliveries they have bowled and struck. Many of Kumar’s yorkers and length balls have been very effective, with wide yorkers and off stump length balls earning him sub-3.40 economies. While de Villiers has strike rates above 100 for nearly every combination of line and length he has faced, he hasn’t been as effective against Kumar’s red-hot yorker and length ball zones.
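If you want to see the arithmetic behind a line-and-length breakdown, here is a small Python sketch. Economy is runs conceded per over (six balls) and strike rate is runs scored per 100 balls. The function name and the sample deliveries are made up for illustration and are not real match data. The sketch also assumes every delivery is a legal ball, so it ignores the special handling that wides and no balls would need.

```python
from collections import defaultdict
from typing import Iterable


def rates_by_zone(balls: Iterable[tuple[str, str, int]]) -> dict:
    """Group (line, length, runs) deliveries and compute both rates per zone."""
    totals = defaultdict(lambda: [0, 0])  # zone -> [runs, balls]
    for line, length, runs in balls:
        totals[(line, length)][0] += runs
        totals[(line, length)][1] += 1
    return {
        zone: {
            "economy": runs * 6 / n,        # runs per six-ball over
            "strike_rate": runs * 100 / n,  # runs per 100 balls
        }
        for zone, (runs, n) in totals.items()
    }


# Made-up deliveries, purely to show the shape of the input.
sample = [
    ("off stump", "yorker", 1),
    ("off stump", "yorker", 0),
    ("off stump", "length", 4),
]
zones = rates_by_zone(sample)
```

The two rates are the same ratio on different scales, which is why a zone that is cheap for the bowler is also a low strike-rate zone for the batsman facing him.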
Can some timely balls from Kumar help to snap Hyderabad’s free-fall? Or can the Royal Challengers’ potent lineup finally strike with their fullest potential?
Keep your eyes peeled for more visualizations created in SAP Lumira using historical Indian Twenty20 data from SAP HANA. Please check back regularly during the tournament for more insights about the daily action. Also, watch out for a series of tutorial videos on how we load the data into SAP HANA.
SAP HANA Academy – over 500 free hands-on tutorial videos on using SAP HANA.
Check out all of the Indian Premier Twenty20 Match by Match Analysis here.
SAP HANA Academy