“Success is 99 percent failure.” – Soichiro Honda.
During the 30 hours of non-stop bliss, we failed more times than we succeeded, and so did all the other teams. In the end, the lessons we learnt from those failures led us to produce a successful prototype while enriching our knowledge. Winning the trophy was a bonus.
Thinking back, these are the reasons why I attended InnoJam, in order of priority:
- If not for InnoJam, I might not have been able to get my hands on HANA for a few more years (unless I enrolled in a training course costing a few thousand dollars)
- For 30 hours I can have face to face interaction with the best SAP brains on the planet
- I can work on HANA without the hassle of having to install it
- Opportunity to meet like-minded people in the SAP community
- Feel and share the passion that inspires and motivates
- Access to the latest SAP technologies such as SUP and NetWeaver Gateway
How we did it
Our team worked on a Business Case called ‘Forecasting using Predictive Markets’. It was a great use case, which I initially thought would be too big for 30 hours. The passion and energy in the team proved me wrong.
After the initial design, we split the tasks by technology stream.
Tony and Wilbert discussing the initial design
Concept, ideas and presentation – Wilbert Sison and Rod Taubman
Streamworks/Gravity mentoring and BI – Nigel James and Paul Hawking
Mobile UI and NetWeaver Gateway – John Patterson and Glen Simpson
HANA and Social media integration – Sarat Atluri and Tony de Thomasis
The initial difficulty I faced was more related to my understanding of our business case and how to use the data from social media for our prototype. Rod and Wilbert took me through some really good real-world examples that gave me a lot of clarity. Once the idea and expected outcome were clear, only the realisation of the solution was missing (sounds very simple 🙂 ).
After dinner, every subgroup in our team got busy understanding, learning, discussing and implementing new technology to design and build their part of the solution.
While I was discussing the business scenarios with Rod and Wilbert, Tony managed to set up a machine with the HANA client and Modelling Studio.
Paul and Wilbert explaining the business scenario to me.
Hands on HANA
The HANA box that we used has around 8 cores and 500GB main memory.
We decided to use our limited and precious time on HANA in two ways.
- Get the bare minimum ready for the prototype to keep it going
- Without affecting the prototype, build as much functionality as we can, in and around HANA to get some hands on experience which could be integrated into the prototype
Getting the bare minimum ready
The initial task was to create tables in HANA. After working for many years in IT across a few database systems, now I have created my first transactional, column oriented database table. My first impression of the HANA studio User Interface is that it resembles a combination of NWDS and Microsoft SQL Server.
We tried to use the graphical tools to create and update table definitions, but felt that we may have to spend too much time learning how to use them, so we used them to a minimal extent and started using the SQL Script Editor. This gave us an opportunity to learn SQL Scripting. As a developer, I felt that I gained more control using SQL Script, compared to using the graphical tools.
While we were working on the data model, Paul created a decent dataset of some 10,000 rows for us in Excel. Now, the task was to load these records into our HANA tables. Onno Bagijn, the local SAP HANA expert, had mentioned earlier that he could help us load datasets from the HANA server. However, as he was already busy helping other teams, we decided to perform our own experimentation. After all, what better way to learn? A speedy HANA data load would grant more time to explore the other juicy parts of HANA.
Tony and Onno smiling after one of several HANA discussions
We loaded the records into a text editor, did a find/replace to create 10,000 insert SQL statements, then copied them into the SQL editor and hit the execute button. I had never tried to load so much data in this way on any database before, as I have always depended on data import services for large datasets. A combination of our extremely limited knowledge about HANA and the associated tools, limited authorisation over HANA (server, client and studio) and time constraints drove this decision, and it generated a brilliant learning opportunity.
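For readers who want to reproduce the find/replace step without a text editor, here is a minimal Java sketch of the same idea. It is not the code used at the event. The class name `InsertStatementBuilder` and the table name `BIDS` are hypothetical, and the sketch assumes simple comma-separated rows with no quoted commas. Every field is emitted as a quoted string, which relies on the database converting numeric values implicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns comma-separated rows into INSERT statements.
class InsertStatementBuilder {

    // Wraps every field in single quotes and doubles any embedded quote,
    // which is the standard SQL way to escape it.
    static String toInsert(String table, String csvLine) {
        String[] fields = csvLine.split(",", -1); // -1 keeps trailing empty fields
        StringBuilder sql = new StringBuilder("INSERT INTO " + table + " VALUES (");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('\'').append(fields[i].trim().replace("'", "''")).append('\'');
        }
        return sql.append(");").toString();
    }

    // Converts a whole file's worth of lines, skipping blank ones.
    static List<String> toInserts(String table, List<String> csvLines) {
        List<String> statements = new ArrayList<>();
        for (String line : csvLines) {
            if (!line.trim().isEmpty()) {
                statements.add(toInsert(table, line));
            }
        }
        return statements;
    }
}
```

Calling `InsertStatementBuilder.toInsert("BIDS", "1,BUY,10.5")` yields one statement per row, so 10,000 rows still mean 10,000 separate statements for the database to execute.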
The HANA UI soon froze and became totally unresponsive during the execution of the large number of insert statements. Once we terminated the Modelling studio instance and inspected the table in a new session, we discovered a distinct lack of data in the HANA tables. We also observed several automatically created HANA tables with a prefix pattern of ‘sys_’. These tables may have been part of a recovery operation caused by terminating the Modelling studio client in the middle of an update operation. Obviously, this would not be a recommended data load mechanism in a Productive situation.
Our investigation led us to the conclusion that the HANA database performs an implicit commit operation after each insert when processing SQL-based inserts. This caused a rapid performance decline on the HANA box, and a HANA restart was required to resolve the problem. Juergen Schmerder (SAP Technology Strategist) then helped with the more efficient data load process of copying the files to the HANA server and then importing them directly.
At this point, we had achieved our initial HANA goal, leaving us more time to work with and learn several other aspects of HANA for the rest of the first InnoJam event.
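In JDBC terms, a commit after every statement is what auto-commit mode does, and new JDBC connections start in that mode by default. The sketch below shows one common way to avoid it when loading rows from a client: switch auto-commit off, send rows in batches and commit once per batch. This is an illustration only, not what we did at the event (we used the server-side file import). The class name `BatchLoader`, the SQL text passed in and the batch size are all assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical loader: one prepared statement, one commit per batch.
class BatchLoader {

    // Inserts all rows and returns the number of commits issued.
    static int load(Connection conn, String insertSql, List<Object[]> rows, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // stop the commit after every statement
        int commits = 0;
        try (PreparedStatement ps = conn.prepareStatement(insertSql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final partial batch
                ps.executeBatch();
                conn.commit();
                commits++;
            }
        } catch (SQLException e) {
            conn.rollback(); // discard the uncommitted batch
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return commits;
    }
}
```

With a batch size of 1,000, a 10,000-row load would issue 10 commits instead of 10,000. The right batch size depends on the system and would need to be measured.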
Generating another Java process to load the HANA
Programming for/in/with HANA
With 10,000 rows of data, we may be able to run the prototype, but we felt that we needed much more data in order to properly showcase the value proposition of HANA.
Objectives for rest of the night
- Populate the HANA tables with a few million rows
- Develop some procedures in HANA to learn and experiment with SQL Script
- Emulate a situation where we continuously create rows (for Buy and Sell Bids) and perform a continuous scan of the rows to find matches (Bids for Settlements) and settle them
- Perform data analytics on the same table while these processes are running, to demonstrate the simultaneous transactional and analytical capabilities of HANA using large datasets
While showing us the Modelling Studio earlier in the evening, Onno Bagijn shed some light on the JDBC connection properties of the HANA Modelling studio. When we saw the JDBC drivers in the connection tab, it really got us excited as we could see the possibility of leveraging our existing Java knowledge to learn HANA.
We decided to develop Bid generation simulators using POJOs (Plain Old Java Objects) to pump data into HANA. As we did not have a Java development environment installed on that laptop, and rather than trying to install an IDE, we used the good old text editor to write Java code. We did not write any complex APIs, just a few Java classes with a main method, so we thought the basic text editor would do the job.
Rather than writing the SQL logic in Java, we started learning and writing SQL logic as procedures in HANA using SQLScript. We used Java to create random bid data and passed it to HANA procedures to persist the data. We did basic tests on the procedures from the Modelling studio and connectivity tests for our POJOs to talk to HANA.
Once we got them working, we decided to use these POJOs to populate the table. Rather than changing the Java class to include multiple threads, we created an infinite loop to call the HANA procedure with generated data. So now we had a Java instance that, in an infinite loop, added single rows to the HANA database one at a time.
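The shape of such a simulator can be sketched as follows. This is a reconstruction for illustration, not the code written at the event. The procedure name `CREATE_BID`, its three parameters, the `Bid` fields and the value ranges are all assumptions. The loop takes a row limit so that it can be stopped; passing `Long.MAX_VALUE` approximates the infinite loop described above.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Random;

// Plain Java object for one generated bid (fields are assumptions).
class Bid {
    final String side;  // "BUY" or "SELL"
    final int quantity;
    final double price;

    Bid(String side, int quantity, double price) {
        this.side = side;
        this.quantity = quantity;
        this.price = price;
    }
}

class BidSimulator {
    private final Random random;

    BidSimulator(long seed) {
        this.random = new Random(seed);
    }

    // Generates one random bid: quantity 1..100, price 10.00..100.00.
    Bid nextBid() {
        String side = random.nextBoolean() ? "BUY" : "SELL";
        int quantity = 1 + random.nextInt(100);
        double price = Math.round((10 + random.nextDouble() * 90) * 100) / 100.0;
        return new Bid(side, quantity, price);
    }

    // Calls a hypothetical stored procedure CREATE_BID(side, quantity, price)
    // once per generated bid, one row at a time.
    void pump(Connection conn, long maxRows) throws SQLException {
        try (CallableStatement call = conn.prepareCall("{call CREATE_BID(?, ?, ?)}")) {
            for (long i = 0; i < maxRows; i++) {
                Bid bid = nextBid();
                call.setString(1, bid.side);
                call.setInt(2, bid.quantity);
                call.setDouble(3, bid.price);
                call.execute();
            }
        }
    }
}
```

A `main` method would open a JDBC connection to HANA with the driver shipped with the client and then call `new BidSimulator(seed).pump(conn, Long.MAX_VALUE)`. The connection details are left out because they depend on the installation.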
We then launched multiple instances of the same Java class to quickly populate the HANA tables. After launching each new instance, we ran a simple row count query on the table to see how many rows were being created per minute, and noticed that this rate increased in a linear fashion up to 5 parallel Java instances. Additional Java instances beyond this point reduced the row creation rate. At this point we had achieved a load rate of around 12,000 rows per minute, and we estimated that we would reach our goal of 1 million rows in around two hours, just enough time for a short break to take a quick shower and breakfast and return by 8:00am on Sunday to continue the adventure.
Nigel and Paul enjoying a quick breakfast
We really didn’t care about best practices for pushing data into HANA (as a matter of fact, we didn’t even know what they were, as we had only been exploring and learning HANA for just under 12 hours). Our methods may not have been the most elegant, but we achieved our data loading goals. Our method would obviously change to suit the scenario:
- Data engine for a payment processing system?
- Data engine for social media application?
- Data engine for a heavily used cloud based application?
- Moreover, how can it handle insanity?
As they say, nothing is foolproof to a sufficiently talented fool.
When we returned at 8:00am we were disappointed to find only half a million rows in the HANA tables. The Java instances had slowed down and we were told that the HANA server was a bit slow. Rather than perform a root cause analysis, we decided to restart the HANA server and work with the data already loaded. Without speculating too much, we suspected that the HANA performance degradation was caused by multiple Java instances loading thousands of rows into the same table while reusing their connection objects. We would have loved to spend more time investigating proper connection handling and asynchronous delta merging to get a better understanding, but we were pleased that we could utilise 500,000+ rows, a far cry from the 10,000 rows we started with.
At this point, I continued refining the HANA simulation algorithm, whilst Tony created a Yahoo Pipes algorithm with Rui Nogueira to mine and aggregate Product sentiment on Social Media channels into a single RSS feed. We then created a Java client to cyclically read the Yahoo Pipes RSS feed and directly populate HANA tables.
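The reading half of such a client can be sketched with the XML parser that ships with Java. This is a minimal illustration rather than the client we built. The class name `RssTitles` is hypothetical, and the sketch assumes a plain RSS 2.0 document in which each `<item>` carries a `<title>`. Fetching the feed on a timer and writing the results to HANA are left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts item titles from an RSS 2.0 document.
class RssTitles {

    // Returns the trimmed <title> text of every <item> that has one.
    static List<String> itemTitles(String rssXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(rssXml)));
        NodeList items = doc.getElementsByTagName("item");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            NodeList titleNodes = item.getElementsByTagName("title");
            if (titleNodes.getLength() > 0) {
                titles.add(titleNodes.item(0).getTextContent().trim());
            }
        }
        return titles;
    }
}
```

Each returned title could then be passed to a HANA procedure or a batched insert, in the same way as the generated bids.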
Rui and Tony smiling after configuring the Yahoo Pipes
Presentation and the end of the event
Around lunch time on Sunday, we began pulling together the pieces of the puzzle, as the madness and excitement of working on HANA had prevented me from being deeply involved with the solutions built by my other team members. Rod, Tony, Nigel, Paul and Wilbert took care of the presentation, while Wilbert, John, Glen and I kept on exploring the tools.
Glen, John, Paul and Rod still exploring at the 11th hour
Throughout the whole event, the SAP team never slept. Their excitement was infectious and they kept up their professional help and good humour even when we asked the silliest of our questions.
I was also amazed by what the other teams had produced in such a tight timeframe. It was even more interesting when they described how many things didn’t work in their favour and how they worked their way around those issues to complete their respective solutions. Every single participant had an inspirational story to tell. Everyone involved did a great job and I feel proud and privileged to have participated in the event with so many talented people.
When you stand up at 3:00am to stretch your limbs and look around, you see so many wide awake SAP enthusiasts passionately doing their stuff. You get an overwhelming feeling that you are not alone; you belong to a crazy but equally passionate SAP ecosystem.
The 3 winning teams on stage