As HANA starts on its maturity curve, we as users of the technology need a clearer understanding of the potential speed bumps. HANA started as an in-memory database on which you can build analytical applications. Next up, from November, it becomes an optional DB for BW. Further down the road, it becomes the DB for the Business Suite, and pretty much everything SAP builds. That is the vision. But will we get enough benefit just by putting HANA in as the database?
First up: how does HANA speed up software apps? Primarily by making the application’s interaction with the database super fast. On top of holding data in memory, in columns, and using insert-only deltas, HANA can apparently also do massively parallel processing.
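To make "in columns" and "insert-only deltas" a bit more concrete, here is a toy sketch in Python. It is not how HANA is implemented; the class name `ColumnTable`, its methods, and the sample columns are all made up for illustration. It only shows the general idea: each column lives in its own list, writes are appended to a separate delta, and an aggregate reads just the one column it needs.

```python
class ColumnTable:
    """Toy column store (hypothetical, for illustration only)."""

    def __init__(self, columns):
        self.columns = list(columns)
        # Read-optimized main part: one list per column.
        self.main = {c: [] for c in self.columns}
        # Write-optimized delta part: append-only, one list per column.
        self.delta = {c: [] for c in self.columns}

    def insert(self, row):
        # Writes only append to the delta; nothing is updated in place.
        for c in self.columns:
            self.delta[c].append(row[c])

    def merge_delta(self):
        # Fold the accumulated delta into the main part and empty it.
        for c in self.columns:
            self.main[c].extend(self.delta[c])
            self.delta[c] = []

    def column_sum(self, column):
        # An aggregate reads a single column, across main and delta.
        return sum(self.main[column]) + sum(self.delta[column])


table = ColumnTable(["order_id", "customer", "amount"])
table.insert({"order_id": 1, "customer": "A", "amount": 100})
table.insert({"order_id": 2, "customer": "B", "amount": 250})
table.merge_delta()
table.insert({"order_id": 3, "customer": "A", "amount": 50})

total = table.column_sum("amount")
print(total)
```

The sum never touches `order_id` or `customer`, which is the basic reason a columnar layout suits analytical queries.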
So the first thing, obviously, is to pass HANA a SQL or MDX query that is tuned to how HANA works. This is not a simple task: optimizing BW or ECC to pass queries to HANA in an optimal way is quite complex. If you don’t trust me, just experiment with some of the MDX statements created by BW or BPC to get the hang of it. SAP has steadily improved this for RDBMS with every new release, and going by past experience, they will need some time to make it work for HANA too. I do expect SAP to get there fairly quickly by drawing on that RDBMS experience.
HANA is considered to be optimized for BI 4.0. That is probably true compared to other tools. But if you look at what gets transferred between HANA and the BI layer, it is not always very optimized. In some of the POCs, we had to write SQL by hand and point BI to that, instead of using standard BI functionality. Again, it will only get better with subsequent releases of both HANA and BI. The point is that it is not easy to write generic programs that can generate highly optimized SQL for many different cases. It is even harder in HANA’s case, since HANA itself is going through a lot of internal revisions.
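A small sketch of why the transfer between the database and the BI layer matters. I am using Python’s built-in `sqlite3` purely as a stand-in database, and the `sales` table and its rows are invented; none of this is HANA or BI 4.0 specific. It compares a generic "fetch everything, aggregate on the client" pattern with a handwritten query that pushes the aggregation down to the database.

```python
import sqlite3

# SQLite is only a stand-in here; the table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 100), ("EMEA", 200), ("APJ", 50), ("AMER", 300), ("AMER", 25)],
)

# Generic pattern: pull every row across and aggregate in the client.
fetched = conn.execute("SELECT region, amount FROM sales").fetchall()
client_totals = {}
for region, amount in fetched:
    client_totals[region] = client_totals.get(region, 0) + amount

# Handwritten pattern: let the database aggregate, transfer only results.
pushed = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()
db_totals = dict(pushed)

rows_transferred_generic = len(fetched)
rows_transferred_pushdown = len(pushed)
print(rows_transferred_generic, rows_transferred_pushdown)
```

Both approaches give the same totals, but the second one moves one row per region instead of one row per sale. With real volumes, that difference is what a generic SQL generator has to get right for every case.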
Moving on: what happens when data starts to scale? Say the compressed data cannot fit into one 2 TB box, and several such boxes need to be linked together. Now there is an overhead of moving data around in memory before it can be used. A similar overhead can show up within a single box too, since not all parts of memory can be processed with the same efficiency. So we need to understand all the nuances of scalability as well. And this needs to be demonstrated and benchmarked on commercially available hardware from multiple vendors.
Massively parallel processing can improve performance tremendously. But not all applications were written with this paradigm in mind. So if an existing application largely needs its instructions to be executed in a strict sequence, which is normal for a lot of applications, then there is not much that HANA can do to make it better. There is also an overhead to constantly allocating and deallocating memory, but this can probably be done in the background without affecting the actual processing of data. I am keen to find out how SAP does this, and hopefully someone from SAP can educate us on it.
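The limit on mostly sequential applications can be put into numbers with Amdahl’s law, which is a general result about parallel speedup and nothing specific to HANA. The sketch below is mine; the function name `amdahl_speedup` and the sample fractions (10% and 95% parallelizable work, 64 workers) are assumptions picked only to show the shape of the curve.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Theoretical speedup when only part of the work can run in parallel.

    parallel_fraction: share of the runtime that can be parallelized (0..1)
    workers: number of parallel workers (>= 1)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative numbers only, not measurements of any SAP application.
mostly_serial = amdahl_speedup(0.10, 64)    # 90% strictly sequential
mostly_parallel = amdahl_speedup(0.95, 64)  # 5% strictly sequential

print(round(mostly_serial, 2), round(mostly_parallel, 2))
```

With 64 workers, the mostly sequential case gains only about 1.1x, while the mostly parallel case gains about 15x. The hardware is the same in both cases; the difference comes entirely from how the application was written.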
What about applications that are already efficient at the database level, but are not optimal in, say, the ABAP layer? There is very little HANA can do in these cases. A lot of ECC transactions were written with multiple tables storing line items and aggregates separately. They probably have optimized DML already. But unless they are rewritten to use a leaner data model, there is only so much that HANA can do to help them. And rewriting a stable application is not an easy thing to do. So most probably SAP will need to keep the existing data model, and build a parallel schema that can use the power of HANA. This applies to BW and BPC too, since a lot of their processing happens in the ABAP layer.
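Here is a rough sketch of the two data models I have in mind, again with Python’s `sqlite3` as a stand-in. The tables `line_items` and `account_totals`, the helper functions, and the postings are all hypothetical and do not correspond to actual ECC tables. The first function maintains a stored aggregate on every posting; the second is the leaner model, which keeps only line items and computes the total when asked.

```python
import sqlite3

# Stand-in database; table names and data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_items (doc_id INTEGER, account TEXT, amount INTEGER)"
)
conn.execute(
    "CREATE TABLE account_totals (account TEXT PRIMARY KEY, total INTEGER)"
)


def post_with_aggregate(doc_id, account, amount):
    # Classic model: write the line item AND maintain a stored total.
    conn.execute(
        "INSERT INTO line_items VALUES (?, ?, ?)", (doc_id, account, amount)
    )
    updated = conn.execute(
        "UPDATE account_totals SET total = total + ? WHERE account = ?",
        (amount, account),
    ).rowcount
    if updated == 0:
        conn.execute(
            "INSERT INTO account_totals VALUES (?, ?)", (account, amount)
        )


def stored_total(account):
    row = conn.execute(
        "SELECT total FROM account_totals WHERE account = ?", (account,)
    ).fetchone()
    return row[0] if row else 0


def total_on_the_fly(account):
    # Leaner model: no aggregate table, just sum the line items on demand.
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM line_items WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]


post_with_aggregate(1, "4000", 100)
post_with_aggregate(2, "4000", 40)
post_with_aggregate(3, "5000", 7)

print(stored_total("4000"), total_on_the_fly("4000"))
```

Both give the same answer. The leaner model only pays off if summing line items on demand is fast enough, which is exactly what an in-memory column store is supposed to deliver, and an application still has to be rewritten to take advantage of it.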
To net it out, to me it means two things:
1. If HANA’s power needs to be demonstrated to the world, SAP and its ecosystem are better off writing applications that are designed from the ground up to be optimized for HANA.
2. As layers collapse (as Vishal Sikka explained in his keynotes), SAP has an opportunity to shorten the path to value realization. That requires every part of SAP’s development organization to reflect this in its execution, and that is not easy to pull off consistently in such a large organization.
I am sure that SAP has a solution (or a plan at least) for all of these. We just need to get educated on those solutions and plans.