A journey into “The In-Memory Revolution” by Hasso Plattner and Bernd Leukert
I have read with great interest the book “The In-Memory Revolution” by Hasso Plattner and Bernd Leukert, and my goal is to convey the key ideas of the book to the SCN community in the form of a blog series.
While wondering how best to carry across the message of the book, I hesitated a lot over the format of these posts. The book itself does a marvelous job of explaining and summarizing the architecture of HANA, and of showing how this architecture allows SAP to rethink and redevelop enterprise software in ways previously unthinkable, giving multiple real-life examples along the way.
In order to apply this new knowledge, rather than just summarizing the book, I have elected to present my article in the form of an imaginary interview with a customer’s CIO who is unfamiliar with SAP HANA, and to take this voyage of discovery into the promises of the in-memory revolution together with him or her.
Part 1 – The in-memory revolution
CIO – So as you know, we have a big SAP footprint in production. Mostly ECC6, but also some older versions in some subsidiaries, and some non-SAP ERPs of course. There are things we’d like to improve, some processes that are too complicated, and it’s difficult to bring new users to the platform, but overall we’ve been very satisfied. It is the backbone of our enterprise. And now you come to talk to me about “revolution”, and I’m a little scared at the idea of having to “revolutionize” my IT platform. We’ve heard about new technology revolutions before.
Me – Of course, I understand your reservations when it comes to touching the backbone of your enterprise. But let me first explain the technology, and why we believe it’s a revolution. You’ll see it’s not just about a groundbreaking new technology; it’s also about being able to take advantage of it now, about bringing this new technology in a non-disruptive way.
CIO – You mean, not restarting from scratch?
Me – Absolutely. Even better, keeping your existing processes, but immediately bringing (potentially enormous) improvements to them.
CIO – How would that be possible?
Me – It all started with a question Hasso Plattner brought to his students at the Hasso Plattner Institute (HPI) in Potsdam: imagine you have a zero-response-time database. What would be the consequences for enterprise software?
CIO – But do you have a zero-response time database?
Me – Well, you’re right, let’s start from the beginning. Yes we do: it’s SAP HANA. It is the basis of everything.
CIO – How can you achieve that?
Me – Again, Hasso’s breakthrough idea: with a zero-response-time database, all the constructs of computer history – aggregates, the redundant tables linked to the usage of aggregates, caches, indexes – are not needed anymore.
Aggregates help to manage, but they impede change. Aggregates bring with them a static view of the enterprise; they cannot convey a dynamic one. New trends remain hidden within them.
Then here is the revolutionary idea: just keep transactional data, everything else will be calculated on demand.
The consequences are huge: it means OLTP + OLAP on the same system. It translates not only into faster reporting, but also into faster transactions, thanks to the removal of redundancies.
It allows SAP to concentrate on business logic, not on performance enhancing constructs.
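To make the “calculated on demand” idea concrete, here is a minimal Python sketch with invented toy data (these are not SAP table structures): instead of keeping a separate, transactionally maintained totals table, per-customer totals are computed by scanning the transactional line items whenever they are needed.

```python
from collections import defaultdict

# Hypothetical transactional line items: (customer, amount).
line_items = [
    ("ACME", 100.0),
    ("ACME", 250.0),
    ("Globex", 75.0),
    ("ACME", 50.0),
]

def totals_on_demand(items):
    """Compute per-customer totals by scanning the line items --
    no redundant aggregate table is maintained alongside them."""
    totals = defaultdict(float)
    for customer, amount in items:
        totals[customer] += amount
    return dict(totals)

print(totals_on_demand(line_items))
# → {'ACME': 400.0, 'Globex': 75.0}
```

With a slow database, such a scan per query would be prohibitive, which is exactly why aggregate tables were invented; with an in-memory columnar engine, the scan is cheap enough to run every time.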
CIO – All this sounds absolutely great, but it still doesn’t tell me how you achieve a zero-response-time database.
Me – You’re right. Here we’ve been taking advantage of progress and innovation in the hardware space. The key developments have been the rise of multi-core CPUs and 64-bit architecture. For example, today we can have servers with eight sockets of 15-core CPUs, and the largest HANA machines can accommodate up to 12 TB of RAM.
CIO – My ERP has more volume than this…
Me – Of course, I’ve just talked about hardware innovation so far. Let’s talk about the HANA architecture difference:
1- HANA is a columnar database: in an in-memory database with columnar-organized tables, only populated columns consume main memory. This especially reduces the data footprint of standard software, as no single customer uses all attributes.
2- Dictionary compression: data is encoded with the help of dictionaries and stored as memory-efficient integers. Additional compression techniques for numerical attribute vectors reduce the storage space even more.
3- No aggregates or additional indexes: because HANA is in-memory and delivers near-zero response times, it does not need redundant data – neither aggregates nor additional indexes that would require additional memory. Data is stored only at the highest level of granularity, and all aggregations are calculated on demand.
4- Data tiering: enterprise data can be split into actual and historical data. Seldom-accessed historical data can be stored on disk, while only actual data needs to be kept in memory at all times. This reduces the data footprint even further.
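Dictionary compression (point 2 above) is easy to illustrate. The following Python sketch shows the principle only, not HANA’s actual implementation: each distinct value of a column is stored once in a dictionary, and the column itself becomes a vector of small integers pointing into it.

```python
def dictionary_encode(column):
    """Encode a column as (dictionary, integer vector).
    Each distinct value is stored once; rows become small integers."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> index in the dictionary
    vector = []       # one integer per row
    for value in column:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        vector.append(positions[value])
    return dictionary, vector

# A column with few distinct values compresses very well.
country = ["DE", "FR", "DE", "DE", "US", "FR"]
dictionary, vector = dictionary_encode(country)
print(dictionary)  # → ['DE', 'FR', 'US']
print(vector)      # → [0, 1, 0, 0, 2, 1]
```

The longer the column and the fewer its distinct values, the bigger the saving – and scans and aggregations can then operate directly on the compact integer vector.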
CIO – Interesting. Do you have examples of such database footprint reduction?
Me – Absolutely, we (SAP) were among the first to take advantage of our new product. Our ERP volume on anyDB was 7.1 TB. Simply by moving to HANA (what we call Suite on HANA, that is, without any changes to the data model), the volume went down to 1.8 TB. It was an absolutely non-disruptive move, essentially a database migration. But this is only the beginning: SAP has taken advantage of the HANA platform to redevelop its flagship ERP – this is S/4 HANA, and we’ll talk about it in detail later – by removing all aggregates from its data model. With that, we went down to 0.8 TB, 0.2 TB of which is actual data. As a result, the actual partition is reduced to about 3% of the size of the original ECC6 system.
CIO – Wow! That’s impressive but before moving to the actual ERP, I’d like to understand more things about HANA. You mentioned earlier OLAP and OLTP on the same platform. Isn’t that contrary to the established practice?
Me – Precisely! Another revolution. Most “established practices” come from earlier hardware and software limitations. Hasso’s paradigm is to restart from zero, but with today’s hardware and with 40 years’ experience of developing enterprise software. OLTP was separated from OLAP because of performance issues. Once those issues are taken care of, there are no more obstacles to bringing OLAP back to where it belongs. Again, it’s all about simplification and removing redundancies. The immediate consequence is being able to do real-time reporting on the actual data.
CIO – Indeed, this is great for reporting and read-only applications, but won’t database locks keep impacting transactional speed?
Me – Indeed, it could have remained an issue, but we worked on this as well. First, the removal of transactionally maintained aggregates means a sharp drop in database update operations. Updates are not only costly, but can cause data inconsistencies and database locks. We have chosen instead to perform updates as insert-only operations, meaning that the old entry is invalidated and a new entry is inserted. This not only prevents database locks but also allows application simplification.
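A minimal Python sketch of the insert-only principle (the table layout and field names are invented for illustration, not HANA internals): an update never modifies a row in place; it marks the current version as no longer current and appends a new version.

```python
import itertools

_next_version = itertools.count(1)  # monotonically increasing version numbers
table = {}  # key -> list of row versions, oldest first

def insert(key, value):
    """Append a brand-new current version of the row."""
    table.setdefault(key, []).append(
        {"version": next(_next_version), "value": value, "current": True})

def update(key, value):
    """Insert-only update: invalidate the old entry, insert a new one.
    No existing row is ever modified in place."""
    for row in table[key]:
        row["current"] = False
    insert(key, value)

def read(key):
    """Return the value of the newest current version."""
    return next(r["value"] for r in reversed(table[key]) if r["current"])

insert("order-1", {"qty": 5})
update("order-1", {"qty": 7})
print(read("order-1"))          # → {'qty': 7}
print(len(table["order-1"]))    # → 2  (both versions are kept)
```

Keeping every version also gives the history of the row for free, which is part of what enables the data-tiering split between actual and historical data mentioned earlier.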
CIO – Nicely thought out. So you said the first step on this journey is to migrate to HANA, but surely in-memory alone won’t give me all these benefits?
Me – You’re correct: in-memory alone will above all speed up reporting. To leverage even more of its benefits, the enterprise software itself can be optimized to take full advantage of the new technology, which brings us to S/4 HANA.
Part 2 – S/4 HANA, the agile ERP