
The In-Memory Journey with Hasso Plattner – From Prophecy to Reality

Have you ever wondered what the world of enterprise systems looked like in the late 1960s? What has transformed in the first two decades of the 21st century? What was once the prophecy of a renegade of our times has started a renaissance in the world of enterprise computing. Let me take you down memory lane and through the in-memory computing journey from prophecy to reality with SAP co-founder and chairman Hasso Plattner.

 

“I have always lived in the world of these enterprise systems from day one on…” says Hasso in an auditorium at the Hasso Plattner Institute, nestled in the quaint town of Potsdam, while addressing his PhD students.

 

“200MB was a big drum-like disk…” – Reminiscing the 1960s and 70s

“I joined IBM in 1969, and IBM was the master of enterprise systems in the world then. Because the transactional volume in a company in those days was so large, the first objective was how we could condense data to be able to work with faster response times.” Reminiscing about 1969 as the beginning of online data processing via screens and typewriters, he continued, “The number one technique was to aggregate data by setting some criteria – in accounting we aggregate data by account, in sales by customer, in purchasing by supplier and by goods. The whole purpose was to aggregate data. Once we had the data aggregated, the management information systems (MIS) could then deliver relatively quickly answers to questions such as: what was the total value of stock, what was our open order book on the supply side, what orders have we committed to, what was our P&L statement and what does it look like. In reasonable response time, we got an answer. This technique became possible because in the 1968–69 timeframe IBM brought to market large disks of 70MB and then 200MB. 200MB was a big drum-like disk, so we could store a reasonable amount of data on disk with direct access. That’s the world we lived in from 1968 until the end of the 1970s. Those systems pretty much looked like the systems we have today, but less complicated. The basic idea was to get the transactions in and run some transactional processing, but the main purpose was to aggregate data for consumption later on by the MIS.”

 

The very essence of those enterprise systems was carried on with the birth of SAP, with R/2 in 1979 and R/3 in 1992. “R/3 took the world by storm” and led the transition from the mainframe concept to client-server. The hallmark was the 3-tier client-server architecture comprising an intelligent front end, an application server, and a separate database server, with the load of the system distributed quite evenly and providing a much higher-quality, much faster application – but the governing concept remained the same. “Transactional data in, run some transactional processing in up to 20 stages managed by workflow, and then aggregate data. Yes, we continued to aggregate data, and with the same criteria developed in the late 1960s. If these aggregates were not good enough, we took the data to a second storage called the business data warehouse (BW), where customers could further aggregate data by different dimensions defined in reports which ran in batch mode. With that last move we removed the paper from enterprise systems.”

 

“Because I am a renegade and I wanted to do something different…” – The Beginning of SAP HANA 

Towards the end of 2006, close to 40 years after SAP was started, Hasso began speculating about how he would redo an enterprise system if he were to start from scratch, and spent an entire day with his students explaining parts of how an accounting system worked then and what needed a redo. Hasso chuckles, “Because I am a renegade and I wanted to do something different, I made a proposal to completely get rid of all aggregates, 40 years of aggregates, which was too radical a proposal then. I then made a proposal for all systems to take the time away from the line items and leave only the day, and aggregate. For every single line item, take the time away.” Hasso was convinced of his hypothesis that eliminating the time of day would not result in any information loss from the transaction, as in normal enterprise systems it shouldn’t matter whether something happened at 9:23 in the morning or 3:30 in the afternoon. The challenge he set for his students was to build a database so fast that any aggregate the user community wanted could be built on the fly, within a reasonable response time, on transactional data. The exercise for the students was to look at z processing – meaning, for a given customer, looking down from the customer, get all orders, shipments, invoices and payments in a very short response time, running sequentially through a large database using secondary indices (what had already been done for 40 years). What started as a research project marked the beginnings of SAP HANA. The dominating ethos was “Change the system fundamentally and yet use the traditional database technique.”
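
To make the shift concrete, here is a minimal sketch in Python of the idea behind Hasso’s proposal, using invented line items and field names (they are not taken from any SAP schema): instead of maintaining pre-computed totals, any aggregate the user asks for is built on the fly from the transactional line items.

```python
from collections import defaultdict

# Hypothetical transactional line items; in the classic model a separate
# totals table would be maintained for every aggregation criterion.
line_items = [
    {"account": "4000", "customer": "C1", "amount": 120.0},
    {"account": "4000", "customer": "C2", "amount": 80.0},
    {"account": "5000", "customer": "C1", "amount": 45.5},
]

def aggregate_on_the_fly(items, key):
    """Build the requested aggregate directly from the line items,
    so no pre-computed aggregate has to be stored or kept up to date."""
    totals = defaultdict(float)
    for item in items:
        totals[item[key]] += item["amount"]
    return dict(totals)

# Any grouping criterion the user asks for is derived on demand:
print(aggregate_on_the_fly(line_items, "account"))   # {'4000': 200.0, '5000': 45.5}
print(aggregate_on_the_fly(line_items, "customer"))  # {'C1': 165.5, 'C2': 80.0}
```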

 

“When you restart thinking, the interesting thing is you might come to a different conclusion than you are used to…”

Hasso nostalgically continued to recollect and reflect on his realizations and revelations from that day in 2006 with his students – “If we build a system without pre-aggregates, do we really have updates in the system? Is the system really write oriented? Do we really need two different systems for enterprise computing – write oriented for transaction processing and read oriented for analytical processing?” Hasso came to a conclusion the same day – “It is not true that in an enterprise system the behavior of transactional systems and the behavior of analytical systems are so fundamentally different. There is no reason for having two different systems. It is a myth that OLTP systems need to be write intensive and OLAP systems need to be read oriented.” Further revelations dawned. “If we remove aggregates, there should not be much to update. If we remove aggregates, we can remove something else – can we avoid concurrency? And if we have no updates, there is no contention. If there is no contention, we don’t have to introduce any measures to avoid contention. We can run any transaction massively parallel – as parallel as a computer can run.”
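
That chain of reasoning can be pictured with a small, purely illustrative Python sketch (the data is invented, and CPython threads merely stand in for the massive parallelism of real hardware): corrections are written as new, compensating line items instead of updates, so readers never contend with writers and a scan can simply be split across workers.

```python
from concurrent.futures import ThreadPoolExecutor

# Insert-only ledger: a correction is a new, compensating entry rather than
# an update of an existing row, so there is nothing to lock.
amounts = [120.0, 80.0, 45.5, -80.0]  # the -80.0 reverses an earlier posting

def partial_sum(chunk):
    return sum(chunk)

def parallel_total(values, workers=4):
    """Split the value column into chunks and aggregate them concurrently;
    with no in-place updates, the readers need no locks at all."""
    size = max(1, len(values) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_total(amounts))  # 165.5
```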

Hasso had come a long way down memory lane: from the last running R/2 system, which shut down last year in the US (and was still running single-threaded updates), to the systems of 2006 (which did not have enough computing power to get all the data in, especially between 9 and 11 a.m., and whose response times were slow because transactions were locking each other), to imagining a database that can run as parallel as a computer can run – an in-memory database using a column store, not a row store.

 

Fast forward to the current decade. “Memory is the new disk…”

With SAP HANA, SAP changed the enterprise computing landscape. “Memory is the new disk.” The database is in memory. Tables are organized in columns: if a table has 300 attributes, it has 300 columns, with all attributes stored in columns that are scanned at high speed. Fully indexed. No more updates. Inserts are done massively parallel. Program complexity is reduced. No more aggregates, no redundancies, no indices – we use the aggregate vectors as indices. And through compression alone, systems become smaller: 5x compression means a 5x more efficient system. Hasso recalls that the father of MaxDB, Rudolf Munz, once an antagonist of Hasso’s research project, eventually described Hasso’s object of pursuit this way: “a relational column store fully in memory is a fully indexed relational database.”
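
As an illustration of the column-store ideas above, here is a toy dictionary encoding in Python (the column data is invented and the scheme is deliberately simplified): each distinct value is stored once in a dictionary, the column itself becomes a vector of small integers, and a filter turns into a dictionary lookup plus an integer scan.

```python
def dictionary_encode(column):
    """Replace each value with an integer ID into a sorted dictionary –
    the basic compression idea behind a columnar store."""
    dictionary = sorted(set(column))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in column]

countries = ["DE", "US", "DE", "FR", "DE", "US"]
dictionary, vector = dictionary_encode(countries)
print(dictionary)  # ['DE', 'FR', 'US']
print(vector)      # [0, 2, 0, 1, 0, 2]

# A filter such as country = 'DE' becomes one dictionary lookup plus a
# scan over the (much smaller) integer vector.
target = dictionary.index("DE")
matching_rows = [row for row, value_id in enumerate(vector) if value_id == target]
print(matching_rows)  # [0, 2, 4]
```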

 

“How do we multiply the tininess…”

Today, many customers are leveraging the innovations inherent in SAP HANA to realize reductions in hardware costs as well as significant improvements in response times. The systems are tiny now, but for Hasso the pursuit continues – “How do we multiply the tininess? How do we make it faster, with zero response time?” He quickly postulates “horizontally, dynamically, automatically partition tables,” which is now the fundamental concept behind the data aging mechanism in SAP HANA that Hasso tirelessly advocates both in academia and in the real world.

 

With SAP S/4HANA, through dictionary compression and data management techniques such as data aging, it has been proven time and again that customers can reduce their data footprint by 4x to 20x. Data aging is a mechanism for partitioning data into current data (data that is needed to run your business and is kept in memory) and historical data (data that is not frequently accessed and can remain on disk). SAP offers data aging both for technical objects (IDocs, change documents, workflows, application logs) and for application objects (sales orders, deliveries, billing, purchasing, etc.). The split into current and historical data not only reduces the footprint of data in memory, it also makes access to current data extremely fast: the smaller the data footprint of the current data, the faster scan and filter operations in SAP HANA will run.
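
The mechanics can be pictured with a small Python sketch; the cutoff rule, record layout, and file format below are assumptions for illustration, not SAP’s implementation. Rows older than a cutoff date move out of the in-memory “current” partition into a “historical” store on disk, and everyday queries touch only the current partition.

```python
import json
from datetime import date

# Invented order records: "current" stays in memory, "historical" goes to disk.
orders = [
    {"id": 1, "created": "2015-03-01", "amount": 500.0},
    {"id": 2, "created": "2018-06-15", "amount": 750.0},
]

def age_data(rows, cutoff, archive_path="historical_orders.jsonl"):
    """Partition rows into a current (in-memory) and a historical (on-disk) set."""
    current = []
    with open(archive_path, "a", encoding="utf-8") as archive:
        for row in rows:
            if date.fromisoformat(row["created"]) < cutoff:
                archive.write(json.dumps(row) + "\n")  # cold data goes to disk
            else:
                current.append(row)                    # hot data stays in memory
    return current

current_orders = age_data(orders, cutoff=date(2017, 1, 1))
# Routine queries scan only the small current partition, so they stay fast.
print(sum(o["amount"] for o in current_orders))  # 750.0
```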

 

We encourage our customers to take advantage of the innovations around in-memory data management techniques in SAP HANA. If you are already on your SAP S/4HANA journey, we want to be part of it – we want to help you age your data and realize the many benefits of an in-memory system. If you are planning to embark on your SAP S/4HANA journey, the recently launched SAP HANA Hardware Sizing Service lets you right-size your SAP HANA environment from the start.

 

The Pursuit is the Journey…

With Hasso and with SAP HANA, the pursuit is the journey. And we are just getting started. Hasso is quick to remind us of the golden rules:

“Never use select all.”

“Always specify exactly which fields you want to access.”
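
The reason behind these rules is easy to see in a columnar layout. The toy Python sketch below (with invented columns) shows that projecting only the fields you need touches only those columns, while a “select all” forces every column to be read.

```python
# A table stored column-wise: one list per attribute.
table = {
    "order_id": [1, 2, 3],
    "customer": ["C1", "C2", "C1"],
    "amount":   [120.0, 80.0, 45.5],
    "comment":  ["", "rush order", ""],  # wide, rarely needed attribute
}

def project(table, fields, row):
    """Read only the requested columns for one row; untouched columns stay cold."""
    return {field: table[field][row] for field in fields}

# "Always specify exactly which fields you want to access":
print(project(table, ["order_id", "amount"], row=1))  # touches 2 of 4 columns

# "Never use select all" – this reconstructs the full row from every column:
print(project(table, list(table), row=1))             # touches all 4 columns
```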

 

 

12 Comments
  • This is not just a story about revolutionizing technology in SAP but also about a person pursuing perfection persistently. It’s really fascinating to read.

  • Nice read… helps us understand how HANA is able to get the best of both the OLTP and OLAP worlds and provide analytics on live data, eliminating the need to ETL data from an OLTP system to an OLAP system for analytics – and thus eliminating the time lag between transactional data processing and analytics.

  • A nice trip down memory lane. Technical and business were never as close as before. Customers must reshape their current infrastructures in order to use their data effectively. Insights from this data can add enormous value to organizations, and historical data deserves attention in a real-time world. In many cases the strong desire to have this data was not matched by an equally strong reason for actually needing it. Developing technology while simultaneously designing a perceived need is fascinating.

  • Hi Priyanka,

    Thanks for sharing. It is great to know how the great adventure of SAP HANA started, what the initial thoughts were, and how simple those initial thoughts are! This again proves the theory that evolution sometimes means becoming simpler, not more complex. It is very inspiring that by restarting our thinking we might get a different solution to old problems. In-memory technology compared to pure disk-based technology is like the new era of electric cars for the car industry: compared to gasoline cars, it is much simpler and yet performs better. Proud to be part of SAP!

    Best Regards,
    Haichao

  • An article that describes the journey of SAP HANA as well as how a single idea for change has brought revolutionary change to the industry. Thanks Priyanka for sharing such a wonderful article that inspires us.

  • Thanks for the blog – I came to know that a big advantage of HANA is in doing aggregations on live transactional data (in an OLTP-OLAP model this is not possible, as data has to be ported from the transactional to the analytical db).
    I guess an application which gives useful aggregations directly on live data would tap the full potential of HANA.

  • A question, if I may, please:
    To have a database in memory is one thing. But what I would like to understand is how the database is synced to disk.
    How does disk-array replication work if the database is in memory?
    In other words, how is an RPO of zero realized by disk replication if the data is in memory and not on disk?
    Many thanks.

    • @ Herman: In Hana the data is on disk as well, in the transaction log. Hence you can rebuild the most recent state by going through the transaction log and replaying the transactions. There is no difference to traditional databases – they do the same thing during recovery. An asynchronously running database writer saves the database blocks to disk once in a while, and if there is a power loss, the transaction log contains the data that did not end up in the database blocks yet.

      The difference is that traditional databases have memory structures to cache the database blocks, which are disk optimized. So they organize the data in memory in a disk-optimized format. Does that make sense? No. But it is a side effect of the requirement to have some data on disk and some in memory, and a side effect of updating existing rows and hence various database blocks.

      Instead, if you decide that from now on all (hot) data should fit in memory, you can organize the data best for memory. In memory, writing to two totally different addresses has no performance penalty; in fact, this is preferred. With disk, the disk head has to move from one side to the other, which takes ages.

      In case you want to drill deeper, I once wrote up my journey towards Hana.
      https://blogs.sap.com/2014/11/28/all-databases-are-in-memory-nowarent-they/