It’s been a bit more than two years since SAP began its “In Memory” marketing push, starting with Hasso Plattner’s speech at Sapphire … or was it TechEd … my memory fails me 😉

It has been two years and I have yet to see a good understanding emerge in the SAP community about what SAP actually means when it talks about “In Memory”. I put the phrase “In Memory” into quotes, because I want to emphasize that it has a meaning entirely different from the standard English meaning of the two words “in” and “memory”. This is a classic case, best summed up by a quote from one of the favorite movies of my childhood:


Inigo Montoya: You keep using that word. I do not think it means what you think it means.

The only reasonably specific explanation of the “In Memory” term that I have seen from SAP is in this presentation by Thomas Zurek – on page 11.

If you want a coherent, official stance from SAP on “In Memory” and the impact of HANA on BW, I highly recommend reading and understanding this presentation. I think I can add a little more detail and ask some important questions, so here is my take:

Fact (I think…)

SAP is talking about at least 4 separate but complementary technologies when it says “In Memory”:

1. Cache data in RAM

This is the easy one, and is what most people assume the phrase means. But as we will see below, this is only part of the story.

By itself, caching data in RAM is no big deal. Yes, with cheaper RAM and 64-bit servers, we can cache more data in RAM than ever before, but this doesn’t give us persistence, nor does working on data in RAM guarantee a large speedup in processing for all data structures. Often, more RAM is a very expensive way to achieve a very small performance gain.

2. Column-based storage

Columnar storage has been around for a long time, but it was introduced to the SAP ecosystem in the BWA (formerly BIA, now BAE under HANA – gotta respect the acronyms) product under the guise of “In Memory” technology. The introduction of a column-based data model for use in analytic applications was probably the single biggest performance win for BWA; it followed in the footsteps of pioneering analytical databases like Sybase IQ, though that lineage went largely unacknowledged.

Interestingly, Sybase IQ is a disk-based database, and yet displays many of the same performance characteristics for analytical queries that BWA boasts. Further evidence that not all of BWA’s magic is enabled by storing data in RAM.
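To make the row-versus-column distinction concrete, here is a toy sketch (my own illustration, not BWA or HANA internals) of the same table in both layouts, and why an analytic aggregate favors the columnar one:

```python
# Hypothetical example: one table stored row-wise and column-wise.

rows = [  # row store: each record kept together
    {"customer": "A", "region": "EU", "revenue": 100},
    {"customer": "B", "region": "US", "revenue": 250},
    {"customer": "C", "region": "EU", "revenue": 175},
]

columns = {  # column store: each attribute kept together
    "customer": ["A", "B", "C"],
    "region":   ["EU", "US", "EU"],
    "revenue":  [100, 250, 175],
}

# A row store must touch every field of every record to sum one column...
total_row_store = sum(r["revenue"] for r in rows)

# ...while a column store scans one contiguous array and never reads
# the unrelated customer/region values at all.
total_col_store = sum(columns["revenue"])

assert total_row_store == total_col_store == 525
```

The point is that an analytical query typically reads a few columns of many rows, so keeping each column contiguous means far less data needs to be scanned – whether that data lives on disk (Sybase IQ) or in RAM (BWA).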

3. Compression

So how do we fit all of that data into RAM? Well, in the case of BWA the answer is that we don’t – it stores a lot of data on disk and then caches as much as possible in RAM. But we can fit a lot more data into RAM if it is compressed. BWA, and HANA, implement compression algorithms to shrink data volume by up to 90% (or so we are told).

Compression and columnar storage go hand-in-hand for two reasons:


  • Column-based storage usually sorts columns by value, usually at the byte-code level. This results in similar values being close to each other, which happens to be a data layout that results in highly efficient compression using standard compression algorithms that make use of similarities in adjacent data. Wikipedia has the scoop here:
  • When queries are executed on a column-oriented store it is often possible to execute the query directly on the compressed data. That’s right – for some types of queries on columnar databases you don’t need to decompress the data in order to retrieve the correct records. This is because knowledge of the compression scheme can be built into the query engine, so query values can be converted into their compressed equivalents. If you choose a compression scheme that maintains the ordering of your keys (like Run-Length Encoding), you can even do range queries on compressed data. This paper is a good discussion of some of the advantages of executing queries on compressed data:
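The second point can be sketched in a few lines of toy code (again, my own illustration rather than anything SAP has published): run-length encode a sorted column, then answer a range query directly on the compressed runs without ever reconstructing the original values.

```python
# Illustrative sketch of querying run-length-encoded data.

def rle_encode(sorted_values):
    """Compress a sorted column into [value, run_length] pairs."""
    runs = []
    for v in sorted_values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def count_in_range(runs, low, high):
    """Count values in [low, high] directly on the compressed runs --
    no decompression needed, because RLE preserves value ordering."""
    return sum(length for value, length in runs if low <= value <= high)

column = [10, 10, 10, 20, 20, 30, 30, 30, 30]
runs = rle_encode(column)                   # [[10, 3], [20, 2], [30, 4]]
assert count_in_range(runs, 15, 30) == 6    # the 20s and the 30s
```

Nine values collapse into three runs, and the range query inspects three runs instead of nine values – the same effect, at much larger scale, that makes compressed-domain execution attractive.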


4. Move processing to the data

Lastly, the BWA and HANA systems make heavy use of the technique of moving processing closer to the data, rather than moving data to the processing. In essence, the idea is that it is very costly to move large volumes of data across a network from a database server to an application server. Instead, it is often more efficient to have the database server execute as much processing as possible and then send a smaller result set back to the application server for further processing. This processing trade-off has been known for a long time, but the move-processing-to-the-data approach was popularized relatively recently as a core principle of the Map-Reduce algorithm pioneered by Google:

This approach is especially useful when an analytical database server (which tends to have high data volumes) implements columnar storage and parallelization with compression and heavy RAM caching, so that it is capable of executing processing without becoming a bottleneck.
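A toy comparison (made-up schema and sizes, purely for illustration) shows the trade-off: when the aggregate is pushed down to the database, a single number crosses the network instead of every matching row.

```python
# Illustrative sketch: what crosses the "network" in each approach.

rows = [("EU", i % 100) for i in range(100_000)]

# Option 1: the database ships every matching row to the app server,
# which then does the aggregation itself.
shipped_rows = [r for r in rows if r[0] == "EU"]   # 100,000 tuples cross the wire
app_total = sum(revenue for _, revenue in shipped_rows)

# Option 2: the database executes the aggregation and ships one value.
db_total = sum(revenue for region, revenue in rows if region == "EU")

assert app_total == db_total    # same answer, vastly different transfer cost
```

In both cases the answer is identical; the only difference is whether 100,000 tuples or one integer travels between the servers – which is exactly the cost the move-processing-to-the-data principle avoids.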


There are also a few technologies that I suspect SAP has rolled into HANA, but since they don’t share the detailed technical architecture of the product, I don’t know for sure.

1. Parallel query evaluation

Parallel query execution (sometimes referred to as MPP, or massively parallel processing, which is a more generic term) involves breaking up, or sometimes duplicating, a dataset across more than one hardware node and then implementing a query execution engine that is highly aware of the data layout and is capable of splitting queries up across hardware. Often this results in more processing (because it turns one query into many, with an accompanying duplication of effort) but faster query response times (because each of the smaller sub-queries executes faster and in parallel). MPP is another concept that has been around for a long time but was popularized recently by the Map-Reduce paradigm. Several distributed DBMSes implement parallel query execution, including Vertica, Teradata, and HBase.
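The scatter-gather pattern behind this can be sketched as follows (a speculative illustration, using threads as stand-ins for separate hardware nodes):

```python
# Sketch of parallel query execution: partition the data, run the same
# sub-query on each "node", then merge the partial results.

from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 101))
partitions = [data[i::4] for i in range(4)]   # round-robin split across 4 "nodes"

def sub_query(partition):
    """Each node answers the query over its own shard only."""
    return sum(v for v in partition if v % 2 == 0)

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(sub_query, partitions))

result = sum(partials)   # the merge step combines the partial results
assert result == sum(v for v in data if v % 2 == 0)
```

Note the duplication of effort mentioned above: four scheduler invocations and a merge step replace one sequential scan, but each sub-query touches only a quarter of the data, so wall-clock time drops when the shards live on separate hardware.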

2. Write-persistence-mechanism

Since HANA is billed as ANSI SQL-compliant and ACID-compliant, it clearly delivers full write-persistence. What is not clear is what method is used to achieve fast and persistent writes along with a column-based data model. Does it use a write-ahead-log with recovery? Maybe a method involving a log combined with point-in-time snapshots? Some other method? Each approach has different trade-offs with regards to memory consumption and the ability to maintain performance under a sustained onslaught of write operations.
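To show what the write-ahead-log option would look like in miniature – and to be clear, this is pure speculation on my part, not HANA’s actual design – here is a toy store that logs each change before applying it in memory, so a crash can be recovered by replaying the log:

```python
# Speculative sketch of a write-ahead log (WAL) over an in-memory table.

import json

class TinyWALStore:
    def __init__(self):
        self.log = []    # durable log (a real system would fsync this to disk)
        self.data = {}   # in-memory table

    def put(self, key, value):
        # 1. Append to the log first -- this is the durability point.
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        # 2. Only then apply the change to the in-memory structure.
        self.data[key] = value

    def recover(self, log):
        """Rebuild in-memory state by replaying a surviving log."""
        self.log = list(log)
        self.data = {}
        for entry in log:
            record = json.loads(entry)
            if record["op"] == "put":
                self.data[record["key"]] = record["value"]

store = TinyWALStore()
store.put("a", 1)
store.put("a", 2)

crashed = TinyWALStore()
crashed.recover(store.log)       # replay after a simulated crash
assert crashed.data == {"a": 2}  # latest committed state survives
```

Even this toy exposes the trade-offs mentioned above: the log grows without bound unless it is periodically truncated against a snapshot, and every write pays the logging cost up front – exactly the tensions a real implementation has to resolve under sustained write load.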


So, there are still a lot of questions about what exactly SAP means (or thinks it means) when it talks about “In Memory”, but hopefully this helps to clarify the concept, and maybe prompts some more clarity from SAP about its technology innovations. There is no denying that BWA was, and HANA will be, a fairly innovative product, but for people using this technology it is important to get past the facade of an innovative black box and understand the technologies underneath, and how the approach applies to the business, data, or technical problem we are trying to solve.




  1. Greg Chase
    While I’m not in a position to comment on what’s inside HANA (because I don’t know), I will say that your article is extremely well thought out and researched. This in and of itself makes it a very worthwhile read. Thank you for taking the time to post this!
  2. Witalij Rudnicki
    Hi Ethan. I read your post, and to me it is what *you* mean by what SAP means by “In-Memory” 🙂 And to be more precise, it is a list of multiple technologies used in BWA: column-oriented storage, MPP, compression. For SAP, “In memory” means simply: in memory.
    ICE, which is the correct (for the moment 😉) name for BAE or NewDB, is going to offer both column-oriented and row-oriented storage, running in both MPP and SMP versions. Compression techniques are used in row-oriented databases as well (e.g. 2-4X claimed by Advanced Compression in Oracle Database 11g EE), but it is the combination of dictionary-based coding, inverted indexes, bit coding and sparse compression that allows column-based data sets to be compressed ~10X [Taken from my session today at].
    I am constantly on the road and still have plans to blog about SAP HANA sometime soon. Stay tuned.
    Cheers. -Vitaliy
    1. Ethan Jewett Post author
      Hi Vitaliy,

      Thanks for the insightful comment! You are totally correct that this blog is what *I* think SAP means by “In Memory”. I tried to be clear about that, but hopefully if I failed this will clarify the situation.

      Regarding what “In Memory” means for SAP, I’ve struggled with this for a while and this blog is the result of that thought and investigation. It seems clear to me that when SAP uses the term “In Memory” in press releases such as this one, it is using it as a stand-in for all the technologies in BWA/ICE/BAE or whatever the acronym of the day happens to be 😉

      I really look forward to reading about your thoughts and experiences with HANA! Hopefully it will help me better understand the additional technologies SAP is using in that product.


  3. Viralkumar Patel
    Thanks for writing such an informative blog after such detailed research.
    I would say it’s an emerging technology and I have heard a lot about it, though I am still not fully clear on it.
  4. trupti agarwal

    HANA Memory Usage


    SAP HANA is a leading in-memory database and data management platform, specifically developed to take full advantage of the capabilities provided by modern hardware to increase application performance. By keeping all relevant data in main memory (RAM), data processing operations are significantly accelerated.

    “SAP HANA has become the fastest growing product in SAP’s history.”

    A fundamental #SAP #HANA resource is memory. Understanding how the SAP HANA system requests, uses and manages this resource is crucial to the understanding of SAP HANA. SAP HANA provides a variety of memory usage indicators, to allow monitoring, tracking and alerting.


