In-Memory Data Management 2014 Wrap-up
I just reached the final credits of In-Memory Data Management (2014) – Implications on Enterprise Systems and I’d like to share my thoughts on each session.
I won’t explain the content of each week (you can find a very good explanation here), but I’ll give my impressions: what I liked or learned. It’s totally personal; you may find other topics more interesting.
Lecture 1 – History of Enterprise Computing
When you hear a senior man with white hair talking about tape storage, you may think: “What am I doing? I just bought a very modern smartphone and I’m wasting my time listening to that man talk about storing information on… tapes?!” That’s not the case here. I always like to hear Plattner; it’s like a Jedi Master teaching. This introduction is very important for understanding the motivation behind in-memory databases and how they were born.
Lecture 2 – Enterprise Application Characteristics
In my ABAP classes I always teach about OLAP/OLTP and the paradigm of having separate machines with different tuning for each one. Here I learned a different story.
Lecture 3 – Changes in Hardware
How cheap memory, fast networks and affordable servers enable in-memory computing.
Lecture 4 – Dictionary Encoding
Here is one of the key points of SAP HANA: column storage. Plattner explains columnar storage and starts to talk about compression.
Lecture 5 – Architecture Blueprint of SanssouciDB
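To make the idea concrete, here is a minimal sketch of dictionary encoding for a single column. The function name and the sample data are my own assumptions for illustration, not taken from the course or from SAP HANA.

```python
def dictionary_encode(column):
    """Split a column into a sorted dictionary and an attribute vector."""
    dictionary = sorted(set(column))  # each distinct value stored once
    value_id = {value: i for i, value in enumerate(dictionary)}
    attribute_vector = [value_id[v] for v in column]  # one small integer per row
    return dictionary, attribute_vector


# Hypothetical sample column
names = ["Mary", "John", "Mary", "Anna", "John", "Mary"]
dictionary, attribute_vector = dictionary_encode(names)

# Decoding is a plain lookup of each value ID in the dictionary
decoded = [dictionary[i] for i in attribute_vector]
```

The repeated strings are replaced by integers, which is where the compression discussion starts.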
A very quick explanation about an academic and experimental database.
Lecture 1 – Compression
It’s another key point for SAP HANA. You will learn about compression techniques and yes, you will start to do some math to compare them.
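The kind of math involved can be sketched as follows for plain dictionary encoding. The row count, number of distinct values and value width are made-up numbers, and the helper names are hypothetical.

```python
import math


def uncompressed_bits(rows, value_bytes):
    """Size of a column that stores every value in full."""
    return rows * value_bytes * 8


def dictionary_encoded_bits(rows, distinct, value_bytes):
    """Size of the attribute vector plus the dictionary."""
    bits_per_id = max(1, math.ceil(math.log2(distinct)))  # bits per value ID
    attribute_vector = rows * bits_per_id
    dictionary = distinct * value_bytes * 8
    return attribute_vector + dictionary


# Assumed example: 1 million rows, 200 distinct values, 50 bytes per value
plain = uncompressed_bits(1_000_000, 50)
encoded = dictionary_encoded_bits(1_000_000, 200, 50)
ratio = plain / encoded
```

With these assumed numbers, each value ID needs only 8 bits instead of 400, so the column shrinks by a factor of roughly 50.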
Lecture 2 – Data Layout
More detail about row vs. column data storage. Pros and cons for each approach and a hybrid possibility.
Lecture 3 – Row v. Column Layout (Excursion)
Here Professor Plattner goes deeper into data layout.
Lecture 4 – Partitioning
As a geek, I had only heard about partitions when I wanted to install two operating systems on the same machine. Here I learned a very powerful technique to help parallelism reach higher levels.
Lecture 5 – Insert
Insert command. Under the hood.
Lecture 6 – Update
Lots of things to modify, reorder and rewrite.
Lecture 7 – Delete
Not deleted, just left behind.
Lecture 8 – Insert-Only
Worry about the future without forgetting the past.
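Here is a rough sketch of how I picture the insert-only idea, assuming each row version carries a validity interval. The class, the field names and the logical clock are all hypothetical; this is not how SAP HANA implements it.

```python
class InsertOnlyTable:
    """Toy table that never overwrites or physically removes a value."""

    def __init__(self):
        self.rows = []   # every version ever written
        self.clock = 0   # logical timestamp (an assumption of this sketch)

    def _tick(self):
        self.clock += 1
        return self.clock

    def _current(self, key):
        for row in self.rows:
            if row["key"] == key and row["valid_to"] is None:
                return row
        return None

    def insert(self, key, value):
        now = self._tick()
        self.rows.append(
            {"key": key, "value": value, "valid_from": now, "valid_to": None}
        )

    def update(self, key, value):
        # An update closes the old version and appends a new one
        now = self._tick()
        old = self._current(key)
        if old is not None:
            old["valid_to"] = now
        self.rows.append(
            {"key": key, "value": value, "valid_from": now, "valid_to": None}
        )

    def delete(self, key):
        # A delete only invalidates; the row is left behind
        old = self._current(key)
        if old is not None:
            old["valid_to"] = self._tick()

    def read(self, key, as_of=None):
        t = self.clock if as_of is None else as_of
        for row in self.rows:
            still_valid = row["valid_to"] is None or t < row["valid_to"]
            if row["key"] == key and row["valid_from"] <= t and still_valid:
                return row["value"]
        return None


table = InsertOnlyTable()
table.insert("price", 10)   # time 1
table.update("price", 12)   # time 2
table.delete("price")       # time 3
```

After these three operations the current read returns nothing, but `table.read("price", as_of=1)` still sees 10 and `as_of=2` sees 12, because both versions are still stored.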
Lecture 1 – Select
Projection, Cartesian Product and Selectivity. All the beautiful theory about retrieving data.
Lecture 2 – Tuple Reconstruction
Retrieving a tuple in a row database: piece of cake. Retrieving a tuple in a column database: pain in the …
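A tiny sketch of the difference, using made-up data and plain Python lists standing in for the two layouts:

```python
# Row layout: each tuple is stored together
rows = [("Anna", "Berlin", 31), ("John", "Paris", 45), ("Mary", "Rome", 28)]

# Column layout: each attribute is stored together
column_names = ("name", "city", "age")
columns = {
    name: [row[i] for row in rows] for i, name in enumerate(column_names)
}


def reconstruct(columns, position):
    """Rebuild one tuple by visiting every column at the same position."""
    return tuple(column[position] for column in columns.values())


from_row_store = rows[1]                      # one access
from_column_store = reconstruct(columns, 1)   # one access per column
```

In the row layout the tuple is read in one go; in the column layout the cost grows with the number of attributes.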
Lecture 3 – Scan Performance
Full table scan: row versus column layout. Show me the numbers!
Lecture 4 – Materialization Strategies
Materialization: when the attribute vector and dictionary come together to mean something. Here you will learn two strategies for materialization during a query: early and late materialization.
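The two strategies can be sketched like this for a query such as "cities of everyone older than 30". The column contents and function names are assumptions for illustration, and the age column is kept unencoded for brevity.

```python
# Dictionary-encoded city column (hypothetical data)
city_dictionary = ["Berlin", "Paris", "Rome"]
city_vector = [0, 1, 0, 2, 1]
age = [31, 45, 28, 52, 19]


def early_materialization(min_age):
    # Decode every row into real values first, then filter the tuples
    tuples = [(city_dictionary[c], a) for c, a in zip(city_vector, age)]
    return [city for city, a in tuples if a > min_age]


def late_materialization(min_age):
    # Work on positions only, and decode just the rows that survive
    positions = [i for i, a in enumerate(age) if a > min_age]
    return [city_dictionary[city_vector[i]] for i in positions]


early = early_materialization(30)
late = late_materialization(30)
```

Both return the same answer; the late variant simply postpones the dictionary lookups until only the matching positions are left.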
Lecture 5 – Differential Buffer
A special buffer to help speed up write operations. Do you remember the insert-only paradigm? This is the “worry about the future” part.
Lecture 6 – Merge
When the differential buffer is merged into the main partition. Do you remember the insert-only paradigm? This is the “without forgetting the past” part.
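A simplified sketch of a column with a main partition and a differential buffer. The class is hypothetical, and the buffer here is just an unencoded list, which is a shortcut of this sketch rather than the real structure.

```python
class BufferedColumn:
    """Read-optimized main partition plus a write-optimized buffer."""

    def __init__(self, values):
        self.differential_buffer = []
        self._rebuild_main(list(values))

    def _rebuild_main(self, values):
        self.main_dictionary = sorted(set(values))
        ids = {v: i for i, v in enumerate(self.main_dictionary)}
        self.main_vector = [ids[v] for v in values]

    def insert(self, value):
        # Writes only touch the buffer; the sorted main stays untouched
        self.differential_buffer.append(value)

    def values(self):
        # Reads have to look at both parts
        main = [self.main_dictionary[i] for i in self.main_vector]
        return main + self.differential_buffer

    def merge(self):
        # Fold the buffer into a freshly encoded main partition
        self._rebuild_main(self.values())
        self.differential_buffer = []


column = BufferedColumn(["Mary", "John"])
column.insert("Anna")
dictionary_before = list(column.main_dictionary)
column.merge()
```

Before the merge, "Anna" lives only in the buffer; afterwards the main dictionary is re-sorted and the value IDs of the existing rows change, which is why the merge is the expensive step.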
Lecture 7 – Join
Once you learn that retrieving a tuple in a column layout is a pain, you can imagine what doing a join is like. Here you will find out why.
Lecture 1 – Parallel Data Processing
A very good lesson about parallel data processing. The lecture and reading material try to cover both the hardware and software aspects of parallelism. MapReduce is a highlight; I highly recommend digging deeper into it.
Lecture 2 – Indices
Presenting the indices of indices: inverted indices. “Using this approach, we reduce the data volume read by a CPU from the main memory by providing a data structure that does not require the scan of the entire attribute vector.” (from the reading material, chapter 18).
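As a minimal sketch, an inverted index over an attribute vector maps each value ID to the positions where it occurs. The function name and sample vector are my own, and a plain dictionary of lists is only one possible representation.

```python
from collections import defaultdict


def build_inverted_index(attribute_vector):
    """Map each value ID to the list of row positions holding it."""
    index = defaultdict(list)
    for position, value_id in enumerate(attribute_vector):
        index[value_id].append(position)
    return dict(index)


# Hypothetical attribute vector of a dictionary-encoded column
attribute_vector = [2, 1, 2, 0, 1, 2]
index = build_inverted_index(attribute_vector)

# Lookup without scanning the whole attribute vector
positions = index[2]
```

The price is the extra memory for the index and the work of keeping it up to date.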
Lecture 3 – Aggregate Functions
Coming from the old-school ABAP generation, aggregate functions still make my ears itch. However, with the push-down concept everything changed. Can an old dog still learn new tricks?
Lecture 4 – Aggregate Cache
In the past everything was simple: storage on disk, cache in memory. Today it’s storage in memory and cache in… memory too!? Why do I need a cache with an in-memory database? To cache some pre-chewed data: that is the aggregate cache.
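My understanding of the idea, as a rough sketch: keep the aggregate of the stable main partition, and add the recent rows of the differential buffer on the fly. The class, its fields and the sample data are hypothetical, and a real implementation also has to handle invalidated rows and merges.

```python
def sum_by_group(rows):
    """Aggregate (group, amount) pairs into totals per group."""
    totals = {}
    for group, amount in rows:
        totals[group] = totals.get(group, 0) + amount
    return totals


class AggregateCache:
    def __init__(self, main_rows):
        self.main = list(main_rows)  # stable until the next merge
        self.delta = []              # recent inserts
        self.cached = None
        self.main_scans = 0          # counter just for this demo

    def insert(self, group, amount):
        self.delta.append((group, amount))

    def totals(self):
        if self.cached is None:
            self.cached = sum_by_group(self.main)  # expensive part, done once
            self.main_scans += 1
        result = dict(self.cached)
        for group, amount in sum_by_group(self.delta).items():
            result[group] = result.get(group, 0) + amount
        return result


cache = AggregateCache([("DE", 100), ("US", 50), ("DE", 25)])
first = cache.totals()
cache.insert("US", 10)
second = cache.totals()
```

The second call reflects the new row, yet the main partition was aggregated only once.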
Lecture 5 – Enterprise Simulations
Answering a query insanely fast is only part of the game. Now enterprise simulations are possible: change some variables and see the result. OK, it’s not that simple, but it’s awesome anyway!
Lecture 6 – Enterprise Simulations on Co-processors (Excursion)
The awesomeness of enterprise simulation, now with co-processors. Those born before the internet might remember the 387 co-processor; this is “almost” the same. In this presentation we see how co-processors can help with compute-intensive processing.
Lecture 1 – Logging
If you think that logging is just for checking what happened in the past, or for finding out who changed the value that caused yesterday’s biggest production incident, think twice. Logging has a very important role in the recovery process.
Lecture 2 – Recovery
The first thing everyone wonders when they learn about in-memory databases is: “What if the power goes down? Will all my database data be wiped out?” Here you learn that it’s true. But you also learn how an in-memory database overcomes that.
Lecture 3 – Replication
I remember a very simplistic definition of the ACID concept: “all or nothing in”. In this lecture we see the “all in” concept applied to in-memory databases: how to guarantee ACID in a database stored in RAM.
Lecture 4 – Read-only Replication Demo (Excursion)
Replication in action.
Lecture 5 – Hot-Standby
It’s a very hot topic (sorry… I won’t do it again). Hot-standby works together with replication in order to guarantee ACID. It’s a good opportunity for you to see why we can say that SAP HANA is a very beautiful piece of engineering.
Lecture 6 – Hot-Standby Demo (Excursion)
Hot-standby in action.
Lecture 7 – Workload Management and Scheduling
SAP HANA is all about speed, including user response. Professor Plattner explains the importance of having a very responsive system. Here is a quote that summarizes it: “we must have to answer to user in the same speed of Excel, otherwise the user will download the data to Excel and work there”.
Lecture 8 – Implications on Application Development
What are the implications for those special people who develop applications for users? Code push-down (moving business logic to the database) and stored procedures; yes, we’re still talking about ABAP. Those are the biggest paradigm shifts for ABAP developers.
Lecture 1 – Database-Driven Data Aging
Carsten Meyer explains new ideas about archiving and old data.
Lecture 2 – Actual and Historical Partitions
Cold data is not about aging, it’s about usage. ’Nuff said, Professor Plattner.
Lecture 3 – Genome Analysis
In-memory has huge implications beyond enterprise systems. Let me bring an excerpt from the “High Performance In-Memory Genome Data Analysis” reading material that can help dispel the idea that HANA is a luxury: “Nowadays, a range of time-consuming tasks has to be accomplished before researchers and clinicians can work with analysis results, e.g., to gain new insights”.
Lecture 4 – Showcase: Virtual Patient Explorer (Excursion)
Medical and patient stuff with lots and lots of information.
Lecture 5 – Showcase: Medical Research Insights (Excursion)
More medical and patient stuff with lots and lots of information.
Lecture 6 – Point-of-Sales Explorer
How the in-memory SAP HANA database helps sales analysis. Three tables and 8 billion rows. Featuring the Professor commenting on SAP HANA performance: “freaking unbelievable! People are scared!”
Lecture 7 – What’s in it for Enterprises (Excursion)
More benefits of using SAP HANA in the enterprise. Decisions can be made in real time.
Lecture 8 – The Enterprise Cloud (Excursion)
Bernd Leukert, member of the Executive Board of SAP, talks about how running a business in the cloud is much more than uploading your files to Dropbox.
As I said, these were my impressions of each session. I really enjoyed the training, and it’s helping me a lot to understand other SAP HANA trainings.
I consider it the cornerstone for anyone who decides to work with SAP HANA.