The ACID House : What is an ‘In Memory Database’?
In Memory Computing, and in particular In Memory Databases, are the buzzwords du jour amongst the IT vendor community.
But what do they mean? A quick search yields the following definition:
In-memory computing is the storage of information in the main random access memory (RAM) of dedicated servers rather than in complicated relational databases operating on comparatively slow disk drives. In-memory computing helps business customers, including retailers, banks and utilities, to quickly detect patterns, analyze massive data volumes on the fly, and perform their operations quickly. The drop in memory prices in the present market is a major factor contributing to the increasing popularity of in-memory computing technology. This has made in-memory computing economical among a wide variety of applications (http://www.techopedia.com/definition/28539/in-memory-computing)
Sounds new and different right? No more creaky spinning platters – everything happening in memory – this sounds like a major technological change.
However, at least from the perspective of database systems, things are not so straightforward.
Looking at the leading Relational Database Management Systems, we can see that they are designed to complete operations in memory wherever possible, minimising reading from and writing to permanent storage ('disk'). The more memory you give one of these systems, the less it reads from and writes to disk. Reads are cached for later use and writes are minimised, with only the minimal information needed to meet the ACID requirements of a transaction actually being written to disk.
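To make that pattern concrete, here is a minimal, purely illustrative sketch (the class and method names are my own, not any vendor's API) of how an engine can keep data pages in RAM and force only a write-ahead log record to disk at commit time:

```python
import os

class TinyEngine:
    """Toy illustration of the buffer-pool + write-ahead-log pattern."""

    def __init__(self, log_path):
        self.pages = {}                  # in-memory buffer pool: page_id -> data
        self.log = open(log_path, "a")   # the only thing that must reach disk

    def write(self, page_id, data):
        self.pages[page_id] = data                # dirty page stays in RAM
        self.log.write(f"{page_id}={data}\n")     # log record buffered, not yet durable

    def commit(self):
        self.log.flush()
        os.fsync(self.log.fileno())               # only the small log record is forced to disk

    def read(self, page_id):
        return self.pages.get(page_id)            # served from memory, no disk read
```

The dirty data pages themselves can be written back lazily, long after the transaction has committed; the fsync'd log record alone is enough to satisfy durability.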
So do the new batch of 'In Memory Databases' work differently from this? Well, yes and no. SAP HANA, for example, has some very interesting storage structures which facilitate very fast processing on huge data sets; but this would seem to be more to do with its column store structures and parallel query execution than with being 'in memory'. It is still an ACID-compliant system which writes transactions out to 'disk', albeit SSD, a technology which can be used with any database system. Of course HANA loads all data into memory at start-up; but database systems (and database administrators) have been 'warming the cache' for decades.
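The column store point is easy to illustrate. This sketch (nothing to do with HANA's actual internals) shows the same tiny table in row-wise and column-wise layouts; aggregating one field in the columnar layout touches only that column's contiguous array, which is why column stores scan big data sets so quickly in memory:

```python
rows = [
    {"id": 1, "region": "EU", "amount": 100},
    {"id": 2, "region": "US", "amount": 250},
    {"id": 3, "region": "EU", "amount": 175},
]

# Row store: every whole row is visited even though we need one field.
total_row_store = sum(r["amount"] for r in rows)

# Column store: each column is a contiguous array, so the scan reads
# only 'amount', and the array is easy to compress and parallelise.
columns = {
    "id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [100, 250, 175],
}
total_column_store = sum(columns["amount"])

assert total_row_store == total_column_store == 525
```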
So what happens if we become a little less ACID? This opens up opportunities for significant performance gains in OLTP systems. Many database systems offer options here. For example, SAP ASE allows a transaction to be confirmed without the write going to disk (the 'D', Durability, in ACID) via its 'delayed commit' option. Sounds like a bad idea? It depends. In the event of a server crash you risk lost transactions. Not really suitable for a bank; potentially very suitable for an online retailer who needs performance to avoid losing customers, and can risk losing a handful of customer transactions in the rare event of a system failure.
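The trade-off behind delayed commit can be sketched in a few lines (this is a hypothetical illustration, not the ASE interface): with full durability the commit blocks until the log record has been fsync'd; with the delayed variant the flush happens later, so the client gets its acknowledgement sooner at the cost of a small window in which a crash loses the most recent transactions.

```python
import os

def commit(log_file, record, durable=True):
    """Append a transaction record; 'durable' mimics full vs delayed commit."""
    log_file.write(record + "\n")
    if durable:
        log_file.flush()
        os.fsync(log_file.fileno())  # full ACID: wait for the disk before acknowledging
    # durable=False: return immediately; the engine/OS flushes the buffer
    # later, trading a window of possible loss for much lower commit latency
```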
It’s also worth looking at the mix of transactions you have on a system, using full durability only where the business requires it. This can generate major performance benefits in busy OLTP systems.
Moving on from 'a bit less ACID', we can consider placing the entire database in memory without ever writing out to disk: transaction log and data pages alike. SAP offers this with Sybase ASE In-Memory Databases. Major performance gains can be delivered without having to change a line of code.
In summary, database systems love memory. They always have. The more memory, the less reading from disk. Optimised storage structures (e.g. column stores with 'direct access compression', as in HANA) allow far more data to reside in the same memory space, allowing queries across bigger data sets at in-memory speeds. Reducing writes to disk requires more consideration: moving away from fully ACID transactions by utilising delayed commits and relaxed-durability databases can yield real performance gains for OLTP systems.
Finally – replacing your spinning platters with SSD will benefit any database system. Is that ‘In Memory’?