Immutability
Another day, another flood of articles, blogs, PR pieces and opinions about blockchain.
I am no blockchain expert – but I do have a bit of an idea what it is all about and why people are so excited about it. Having said that, blockchain is clearly still in the realm of a solution looking for a problem – if you exclude crypto-currency as the problem.
My BS detector goes off big time when this “peer-to-peer network” of “independently controlled nodes” where “every participant is equal to all others” and “there is no central authority” is being flogged by the very same central authorities it was designed to bypass.
However, one attribute of blockchain is very interesting: immutability.
Immutability means that once a block is created and inserted into the blockchain it cannot be modified without the consensus of the blockchain participants.
This is a very useful characteristic. Take, for example, a bank account as shown in this sample statement. (Source)
Business Cheque Statement
Customer Number |
Account Summary |
Opening Balance | $6,713.87 |
Total credits | + $1,510.28 |
Total debits | – $5,070.38 |
Closing Balance | + $3,153.77 |
Account enquiries: call MyBank Telephone Banking on 132 032 |
Details of your account |
For the period from 31 Jul 1997 to |
Date | Description of transaction | Debit | Credit | Balance |
1997 | STATEMENT OPENING BALANCE | | | 6,713.87 |
1-Aug | DEPOSIT CAPALABA QLD | | 235.15 | 6,949.02 |
1-Aug | STATE GOVT TAX ON WITHDRAWALS | 38.90 | | 6,910.12 |
1-Aug | WITHDRAWAL/CHEQUE 301095 | 45.00 | | 6,865.12 |
1-Aug | WITHDRAWAL/CHEQUE 301096 | 883.30 | | 5,981.82 |
5-Aug | WITHDRAWAL/CHEQUE 301086 | 36.00 | | 5,945.82 |
7-Aug | WITHDRAWAL/CHEQUE 301099 | 855.10 | | 5,090.72 |
8-Aug | WITHDRAWAL/CHEQUE 301100 | 566.60 | | 4,524.12 |
10-Aug | WITHDRAWAL/CHEQUE 301097 | 141.80 | | 4,382.32 |
12-Aug | WITHDRAWAL/CHEQUE 301098 | 253.80 | | 4,128.52 |
13-Aug | DEPOSIT CAPALABA QLD | | 656.20 | 4,784.72 |
15-Aug | WITHDRAWAL/CHEQUE 300903 | 362.50 | | 4,422.22 |
18-Aug | WITHDRAWAL/CHEQUE 300902 | 90.40 | | 4,331.82 |
18-Aug | WITHDRAWAL/CHEQUE 301050 | 10.00 | | 4,321.82 |
18-Aug | WITHDRAWAL/CHEQUE 300906 | 883.30 | | 3,438.52 |
20-Aug | DEPOSIT CAPALABA QLD | | 9.68 | 3,448.20 |
21-Aug | DEPOSIT CAPALABA QLD | | 369.25 | 3,817.45 |
22-Aug | WITHDRAWAL/CHEQUE 300905 | 266.98 | | 3,550.47 |
24-Aug | DEPOSIT CAPALABA QLD | | 240.00 | 3,790.47 |
25-Aug | WITHDRAWAL/CHEQUE 300910 | 38.30 | | 3,752.17 |
29-Aug | WITHDRAWAL/CHEQUE 300911 | 598.40 | | 3,153.77 |
31-Aug | CLOSING BALANCE | | | 3,153.77 |
Proceeds of cheques will not be available until cleared. Please check all entries promptly and notify the bank immediately of any errors. |
|
Statement No. 133 Page 1 of 1 | |
MyBank Banking Corporation ABN 007 457 141 |
Immutability means we can’t go back and modify a transaction, nor can we insert a new transaction into the sequence of transactions.
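To make that concrete, here is a minimal Python sketch of how a chain of hashes makes edits and insertions detectable. It is an illustration only, not how any particular blockchain is implemented: the function names (`block_hash`, `append_block`, `verify`) and the block layout are assumptions made for this example, and it shows tamper-evidence on a single copy of the data, with no network and no consensus.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(index, payload, prev_hash):
    """Hash a block's position, content and the hash of its predecessor."""
    record = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Add a block to the end of the chain; the only supported write."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append(
        {
            "index": index,
            "payload": payload,
            "prev": prev_hash,
            "hash": block_hash(index, payload, prev_hash),
        }
    )


def verify(chain):
    """Return True only if no block was modified, inserted or removed."""
    prev_hash = GENESIS
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["payload"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


# Three entries taken from the sample statement
chain = []
append_block(chain, {"date": "1-Aug", "desc": "DEPOSIT CAPALABA QLD", "credit": "235.15"})
append_block(chain, {"date": "1-Aug", "desc": "STATE GOVT TAX ON WITHDRAWALS", "debit": "38.90"})
append_block(chain, {"date": "1-Aug", "desc": "WITHDRAWAL/CHEQUE 301095", "debit": "45.00"})
```

Changing the amount in any block, or slipping an extra block into the middle, makes `verify(chain)` return False, because every later block depends on the hash of the one before it.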
Double-entry bookkeeping principles mean that for each transaction posted to this account there has to be at least one other countering transaction to balance it. For example, each “DEPOSIT” appears in this account as a Credit – so there would be a corresponding Debit transaction in another account called something like “Money Bank Owes Customers”. And for each “WITHDRAWAL” that appears here as a Debit there will be a corresponding Credit – possibly also to the “Money Bank Owes Customers” account.
A single business transaction means there are at least two transactions created and posted to two different accounts which must never be changed. Reversals and adjustments require reversal and adjustment transactions – not changing the original transactions. This provides a clear and verifiable audit trail of everything.
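As a rough sketch of that rule, the Python below models an append-only ledger in which every business transaction creates two postings and a correction is itself a new transaction. The `Ledger` and `Posting` names are made up for this example, and the debit/credit orientation simply follows the sample statement above; it is not a description of any real banking system.

```python
from dataclasses import dataclass
from decimal import Decimal

ZERO = Decimal("0")


@dataclass(frozen=True)  # frozen: a posting cannot be changed once created
class Posting:
    txn_id: int
    account: str
    debit: Decimal
    credit: Decimal


class Ledger:
    """Append-only ledger: postings are added, never edited or removed."""

    def __init__(self):
        self._postings = []
        self._next_id = 1

    def post(self, debit_account, credit_account, amount):
        """Record one business transaction as two balancing postings."""
        amount = Decimal(amount)
        txn_id = self._next_id
        self._next_id += 1
        self._postings.append(Posting(txn_id, debit_account, amount, ZERO))
        self._postings.append(Posting(txn_id, credit_account, ZERO, amount))
        return txn_id

    def reverse(self, txn_id):
        """Undo a transaction by posting its mirror image, not by editing it."""
        legs = [p for p in self._postings if p.txn_id == txn_id]
        debit_leg = next(p for p in legs if p.debit > 0)
        credit_leg = next(p for p in legs if p.credit > 0)
        return self.post(credit_leg.account, debit_leg.account, debit_leg.debit)

    def balance(self, account):
        """Credits minus debits, matching the orientation of the statement."""
        return sum(
            (p.credit - p.debit for p in self._postings if p.account == account),
            ZERO,
        )

    def postings(self):
        return tuple(self._postings)


ledger = Ledger()
deposit = ledger.post("Money Bank Owes Customers", "Cheque Account", "235.15")
reversal = ledger.reverse(deposit)
```

After the reversal the cheque account is back to zero, yet all four postings are still on file, which is the audit trail the paragraph above describes.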
Before computers came along, white-collar crime could take the form of a creative accountant with some liquid paper and an array of coloured pens. These days financial systems have strict controls around data security, but technically you could bypass these controls – if you knew how – and adjust the data to present “alternative facts”.
So immutability is important.
Guess what other technology provides immutability? SAP HANA.
That’s right: the column store in SAP HANA is “insert-only”. You cannot modify any data once it is inserted. You have to insert new data that supersedes the old data. Essentially each record has a “start-of-life” and “end-of-life” timestamp that defines its relevance. Similarly, you can’t delete data either.
Of course this characteristic is hidden from everyone by Structured Query Language (SQL) – the lingua franca of the database management system. When a developer says …
UPDATE Customers
SET ContactName = 'Alfred Schmidt', City = 'Frankfurt'
WHERE CustomerID = 1;
… HANA inserts new records for Alfred Schmidt and Frankfurt and starts their lives while ending the lives of their predecessors.
But those predecessors are still in the HANA database and, to my mind, could be used to fully reconstruct an audit trail of the customer records for CustomerID 1 at any time. As it stands, good old SQL hides this “expired” data from the database consumer – but I don’t see why it couldn’t be used to provide an immutable audit trail of all changes to a record or a complete dataset.
It could even be represented as a linked chain of blocks showing each update to the dataset. Sort of a block chain. 😉 #seewhatIdidthere
You could even time travel back and forth through your dataset. Fascinating!
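The insert-only idea and the time travel it allows can be sketched in a few lines of Python. This is a toy model of the concept as described in this post, not of HANA's actual internals: the `InsertOnlyTable` class, its method names and the use of a simple counter instead of real timestamps are all assumptions for illustration, and the earlier customer values ("Maria Anders", "Berlin") are invented sample data.

```python
class InsertOnlyTable:
    """Toy insert-only store: rows are appended, never overwritten or removed."""

    def __init__(self):
        self._rows = []
        self._clock = 0  # logical counter standing in for a timestamp

    def _current(self, key):
        """Return the live version of a record (no end-of-life yet), if any."""
        for row in reversed(self._rows):
            if row["key"] == key and row["end_of_life"] is None:
                return row
        return None

    def update(self, key, **values):
        """Insert a new version and end the life of its predecessor."""
        self._clock += 1
        now = self._clock
        old = self._current(key)
        merged = dict(old["values"]) if old else {}
        merged.update(values)
        if old is not None:
            old["end_of_life"] = now  # only the validity marker is set
        self._rows.append(
            {"key": key, "values": merged, "start_of_life": now, "end_of_life": None}
        )
        return now

    def as_of(self, key, when):
        """Time travel: the version of the record that was live at 'when'."""
        for row in self._rows:
            if row["key"] != key:
                continue
            ended = row["end_of_life"]
            if row["start_of_life"] <= when and (ended is None or when < ended):
                return dict(row["values"])
        return None

    def history(self, key):
        """Full audit trail as (start_of_life, end_of_life, values) tuples."""
        return [
            (r["start_of_life"], r["end_of_life"], dict(r["values"]))
            for r in self._rows
            if r["key"] == key
        ]


table = InsertOnlyTable()
t1 = table.update(1, ContactName="Maria Anders", City="Berlin")  # invented values
t2 = table.update(1, ContactName="Alfred Schmidt", City="Frankfurt")
```

Here `table.as_of(1, t1)` still returns the superseded version of the record, while `table.history(1)` lists both versions together with their validity markers.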
Here's an interesting alternative view...
https://www.multichain.com/blog/2017/05/blockchain-immutability-myth/
One question. If records of a column store are never deleted, won't the DB eventually run out of space?
I had exactly the same thought!
Hey Graham, nice post, clear and simple introduction to immutability and relating blockchain's use to HANA's features.
The concept of immutability is an interesting one, and extends to some programming languages too. In my attempts to educate myself in the realms of functional programming, I have come across immutability as an important principle, along with other important principles such as higher-order functions. Elm is one of the languages I flirted with, and there are no variables - only immutable constants.
I explored a couple of aspects of functional programming with Elm, Haskell and other functional languages in a talk I gave last year, "Discovering the beauty of recursion and pattern matching". One can see there that immutability is not as restricting as one might think (the examples are deliberately simple but the principle holds for more complex cases too).
To your thoughts about time travel, Elm is a language in which someone has implemented a time travelling debugger, inspired by a classic talk by Bret Victor. A good, short video intro to this is: LambdaCat - 01 - elm-reactor: a Time Travelling Debugger.
unless someone has 51%...
Hi Graham,
good to see you cross over to the dark side (but maybe not necessarily leave the bright side?).
i also noticed a quite concerted effort in putting out an avalanche of blogs, articles, and solutions being somewhat cryptically pushed by our hosts here. it's all for good, i hope, and the little guys won't get steamrolled into something they have absolutely no control over. time will tell.
while we are at it, i just want to take a little different position. while i highly appreciate the engineering marvel that HANA is, and it can very well be a part of a blockchain, it is not its main element or purpose or what have you. at the moment, it's not even indispensable until enough large accounts actually choose to keep their blockchain ledgers in it.
btw, it's very easy to become an expert in blockchain as there are plenty of open source projects on github and you can fork one, modify it, and run it any way you feel will suit you and your clients.
rgds, greg
Hey Graham,
It's probably worth noting that unless I'm mistaken, HANA tables don't have these characteristics by default. My understanding is that the delta store does use tombstones and additive-only changes but during merge into the main store these changes are combined. Only in history tables are all changes kept for posterity, and these tables have operational limitations, including the issue with size growing along with operations even if the number of current records doesn't increase. Kinda like a blockchain...
Anyway, I'll just point out that in the blockchain world "immutability" is a misnomer. It is *expensive* to change historical fact in blockchains that use proof-of-work, but they are not immutable. The cost of changing them can be calculated and it quickly becomes infeasible if proof-of-work requirements are high enough. But this just highlights that so-called "immutability" in blockchain systems based on proof-of-work is a direct tradeoff of difficulty of historical change vs. the inverse of transaction cost (in both time and computation). Bitcoin is not optimal, but the facts that it is very difficult to change history in Bitcoin and transactions are expensive and take quite a while are not unrelated. In fact, they are directly linked.
Most corporate blockchains can't afford these kinds of transaction costs and latencies, so they usually use a trusted central administrator or a small trusted consortium of administrators. And these administrators can change history whenever they want to. The only thing that stops them in well-designed systems is that they publish the transaction record, so anyone can save off the transaction record and then compare their version of history to the version of history currently offered by the administrator. This makes it easy to spot changes if you are paying attention. But the same feature can be offered by any system based on databases, or even paper. The key is to allow unrelated 3rd parties to copy the transaction record whenever they want to.
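A minimal Python sketch of that "save a copy and compare" check, assuming the published transaction record is just an ordered list of text entries. The helper names `record_digest` and `history_unchanged` are made up for this illustration, and the entries are abbreviated lines from the sample statement in the post.

```python
import hashlib


def record_digest(entries):
    """Digest of an ordered list of entries; length-prefixed to avoid ambiguity."""
    digest = hashlib.sha256()
    for entry in entries:
        data = entry.encode("utf-8")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def history_unchanged(saved_count, saved_digest, published):
    """Check that the first saved_count published entries still match our copy."""
    if len(published) < saved_count:
        return False  # entries we already saw have disappeared
    return record_digest(published[:saved_count]) == saved_digest


# A third party saves only a count and a digest of what was published so far
published = ["1-Aug DEPOSIT 235.15", "1-Aug WITHDRAWAL/CHEQUE 301095 45.00"]
snapshot = (len(published), record_digest(published))

# Later the administrator appends a new entry: history is extended, not rewritten
published.append("5-Aug WITHDRAWAL/CHEQUE 301086 36.00")
extended_ok = history_unchanged(*snapshot, published)

# A rewritten past entry no longer matches the saved digest
rewritten = list(published)
rewritten[0] = "1-Aug DEPOSIT 2350.15"
rewrite_ok = history_unchanged(*snapshot, rewritten)
```

In this sketch `extended_ok` is True and `rewrite_ok` is False: appending is fine, while any change to an entry the third party already saw shows up as a mismatch. Nothing here needs a blockchain, only a published record and someone keeping a copy.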
Ethan
This is one of the blog posts where I wish we would have had a longer conversation about the topic beforehand.
The “potential for an application feature” you identified in the way SAP HANA column store works has been noticed by the SAP HANA architects from the beginning. If you watch the early openHPI lectures with Hasso Plattner you’ll hear him talk about the built-in versioning of records. This is exactly what you’re talking about.
In SAP HANA this feature was offered as so-called History Tables that mostly got mentioned in the context of “time travel” queries.
Now, if you search this site for “history tables” or “time travel” you’ll find a couple of discussions in which I argued vehemently against using them. The reasons for my dislike of history tables in SAP HANA are that the technical implementation was inefficient and, more important than that, the supported query semantic is weak.
I don’t want to go into too much detail here on the ins and outs of time-dependent data management, but one crucial difference is that in a relational database a record, as seen by the application, is distributed across many tables. That means that simply keeping track of the entries in a single table does not suffice to reconstruct a true version of a complete record/document.
Blockchain is different here as it doesn’t have the concept of distributed entities or normalisation.
Another important aspect is that database record validity is not the same as application level validity. We need two validity time frames here to properly manage both, which leads to the bi-temporal data management concept.
The point here is that tables are not application-level entities. Those entities sit at a higher level of abstraction, and it is at that level that immutability is required.
From SAP HANA 2 SP03 onwards, a new feature is available: system-versioned tables. These offer application developers an automatic way to
Given these options, it is now possible to build applications that support time-dependent queries efficiently. Not only views of the data as of a certain time in the past are possible, but also cumulative version histories (e.g. which records have changed over time? or how did a specific record change?).
By leveraging partitioning and dynamic tiering, the memory overhead of keeping historic data available can also be managed a lot better than with history tables, since now application specific knowledge can be leveraged (e.g. “closed orders” can safely be in a “warm” store instead of main memory).
Having written all this, I’ve got to also write: I’ve yet to see a database-level implementation of time-dependent processing that gets widely adopted. In all cases I’ve seen so far, the concept of how time should work was heavily dependent on the application logic itself. Defining these semantics precisely is complex and hard, and very often maps to a whole set of SQL commands to reconstruct records according to these application semantics. Database record validity time stamps alone don’t cut it.
So, yes, SAP HANA gives you an immutable data trail for your application records - but it’s not as easy as simply tracking all row changes.
Well that explains everything - LOL
A very interesting topic/read and some great input also, thanks.