The real questions is “What has changed to alleviate data bottlenecks, slow of response time and complexities of data architecture that will NOW allows us to do things we have always aspired to do in Finance”. These are the things that we have felt “almost” in our grasp since the invention of spreadsheets.
The big game changer is the advent of in-memory technology:
With an in-memory database, nonvolatile [disk] storage is still needed to ensure we have a persistent data store of all transactions, but this “write” function can be performed after the information is written to memory. The benefit of writing it to memory is that it can be immediately read from main memory eliminating the latency of disk I/O. Although this provides a big decrease in response time, there are additional techniques that decrease response time even further:
Organizing Information in a Columnar Data Store
Conceptually, relational databases write information in complete transactions, as a long sentence made up of columns that electronically describe the transaction. We say we write the information to “rows” of the database. In (Figure 1) each row represents a vendor purchase/payment transaction. Analytics, however, are typically performed on selected “columns” of the database. We don’t need all 11 columns in the above example to analyze vendor purchases; we only need the “Amount” column and the corresponding “Vendor Id”. If we want to do analysis it would be beneficial to organize the data in a columnar database.
There are benefits to both row and column organized databases, row databases are appropriate when the system needs to process a complete record at a time, like recording a vendor invoice for payment. Column databases are appropriate for analytics, when I don’t need all the columns of a transaction [row] as in the example above. The workload has changed and is mainly driven by analytical style queries even in transactional systems.
Compressing the Information
There are two benefits to data compression. It reduces overall memory requirements of the system and read operations are performed more efficiently because the data segment being read is smaller. Data compression is performed on columnar data where columnar values are converted to integer codes and these integer codes are stored in a separate dictionary for each column. In the above example, figure 2, we show the column contents with the system assigned sequential record ID. In figure 3 we show the unique column values being converted to an integer id and represent the dictionary. Figure 4 shows the compressed column of information. A performance factor of between 100 and 1000 can be achieved when comparing operations on integer-encoded compressed values vs uncompressed data.1
Supporting parallel execution
Columnar data stores facilitate parallel execution because calculations using different columns can be performed by different cores. Also, calculations on one column can be executed by dividing the column into multiple sections that are processed by different processor cores and aggregating the subtotals produced by each core. This also has a high multiplier reduction in processing time as well as mirroring Excel like calculations.
The speed enhancements from the “raw” in-memory database coupled with the three techniques above allow us to merge On Line Transacting Processing Data, (i.e. the financial and operational transactions) with the On Line Analytics Processing [OLAP] data analysis and modeling into one database. This allows us to create a database with simplified data schema that provides a single source for all calculations. The speed enables us to perform all calculations on-the fly (i.e. no aggregate tables), thus, we can combine the actual results, with the analysis (typically performed off line in Excel) and the simulation and planning processes (typically performed in an EPM solution). It also creates a direct line of sight from the lowest level transaction to the highest level calculated KPI. When split among two databases this “line if site” is fragile at best.
While the in-memory database makes the Financial Reformation possible, the other key component fueling the reformation is creating a simplified data architecture that is easy to understand, use and consume, without requiring a road map of how to perform inner and outer joins to combine the necessary information. In S4/HANA Finance we do that with the Universal Journal. I will talk about that in my next blog.
1 Hasso Plattner. “A Common Database Approach for OLTP and OLAP using an In-Memory Column Database.” Providence, Rhode Island, 2009.