Embracing the Analytical Transaction Database Revolution
At IDC, we talk a lot about “digital transformation”: the concept that the increasing role of digital data, media, and content is transforming our lives, and especially how we do business. The early stages of this transformation were incremental, moving from the automation of manual tasks and the digitization of content to decisions driven by business intelligence and advanced analytics. We have now reached a stage of critical digital mass, thanks to technologies that enable us to collect, categorize, curate, and leverage a much larger volume and variety of data at a much greater velocity than ever before. Using not only historical business data but also current transactional data, streaming sensor data, interactive data from smart mobile devices, and a range of content, we can build a very complete view of the enterprise and its business context, and act on that knowledge very quickly, even automatically. With a full contextual view of the business situation, together with live transactional data, we are in a position to make decisions “in the moment.”
Doing this, however, requires coordinating operations and leveraging all relevant data in a single processing context. Most enterprise IT configurations are not set up to do this, even for structured data alone. Analytical data has traditionally been segregated from transactional data, because the two types, when managed on a disk-based database, require very different structures and handling, and because analytic queries would slow down transaction processing if both ran on one system. For this reason, operational data is maintained in a transactional database and then copied to an analytical database (such as a data warehouse) for analysis, usually through an extract, transform, and load (ETL) process. As a result, the analytical data is always hours, or even days, old before it can be examined.
This segregation follows from the conventional way of managing data, which is based on disk-optimized database technology. The disk-optimized approach assumes that the data is resident on disk, and is designed to minimize disk I/O. When using a disk-optimized database, it is important to arrange the data in disk volumes in an optimal fashion, to set up partitions and indexing schemes carefully and strategically, and to review and revise them periodically. Since the optimizations for transactional data and for analytical data are completely different, the two cannot be mixed. If the data structure changes for any reason, the optimization scheme must be reworked, and usually the database needs to be reorganized, which can require weeks before the changes are implemented. In a world that stresses business agility, this type of overhead for every change in an application or analysis is simply unacceptable. Having to wait hours or days for data also makes it useless for decisions “in the moment.”
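The copy-then-analyze arrangement described above can be illustrated with a minimal sketch. It assumes two SQLite databases standing in for the operational database and the data warehouse; the table names and the run_etl, warehouse_total, and live_total helpers are hypothetical and exist only for this illustration, not as a model of any particular product.

```python
import sqlite3

# Two separate stores, standing in for an operational database and a warehouse.
oltp = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

oltp.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
warehouse.execute("CREATE TABLE orders_copy (id INTEGER PRIMARY KEY, amount REAL)")


def run_etl():
    """Hypothetical batch job: replace the warehouse copy with current rows."""
    rows = oltp.execute("SELECT id, amount FROM orders").fetchall()
    with warehouse:  # commit the refresh as a single unit
        warehouse.execute("DELETE FROM orders_copy")
        warehouse.executemany("INSERT INTO orders_copy VALUES (?, ?)", rows)


def warehouse_total():
    """Analytical view: total order value as of the last ETL run."""
    return warehouse.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders_copy"
    ).fetchone()[0]


def live_total():
    """Operational view: total order value across all committed orders."""
    return oltp.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders"
    ).fetchone()[0]


with oltp:
    oltp.execute("INSERT INTO orders (amount) VALUES (100.0)")
run_etl()

# A new transaction arrives after the batch has run.
with oltp:
    oltp.execute("INSERT INTO orders (amount) VALUES (250.0)")

stale = warehouse_total()  # still reflects the data as of the last ETL run
current = live_total()     # reflects every committed transaction
```

Until run_etl is called again, the analytical total lags the operational total. In this toy the lag is a single function call; in the batch-driven configurations described above it is the hours or days between ETL runs.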
Fortunately, we are in the midst of a technological revolution in database management: one that replaces the disk-optimized approaches of prior generations of DBMSs with a memory-optimized approach. Modern in-memory databases perform both transactions and analytics in memory using a single copy of the data, enabling “in the moment” decision making while eliminating data duplication and latency. The memory-optimized approach considers data to reside in main memory, and uses the disk for recovery purposes (so that no data is lost in case of a failure of some sort). Changes to data structures can be effected immediately, the system is largely self-tuning, and since all operations are in memory, processing is far faster. But the benefits go beyond speed and agility. They also include the ability to rapidly incorporate business intelligence data together with live transactional data in complex queries, because the ETL stage is gone; the analytical and transactional data are together in one database. We can even modify our applications to leverage those query results in carrying out transactions as they happen. The result is a “smart” application, which can do something that was never practical before: use complex queries to govern transaction processing. This capability is something we call “analytical transaction processing” (ATP).
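The idea of a query governing a transaction can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses SQLite's in-memory mode as a stand-in for a memory-optimized database (SQLite is not such a platform), and the customers and orders tables, the credit-limit rule, and the place_order function are all hypothetical.

```python
import sqlite3

# One in-memory database holds both the live orders and the data analyzed.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, credit_limit REAL NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount REAL NOT NULL
);
INSERT INTO customers (id, credit_limit) VALUES (1, 1000.0);
INSERT INTO orders (customer_id, amount) VALUES (1, 400.0), (1, 300.0);
""")


def place_order(conn, customer_id, amount):
    """Insert an order only if an aggregate over live order data allows it.

    Returns True if the order was committed, False if it was rejected.
    """
    with conn:  # commits on normal exit, rolls back on an exception
        row = conn.execute(
            """
            SELECT COALESCE(SUM(o.amount), 0), c.credit_limit
            FROM customers c
            LEFT JOIN orders o ON o.customer_id = c.id
            WHERE c.id = ?
            GROUP BY c.id
            """,
            (customer_id,),
        ).fetchone()
        if row is None:  # unknown customer
            return False
        exposure, limit = row
        if exposure + amount > limit:  # the query result governs the transaction
            return False
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
        return True


first = place_order(db, 1, 200.0)   # exposure 700 + 200 = 900, within the limit
second = place_order(db, 1, 200.0)  # exposure 900 + 200 = 1100, over the limit
```

Because the aggregate runs against the same copy of the data that the insert modifies, the second call already sees the order committed by the first. A real ATP workload would involve far more complex queries and concurrent sessions than this single-connection sketch.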
Some vendors have developed technologies that can provide various levels of ATP support. Incorporating data from multiple sources with the core analytical-transactional data, however, brings enterprises to the next level: one that enables not only decisions “in the moment,” but decisions based on the very best information available. IDC has called a system capable of delivering such functionality an “analytical transaction data platform” (ATDP). Such a platform would have a memory-optimized core database, ideally with all or most of its active data always in memory (thus an in-memory database, or IMDB), surrounded by a capability to bring in related supporting data as needed for more complete analysis. Every enterprise that seeks to keep ahead of the curve in its market should aim to deploy an ATDP as a cornerstone capability, either in the data center or in the cloud.