SAP HANA Advanced Analytics
In this blog series you will find quotes, backgrounds, suggested further readings and other information related to my latest book SAP HANA 2.0, An Introduction published by SAP Press.
Each chapter in the book starts with a quote (or two) and for the chapter about SAP HANA advanced analytics, we quote a sociologist and a psychologist.
The moral is: Not everything that can be counted counts.
—William Bruce Cameron, The Elements of Statistical Confusion, Or: What Does the Mean Mean? (1959)
Also quoted as
Not everything that can be counted counts, and not everything that counts can be counted
attributed to Einstein and others; see the Quote Investigator for the full story.
Statistics provides a fertile source of quotes, e.g.:
- There are three kinds of lies: Lies, damned lies, and statistics.
- If you torture the data long enough, it will confess to anything.
It has been said: the whole is more than the sum of its parts. It is more correct to say that the whole is something else than the sum of its parts, because summing is a meaningless procedure, whereas the whole-part relationship is meaningful.
—Kurt Koffka, Principles of Gestalt Psychology (1935), p 176 [source].
As you probably know, Gestalt psychology is essential to good design.
The original manuscript included a long list of references to other material, which, because of space constraints and other reasons, did not make it into the final book.
Four- and three-letter words, of course, are always best avoided, so we left out the often-quoted HBR article about the Data Scientist. It is a good read, though, and still relevant.
Data modeling and analytics are core features of the SAP HANA platform, as the database combines both transactional (OLTP) and analytical (OLAP) processing. For this reason, any additional analytical processing capabilities are referred to as advanced analytics.
Some of these advanced analytics capabilities were developed on and for the in-memory platform. SAP HANA Spatial is an example. Introduced in SAP HANA 1.0 SPS 06 (June 2013) and enhanced with each SPS, this functionality gradually matured and today provides a complete geospatial platform complemented by partners like Esri (as explained on esri.com).
To learn more about SAP HANA Spatial, take a look at
- bit.ly/hanaspatial – SAP HANA Spatial Resources Blog post by Sharom Om
- Spatial Analysis with SAP HANA Platform – openSAP Course (2017)
- Introduction to SAP HANA Spatial Data Types – Developer Center tutorial
Graph and Series
SAP HANA Graph, introduced in SAP HANA 1.0 SPS 12, and Series Data, introduced in SAP HANA 1.0 SPS 10, are other examples of advanced analytics capabilities developed specifically on and for the platform.
SAP HANA Graph, by the way, is not to be confused with the recently introduced SAP Graph. For this topic, see www.graph.sap (where you can sign up for the beta).
Here are some links to free resources where you can learn more:
- Analyzing Connected Data with SAP HANA Graph – openSAP Course (2017)
- Get Started with SAP HANA Graph – Developer Center tutorial
- SAP HANA Graph – SAP HANA Academy YouTube playlist
But not everything was developed from scratch. Other advanced analytics capabilities were introduced through acquisitions and integrations. Take Text Search, Text Analytics, and Text Mining, for example, already included with the initial release of SAP HANA in 2010.
The origins of text analysis go back to the finite-state technology used to model natural languages (NLP or natural language processing). This was developed at the Xerox Palo Alto Research Center (Xerox PARC), which also brought us the computer mouse, Ethernet, laser printers, the GUI and WYSIWYG.
In 1997, InXight was founded to commercialize the finite-state NLP technology, with the X from Xerox and with products like LinguistX, Categorizer, and ThingFinder. It was a success: the company was acquired by Business Objects in 2007 to strengthen its portfolio of Business Intelligence applications. After another acquisition round in 2008, this time by SAP, the technology found its way into the SAP HANA platform and other SAP products, like Data Services, the Cloud Platform, and Hybris-as-a-service (now retired). Below is Inxight Smart Discovery in action. Inxight still has a page on Wikipedia.
To learn more about Text Analytics, take a look at
- Full-Text Search with SAP HANA Platform – openSAP course (2017)
- Text Analytics with SAP HANA Platform – openSAP course (2016)
- Perform Text Analysis with SAP HANA Express Edition – Developer Center tutorials
- Search, Text Analysis, and Text Mining – SAP HANA Academy YouTube playlist
SAP HANA Streaming Analytics was introduced in SAP HANA 1.0 SPS 09 (December 2014) as Smart Data Streaming (SDS) together with a number of other “Smart” technologies like Smart Data Integration (SDI), Smart Data Quality (SDQ), Smart Data Access (SDA), and (almost) Smart Data Tiering (branded as Dynamic Tiering instead). The SDI, SDA, and SDQ technologies were acquired from Business Objects (2007), while Streaming Analytics and Dynamic Tiering (IQ) came from Sybase, acquired by SAP in 2010.
The technology itself, like text analytics, is much older. Generally, it is categorized as Complex Event Processing (CEP). The technology made its way out of the labs in the 1990s for implementations like Operational Intelligence (OI) and Business Process Management (BPM), and was also picked up by financial services firms, which were early adopters of the Event Driven Architecture (EDA) in algorithmic trading. The software vendors who pioneered the technology were eventually all acquired by larger ones. This is how the Aleri CEP engine, the SQL-like CCL (Continuous Computation Language) from Coral8, and the Stream Processing Language Shell (SPLASH) made their way into Streaming Analytics. The explosion of IoT devices and Big Data provided new use cases under the more generic banner of stream processing.
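For readers new to the idea, here is a minimal sketch of what "continuous computation" over an event stream looks like: a time-based sliding window that emits a fresh average price with every incoming event. This is plain Python for illustration only, not CCL or SPLASH, and the names `Event` and `sliding_average` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Event:
    """A single tick in the stream (hypothetical structure)."""
    timestamp: float  # seconds since the start of the stream
    symbol: str
    price: float


def sliding_average(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, average price) for every incoming event.

    Only events newer than `window_seconds` before the current event
    are kept. Events are assumed to arrive in timestamp order.
    """
    window: Deque[Event] = deque()
    for event in events:
        window.append(event)
        # Evict events that have fallen out of the time window.
        cutoff = event.timestamp - window_seconds
        while window[0].timestamp <= cutoff:
            window.popleft()
        average = sum(e.price for e in window) / len(window)
        yield event.timestamp, average


if __name__ == "__main__":
    ticks = [
        Event(0.0, "SAP", 10.0),
        Event(1.0, "SAP", 20.0),
        Event(5.0, "SAP", 30.0),
    ]
    for timestamp, average in sliding_average(ticks, window_seconds=3.0):
        print(timestamp, average)
```

The point of the sketch is the shape of the computation: the query stays put and the data flows through it, which is the reverse of a classic database query.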
To get hands-on with Streaming Analytics, see
- SAP HANA Streaming Analytics – Developer Center tutorials
- SAP HANA Streaming Analytics – SAP HANA Academy YouTube playlist
Wait, There is More
What about Predictive Analytics? Machine Learning with SAP HANA? Google TensorFlow Integration?
What about PAL, AFL, BFL, APL, EML?
What’s the story with the Knowledge eXtraction Engine (KXEN), the Data Mining Automation Company, and Infinite Insight? How did that get into SAP HANA?
For this, check out SAP HANA 2.0, An Introduction, which will give you a concise, yet comprehensive overview. In a single chapter, we pick the nuggets out of the alphabet soup and cover all the essential topics. It will also help you prepare to make the move from SAP HANA 1.0 to SAP HANA 2.0.
SAP HANA 2.0 – An Introduction
Just getting started with SAP HANA? Or do you have a migration to SAP HANA 2.0 coming up? Need a quick update covering the business benefits and a technology overview? Want to understand the roles of the system administrator, developer, data integrator, security officer, data scientist, data modeler, project manager, and other SAP HANA stakeholders? My latest book about SAP HANA 2.0 covers everything you need to know.
Get it from SAP Press or Amazon:
Questions, Comments, Suggestions
Anything I missed? Do not hesitate to post your questions or comments below.
Good stuff? Give it a like and share on social media. Much appreciated!
If you would like to receive updates, connect with me on
Denys van Kempen