SAP HANA Smart Data Access(1): A brief introduction to SDA
We have a Chinese version (”SAP HANA Smart Data Access（一）——初识SDA”) of this blog.
In the application scenarios of SAP HANA, it is common to analyze and process the data located in other systems. Usually, customers would like to replicate the data from other systems to SAP HANA, and then do analysis and processing in SAP HANA. However, data replication not only costs time and memory, but also, usually, requires another replication system deployed which is always not easy. SDA, abbreviation for Smart Data Access, provides customers a new way to access the date in remote data source.
What is SDA
SDA is a new method of SAP HANA for accessing the data stored in remote data sources. With the help of SDA, SAP HANA can create so-called “virtual table” mapping to tables located in remote data sources, and then SAP HANA can access the data directly by accessing the “virtual table”. “virtual table” can be manipulated by SAP HANA just like an ordinary table, which means the operations, such as select, update, insert, delete, and so on, are all available for “virtual table”. Besides, join operation between local table and “virtual table” is supported. When such join operation taken, optimizer of SAP HANA sends the relevant operations to remote data source for processing, and then the result set would be sent back to SAP HANA for further processing.
SDA was introduced in SAP HANA SPS06. At that time, the data sources supported by SAP HANA SDA includes: SAP HANA、Sybase ASE、SAP Sybase IQ、Teradata database and Apache Hadoop. And only read operation was permitted for virtual table. In SAP HANA SPS07, the data sources and operations supported by SAP HANA SDA are both extended. MSSQL Sever and Oracle are added to the list of supported data source, and write operation is permitted. The comparison of SDA in SPS06 and SPS07 is as below:
Supported data sources
SAP HANA、SAP Sybase ASE 15.7 ESD#4、SAP Sybase IQ version 15.4 ESD#3 and 16.0、
Teradata database version 13.0、Intel Distribution for Apache Hadoop version 2.3
All data sources supported in SPS06，Oracle Database 12c、Microsoft SQL Server version 11 for SQL Server 2012
Supported operations for virtual table
Note：the data sources officially supported by SAP HANA are limited to specific versions above, other versions are not guaranteed to work well.
Creating Data Source
The first step of accessing remote data source is to create remote data source in SAP HANA. The communication between SAP HANA and remote data source is based on ODBC protocol. The subsequent blogs of this series will talk about how to deploy remote data source in SAP HANA server side. Here, let’s simply talk about how to create remote data source in SAP HANA Studio.
In SAP HANA Studio, there are two ways to create remote data sources, one is by GUI, another is using SQL statement.
(1) Create remote data source with GUI:
Firstly, Open the folder called “Provisioning”. And then right click the “Remote Sources” with mouse, select “New Remote Source…”:
Secondly, choose one adapter from the adapter list in the popup dialog, and fill in corresponding connection and authentication information of the remote data source.
(2) Create remote data source with SQL:
CREATE REMOTE SOURCE <src_name>
ADAPTER <adapter_name> [CONFIGURATION FILE ‘filename’]
CREATE REMOTE SOURCE ORCL_11g_LNX
CONFIGURATION FILE ‘property_orcl.ini’
WITH CREDENTIAL TYPE ‘PASSWORD’
In above SQL statement, <adapter_name> can be one of: ASEODBC, IQODBC,TDODBC, HIVEODBC,ODBC. Obviously, ASEODBC is for Sybase ASE as data source, IQODBC is for Sybase IQ, TDODBC is for Teradata Database, HIVEODBC is for Hadoop. And ODBC adapter is for other common data sources. <connection_info_string> is used to specify the connection information for data source, the name of DSN is usually given here. <opt_credentials_clause> is used to specify the authentication information of data source. Attention please, only adapter ODBC requires the CONFGURATION FILE, the functionality of configuration file will be introduced in next section.
Generic Adapter Framework
With the help of SDA, SAP HANA can communicate with the data sources who supports the ODBC protocol. However, just as discussed above, the supported data source of SAP HANA SDA is still limited now. For the supported specialized data source, SAP HANA has provided native code to support their operations. But SAP HANA SDA can’t guarantee other data sources to work well. What stop SAP HANA SDA from supporting more ODBC data sources is that some operation and configuration of these ODBC data sources can’t be processed by standard ODBC interface. For example, prepare a transaction for Sybase ASE requires some additional code which is not included in standard ODBC protocol. As SAP HANA provides such code, the operation for Sybase ASE is supported.
In order to decrease the influence of this issue, SAP HANA SDA applies Generic Adapter Framework to implement the communication with those unsupported ODBC data sources instead of calling the specialized native code for that data source in SAP HANA. With the help of Generic Adapter Framework, you can customize the feature and action of data source by setting a configuration file. For example, you can specify the supported operations, function mapping, data type mapping of the data source in the configuration file. For convenience of illustration, we call the configuration file “Property Configuration File” in the rest of this blog.
When creating data source, SAP HANA SDA will use the Generic Adapter Framework to communicate with remote data source if the ODBC adapter is chosen. SAP HANA SDA will search the property configuration file in the folder specified by environment variable DIR_EXECUTABLE, and the file name is specified by CONFIGURATION FILE option. By SPS07, SAP HANA SDA has provided the template of property configuration file for MSSQL and Oracle. They are called property_mss.ini and property_orcl.ini, and they are both located in the folder: $DIR_EXECUTABLE/config.
After data source created, SAP HANA SDA will parse the relevant property configuration file, all the features, function mappings, data type mappings and other properties will be linked together with the data source, and influence the communication between the SAP HANA and the data source.
A part of content of property_orcl.ini is below, we can figure out some format and function of property configuration file:
Typical process of creating data source
Creating a remote data source in SAP HANA usually involves steps below：
- Check whether SAP HANA provides specialized adapter for the data source, such as “ASEODBC”,”IQODBC”,”TDODBC”;
- If specialized adapter is available, then just use it to create data source;
- If specialized adapter is not available, then check whether there is a specialized property configuration template file, such as template for Oracle, MSSQL;
- If specialized property configuration template exists, you can change the property configuration file according to your requirement, and then create data source using the modified file. For example, as long as the correctness of the modification ensured, you can disable the unnecessary functions, modify the mapping of data type or function based on you requirement.
- If specialized property configuration template exists, you have to create a brand new property configuration file from scratch. To create such a file, you must be familiar with the properties of the data source and the driver it use;
- Create the data source in SAP HANA Studio using the specialized adapter or the common adapter (ie. ODBC adapter). When using common adapter, you need to specify the property configuration file for the data source.
Note: when modifying or creating the property configuration file, only the property which is different from default value needs to be set. The mistake in property configuration file may result in incorrect action or result of data source.
Creating virtual table
After data source created in SAP HANA Studio, “virtual table” mapping to data in remote data source can now be created in SAP HANA Studio. Similar to creating data source, there are also two ways to create virtual table:
(1) Create virtual table with GUI:
(2) Create virtual table with SQL Statement below:
CREATE VIRTUAL TABLE <table_name> <remote_location_clause>
CREATE VIRTUAL TABLE sda.vt AT “ORCL_11G_WIN”.”NULL”.”OUTLN”.”CANDIDATES”;
In this blog, we have talked about basic content of SAP HANA SDA. In the subsequent blogs of this series, we will further talk about how to deploy SDA data source in SAP HANA Server, how to access Hadoop with the help of SAP HANA SDA, and so on. Please pay attention.
1. What’s New SAP HANA Platform Release Notes for SPS07:
2. Section 6.1.1 of SAP HANA Administrator Guide：