Difficult to consolidate data?
Do you want to combine Enterprise data with Big Data?
Hierarchies? OLAP models? Currency conversion?
The answer is VORA!
Big Data customers have always been struggling with these types of questions. Using Vora 1.3 we can now tackle those questions! In this blog, I have created an OLAP model and hierarchy based on HANA and HDFS data.
The modelling was done using VORA 1.3 Modeller. Vora Modeller offers range of new features that we introduced with Vora 1.3.
VORA modeller is available at the following host and port:
Before starting the modeller, make sure that all the services are working and started. Ambari can be used to check if the services are working. Ambari sits on the master node of Vora cluster.
There are currently, 4 tabs:
– Data Browser – you can browse through the tables and data which can be either in HANA source or Vora source.
– SQL Editor – To write SQL commands.
– Modeller – To create OLAP models, hierarchies, currency conversion etc.
– User Management – To manage users and roles.
- Creating tables and loading data in HDFS. The SQL below contains the create table script along with the path where the data sits (in this case in HDFS as a CSV file). It also contains the API (com.sap.spark.vora) which calls the VORA libraries.
CREATE TABLE SUPPLIER (
S_SUPPKEY integer, S_NAME varchar(25), S_ADDRESS varchar(40), S_NATIONKEY integer, S_PHONE varchar(15), S_ACCTBAL double , S_COMMENT varchar(200))
OPTIONS ( tableName “SUPPLIER”,paths “/user/hadoop /supplier.tbl”, csvdelimiter “|”);
To import files in HDFS, you will need to use the Hadoop commands such as hadoop dfs -put *.tbl <Hadoop directory>
- Register the tables created to import tables from the SAP HANA Vora data source into a new Spark context. REGISTER all TABLES USING com.sap.spark.vora IGNORING CONFLICTS
- Now, time to connect HANA data source. To connect to HANA Data source, HANA API – com.sap.spark.hana is used from VORA to HANA to call the tables in HANA. This is a dataframe which allows table to be read, if the table does not exist then a new table is created in the SAP HANA database with the DataFrame’s schema and data is inserted into that table. Please make sure that your VORA and HANA are on same network. To register HANA Tables, following command is used:
REGISTER ALL TABLES
OPTIONS( dbschema “<schema_name>”,
The image above shows the VORA and HANA tables on the left
- Creating OLAP model on both data sources.The following model is bringing data from HANA source and HDFS. This shows how enterprise data which sits on SAP system or non-SAP system can be merged/combined with data lake scenarios in Hadoop. VORA offers a lot more functions which spark cannot and VORA is very performant than Spark engine processing.
- Building Hierarchies: A level based hierarchy or parent-child hierarchy can be created. In this case, we have created parent-child hierarchy which illustrates the org chart below. Creating hierarchies on VORA modeller is very easy and it offers many functions to create and query hierarchy at any level. In this example, we are querying the level at which node 1 exists.
With VORA 1.3, there is hierarchy icon in VORA modeller. Simply, add the icon which automatically creates a node. Then add the table and assign the node to the table.
Once the parent-child hierarchy has been established, then you can add calculated columns to add or query any level of the hierarchy. Vora offers a range of Hierarchy functions, for e.g. “is_root” function brings the 1st level of hierarchy. Similarly, “is_sibling” will bring the sibling or corresponding levels in the hierarchy.
The above model can then be exposed to any of the BI tool.
If you have any questions, then please do not hesitate to contact me.