Hana Smart Data Integration – Fun with Transformation Services
This post is part of an entire series
- Hana Smart Data Integration – Overview
- Hana Smart Data Integration – Adapters
- Hana Smart Data Integration – Batch Dataflows
- Hana Smart Data Integration – Realtime Table Replication
- Hana Smart Data Integration – Realtime Sources with Transformations
- Hana Smart Data Integration – Realtime Sources with History Preserving
- Hana Smart Data Integration – Architecture
- Hana Smart Data Integration – Fun with Transformation Services
By combining Hana Table Type objects as source, target, or both, complex data transformations can be designed easily in the FlowGraph editor and offered as services to users. As these services are all based on Hana SQL, they can be consumed by all Hana applications: ABAP programs, XS applications, procedures,…
To quote Sheldon Cooper from the Big Bang Theory episode “Fun with flags”:
Now, today’s episode of ‘Fun with Flags’ is not fun, but it is important.
The Hana Smart Data Integration option consists of multiple building blocks, including Hana objects as sources/targets (tables) and Transformations. What do you get when combining Hana Table Types and SDI Transformations? An infrastructure for transformation services.
I would like to provide a service in Hana to cleanse addresses. The user should be able to input any address, and the service returns the same data plus the corrected address.
This is pretty much the same kind of dataflow as when reading and writing table data, except that this time a Table Type is used. I created a table type to be used as input by executing the command
create type address_input as table (
customer_id varchar(8), street nvarchar(60), city nvarchar(30),
postalcode nvarchar(20), country_code nvarchar(3));
and the result can be seen in the Hana catalog.
Similarly, a table type to be used as output:
create type cleansed_address as table (
customer_id varchar(8), street nvarchar(60), city nvarchar(30),
postalcode nvarchar(20), country_code nvarchar(3), standard_city nvarchar(30),
standard_postalcode nvarchar(30), standard_street nvarchar(60), standard_house_number nvarchar(10));
The goal of this example is to take the input, derive more data from the input values and output the additional information. Seen as a transformation, that is a normal FlowGraph: one that reads a table (a table type in this case), applies transformations and writes the results.
So the only difference is the source and the target: it is an "Input Type" instead of a "Data Source", and when adding it, we are asked to provide a table type structure from the catalog.
What kind of transformations happen between source and target is secondary; in this case it just copies ADDRESS_INPUT.STREET to the additional STANDARD_STREET output column.
When this is activated, a task is created which has two table type parameters. As with stored procedures, the data has to be prepared first and then the procedure, or rather the Task, is called.
As a table type is just a structure and cannot hold data itself, two temporary tables are created with the ADDRESS_INPUT and CLEANSED_ADDRESS structures. A record is added to the input table, then the task is invoked, and finally the result is selected from the output table.
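Assuming the flowgraph was activated as a task named "CLEANSE_ADDRESS" (a hypothetical name; the exact START TASK parameter syntax may also vary by Hana revision), the calling sequence could look like this:

```sql
-- Temporary tables matching the two table type structures
-- (a table type itself cannot hold data).
create local temporary table #address_in (
  customer_id varchar(8), street nvarchar(60), city nvarchar(30),
  postalcode nvarchar(20), country_code nvarchar(3));
create local temporary table #address_out (
  customer_id varchar(8), street nvarchar(60), city nvarchar(30),
  postalcode nvarchar(20), country_code nvarchar(3), standard_city nvarchar(30),
  standard_postalcode nvarchar(30), standard_street nvarchar(60),
  standard_house_number nvarchar(10));

-- Provide one input record...
insert into #address_in values
  ('C0000001', 'Innsbrucker Straße 34', 'Salzburg', '5071', 'AT');

-- ...invoke the task, passing the temporary tables for the
-- two table type parameters...
START TASK "CLEANSE_ADDRESS" ("#address_in", "#address_out");

-- ...and read the cleansed result.
select * from #address_out;
```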
Now it is time to play with the transformations. Obviously the Data Cleanse transform is much better suited for the goal, so I removed the Filter node (“Transformation1”) and replaced it with a Data Cleanse.
When dropping the transform onto the canvas, it provides a simple wizard-like interface to configure the input and output.
The Data Cleanse transform needs to know in which column it can find which kind of cleansing-relevant information. Since this information was not provided yet, the content type of each column can be set by clicking on Edit Defaults. CustomerID is Unknown, meaning it has no impact on the cleansing process; the column STREET contains text like "Innsbrucker Straße 34", so it is more than just a street name, it is an entire address line.
The next step is to configure the output of the transform, the additional columns it should produce. To help with that, it provides typical examples to pick from. Since the output structure should have a separate house number column, the 7-column address was selected.
The transform outputs all input columns plus the 7 additional address fields it generates. So once the transform is connected to the target table type, the available transform output fields are mapped to the various target fields.
After copying the address directory files for Austria, postal information down to street and house number level was available, so a few more variations of the input address were tried.
It correctly found out that the official street name is "Innsbrucker Bundesstraße", not "Innsbrucker Straße", and although the street runs from Salzburg City (postcode 5020) into Salzburg-Wals (postcode 5071), the house number 34 indicates we are talking about Salzburg City.
Trying the same with Google Maps leads to the same result: Google changes the postcode to 5020 as well.
So what is the impact of all that?
First of all, it shows the openness of the Hana Smart Data Integration solution. The same transformations used for data provisioning can be used to provide services as well. The above example is an extreme case, one where no Hana table was used at all. But by modifying the dataflow, all kinds of use cases can be addressed:
- A service data is sent into and looked up in a Hana table, with the result returned. Example: search for customers with this address in the customer master data table, because maybe this record already exists and should not be created a second time.
- A service data is sent into and, after transformations, loaded into Hana tables. Example: create this record as a new customer in the master data table.
- A service to be customized by the user. Example: instead of writing tons of ABAP or JS code, the user can modify a pre-built dataflow to add his own transformations and configurations.
- Use above services via an XS application
- Or via OData
- Or in an ABAP program
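As a sketch of the first use case, a thin SQLScript wrapper could take the ADDRESS_INPUT table type as a parameter and look the addresses up in a customer master table (the table name customer_master and the procedure name are hypothetical here):

```sql
create procedure find_existing_customers (
    in  addresses ADDRESS_INPUT,   -- the table type defined above
    out matches   ADDRESS_INPUT)
language sqlscript reads sql data as
begin
  -- Return every customer master record whose address matches
  -- one of the input rows.
  matches = select c.customer_id, c.street, c.city,
                   c.postalcode, c.country_code
              from customer_master as c
              join :addresses as a
                on  a.postalcode   = c.postalcode
                and a.street       = c.street
                and a.country_code = c.country_code;
end;
```

Since such a procedure is plain Hana SQL, it can then be exposed through XS, OData or called from an ABAP program, exactly as listed above.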
Do you know if there is some transpose function to pivot data?
Yes, the Unpivot (and Pivot) transform does that.
What is the added value of the Input Types and Output Types then? That you can run the same flowgraph for different data sources and target tables? Did I understand it well?