Skip to Content
Technical Articles
Author's profile photo Ian Henry

SDI SDQ Geocoding & Address Cleansing

Have you tried using the HANA Smart Data Quality (SDQ), and found the transforms aren’t working?  Well this blog should help, there are country specific files required for both geocoding (latitude, longitude) and address cleansing.

The steps covered below are

  1. Download the files
  2. Extract the zip file
  3. Verify existing configuration
  4. Update Configuration
  5. Validate Configuration
  6. Build SDI Flowgraph
  7. Errors

1. Download the files

Locate the required files from SAP Software Downloads.
Addresses and Geocoding are both within the “Addreess Directories & Reference Data” category, and data is held a country level, I am using some UK data so have downloaded the UK GEO DIRECTORY, this will be needed for the geocode SDQ transform.

Please note that the address directories need to be licensed.  If you don’t see this download available, please see the following SAP Note 2484632.

You usually want the latest file available

I download these directly onto the HANA box as they can be quite large.

2. Extract the zip file

The official way to install these file is using the hdblcm tool, but you can also do this manually.  Which can be handy if your hdblcm is not working.  The files need to be owned by the HANA SID ADM user (ih2adm) in my case.

3. Verify existing configuration

We can check to see if there’s an existing configuration of the SDQ reference data files

select * 
from SYS.M_INIFILE_CONTENTS 
where FILE_NAME = 'scriptserver.ini' 
and SECTION = 'adapter_framework' 
and KEY = 'dq_reference_data_path';

This should not return any rows.

The same is true in HANA Studio, Administration -> Configuration

4. Update Configuration (dq_reference_data_path)

We can go ahead and update our configuration as below, but replacing with your directory path.

-- EXECUTE ON SYSTEMDB
ALTER SYSTEM ALTER CONFIGURATION ('scriptserver.ini', 'SYSTEM') 
SET ('adapter_framework', 'dq_reference_data_path') = '/hana/SDQ_Reference/' WITH RECONFIGURE

After doing this change we need to restart the ScriptServer.

5. Validate Configuration

Repeating step 3 we should now see our new configuration in place.

6. Build SDI Flowgraph

We can now switch to the WebIDE and create an SDI (Smart Data Integration) Flowgraph.

Here I have just created a dummy .hdbview that doesn’t contain any real personal data.
For the “dummy” table I needed a an hdbsynonym, which has been excellently described by Christophe Gilde

We can then use this in a simple flowgraph as below

DataSource

 

Geocode

Here we can see the SDQ has automatically guessed correctly the Content Types, this is from the data that is being fed in.

It makes sense to add our input data as output columns

 

We can now save, build and Execute the Flowgraph

Before we celebrate the sucessful execution, we should check our target table.

All appears to be good, for one final verification we can put the LAT and LONG in to our favourite mapping tool and check the location looks correct.

Errors

If you receive this error below then you address and/or geocode reference data is likely missing.

15:57:12 (/Sample/db) Execution Failed : (dberror) [686]: start task error: “SAMPLE”.”UK_GEOCODE_SP”: line 5 col 0 (at pos 85): [2620] executor: plan operation failed;Execution of Adapter operation within node GeocodeUK_GEOCODE_TEMPLATE_ failed: exception 141005: Failed to create or initialize HanaTransform object

Looking at the diagnosis files in HANA Studio will give you more information.  Check the scriptserver_alert_<hostname>.trc

Here it appears as though I am out of memory, but this is actually caused by a permissions issue.  Make sure the geo or address files are owned by you HANA <SID>ADM OS user.

[65085]{200563}[2/66728980] 2019-04-04 16:08:16.664458 e af_core hanaLogging.cpp(00198) : <GEO0018>: GEO0018GEO_ERROR_DIR_OUT_OF_MEMORY: Out of memory during directory initialization: hana/SDQ_Reference/geo_gb_nt.dir.
[65085]{200563}[2/66728980] 2019-04-04 16:08:16.740559 e af_core hanaTransform.cpp(01240) : GeocodeUK_GEOCODE_TEMPLATE_:1: In HanaTransform, Failed to create transform
[65085]{200563}[2/66728980] 2019-04-04 16:08:16.742133 e CalcEngine cePopAdapter.cpp(00338) : Execution of Adapter operation within node GeocodeUK_GEOCODE_TEMPLATE_ failed: exception 141005: Failed to create or initialize HanaTransform object

 

Assigned Tags

      1 Comment
      You must be Logged on to comment or reply to a post.
      Author's profile photo Geetha L N S
      Geetha L N S

      Hi Ian,

      Thanks for an informative blog. From this blog I understand to enabled SDQ for address validation an SDI flowgraph should be created as explained at the Hana Database.

      Is it not possible to configure in S4 Hana system and activate the import predefined settings for SAP HANA smart data quality and use DQM integrated with SDQ without any coding or SQL query or config at SAP HANA database level.

      or In other words How I can activate DQM with SDQ from S/4 HANA system through config and let the basis know that the address file repositories need to be saved in Hana Database.

      can you please advise.

      thanks

      Geetha