Paddling Upstream: How to find all HANA views that...

kowsik_tm · ‎04-04-2017

Requirement:

Find all HANA views that use a particular table/column.

If a field is dropped/modified from a table, identify the upstream model impacts.

Problem Faced:

Our team's analytics was based on tables from multiple source systems.

There were instances where fields were being dropped from source tables as a part of continuous changes.

These were affecting the HANA models built on top of those tables and there was no direct way to find out which models will be affected if fields from tables were dropped/modified.

This activity had to be done manually, the effort required might depend on how many models/sub models needed checking.

Current Scenario:

There is no table in HANA which will give this info.

Solution Approach:

The approach is to look for "specific" tags in the repository XML to find if a table/field is used in a model.

The field is identified irrespective of whether it is propagated to the top most node.

The repository XML is stored in CDATA field of "_SYS_REPO"."ACTIVE_OBJECT" table.

Prerequisites:

Select access to table "_SYS_REPO"."ACTIVE_OBJECT"

Solution:

This is automated through a procedure.

The steps to create the procedure and associated tables are below.

Run the below code snippet to create the tables and the procedure.

CREATE COLUMN TABLE "FI_INPUT" ("NUM" INTEGER CS_INT PRIMARY KEY,

	 "SCHEMA" NVARCHAR(40),

	 "TABLE" NVARCHAR(40),

	 "FIELD" NVARCHAR(100)) UNLOAD PRIORITY 5 AUTO MERGE 

;

ALTER TABLE "FI_INPUT" ADD ("FIELD_QUERY" NVARCHAR(200) GENERATED ALWAYS AS '%source="' || UPPER("FIELD") || '"%') 

;

ALTER TABLE "FI_INPUT" ADD ("TABLE_QUERY" NVARCHAR(200) GENERATED ALWAYS AS '%<input node="#' || UPPER("TABLE") || '%') 

;

CREATE COLUMN TABLE "FI_OUTPUT" ( "TABLE" NVARCHAR(40) CS_STRING,

	 "FIELD" NVARCHAR(100) CS_STRING,

	 "PACKAGE" NVARCHAR(300) CS_STRING,

	 "OBJECT_NAME" NVARCHAR(300) CS_STRING ) 

;

CREATE COLUMN TABLE "FI_TEMP" ( "ID" INT CS_INT,

	 "PACKAGE" NVARCHAR(300) CS_STRING,

	 "OBJ_NAME" NVARCHAR(300) CS_STRING,

	 "CDATA" NCLOB MEMORY THRESHOLD 1000 ) 

;



CREATE PROCEDURE FIELD_DEPENDENCY (IN PACKAGE_NAME NVARCHAR(100)) LANGUAGE SQLSCRIPT AS 

BEGIN DECLARE I INTEGER

;

DECLARE GEN_QUERY NVARCHAR(1000)

;

DECLARE COUNTER INTEGER

;

DECLARE C1 INTEGER

;

DECLARE PK NVARCHAR(100)

;

PK := ''''||PACKAGE_NAME||'%'''

;

---Find out Existence of table/field in HANA Schemas

T1 = SELECT

	 TABLE,

	FIELD,

	CASE WHEN COALESCE(POSITION,

	0) = 0 

THEN 'WRONG FIELD/TABLE' 

ELSE 'YES' 

END AS "FIELD_VALIDITY" 

FROM FI_INPUT 

LEFT OUTER JOIN TABLE_COLUMNS ON TABLE = TABLE_NAME 

AND FIELD = COLUMN_NAME 

AND SCHEMA = SCHEMA_NAME

;



SELECT COUNT(*) INTO C1  FROM :T1 

WHERE FIELD_VALIDITY = 'WRONG FIELD/TABLE';



IF C1 > 0 

THEN SELECT

	 * 

FROM :T1 

WHERE FIELD_VALIDITY = 'WRONG FIELD/TABLE' 

;



ELSE DELETE 

FROM FI_OUTPUT

;

DELETE 

FROM FI_TEMP

;

SELECT

	 COUNT(*) 

INTO COUNTER 

FROM FI_INPUT

;

FOR I IN 1..:COUNTER DO GEN_QUERY = 'INSERT INTO FI_TEMP (SELECT ' ||:I|| ',PACKAGE_ID,OBJECT_NAME ,"CDATA" FROM "_SYS_REPO"."ACTIVE_OBJECT" 

		 WHERE ("PACKAGE_ID" LIKE '||:PK||') 

		 AND (CDATA LIKE (SELECT FIELD_QUERY FROM FI_INPUT WHERE NUM = '||:I||') AND CDATA LIKE (SELECT TABLE_QUERY FROM FI_INPUT WHERE NUM = '||:I||') ))'

;

EXEC (:GEN_QUERY)

;



END FOR

;

INSERT 

INTO FI_OUTPUT (SELECT

	 TABLE,

	FIELD,

	PACKAGE,

	OBJ_NAME 

	FROM FI_INPUT 

	INNER JOIN FI_TEMP ON ID = NUM)

;

SELECT

	 * 

FROM FI_OUTPUT

;



END 

IF

;



END 

;

The Procedure and the below tables will be created once the snippet is run.

Running the Procedure:

The FI_INPUT table needs to be populated with the fields to be identified.

Insert values for the first 4 fields, the last two fields are auto generated, incorporating the XML tag info mentioned earlier.

INSERT INTO "FI_INPUT" VALUES(1,'SCHEMA_NAME','TABLE_NAME','FIELD_NAME');

The procedure only runs for valid Table/fields and will keep displaying the erroneous table/field in the output until corrected.

The procedure takes a package name as input, If a root package is specified then all models under the root/sub-packages will be searched.

CALL "FIELD_DEPENDENCY"('PackageName');

Example:

INSERT INTO FI_INPUT VALUES(1,'ECC','VBRP','NETWR');

INSERT INTO FI_INPUT VALUES(2,'ECC','VBRK','VBELN');

INSERT INTO FI_INPUT VALUES(3,'ECC','VBRK','ABCD');--THIS IS FOR A TEST

The procedure won't run until the invalid entries are corrected.

Once the correction is made the Procedure can be run to get the output.

These fields are being used in the below models.

Please let me know if there are any questions.