SAP HANA: On which base columns do my information views depend?
NOTE: The code samples may look mangled on the SAP community blog platform. This is not something I can help with – it looks fine in the preview, but not when published. A more readable version of this article is available here: http://rpbouman.blogspot.nl/2016/10/sap-hana-on-which-base-columns-do-my.html
For one of Just-BI's customers, we're currently working to productize a custom-built proof-of-concept SAP/HANA application.
This particular application is quite typical with regard to how we use SAP/HANA features and techniques: it is a web-application for desktop and tablet devices, that we serve through SAP/HANA’s XS engine. For database communication, we mostly use OData (using XS OData Services), as well as the odd xsjs request (in this case, to offer MS Excel export using ODXL)
The OData services that our application uses are mostly backed by SAP/HANA Calculation views. These, in turn, are built on top of a mixed bag of objects:
- Some of these are custom base tables that belong to just our application;
- Some are base tables that collect output from an advanced analytics recommendations algorithm that runs in an external R server;
- Some are information views (analytic views, attribute views and calculations views) that form a virtual datamart (of sorts) on top of base tables replicated from various SAP ERP source systems to our SAP/HANA database.
One of the prerequisites to productize the current solution is a re-design of the backend. Redesign is required because the new target system will be fed from even more ERP source systems than our proof-of-concept environment, and the new backend will need to align the data from all these different ERP implementations. In addition, the R algorithm will be optimized as well: in the proof-of-concept environment, the advanced analytics algorithm passes through a number of fields for convenience that will need to be acquired from elsewhere in the production environment.
To facilitate the redesign we need to have accurate insight into which base columns are ultimately used to implement our application’s data services.
As it turns out, this is not so easily obtainable using standard tools. So, we developed something ourselves. We think this may be useful for others as well, which is why we’d like to share it with you through this blog.
Information View Dependencies
The standard toolset offers some support to obtain dependencies for information views (analytic views, attribute views and calculation views):
- If you’re a HANA Studio user, you might be able to use the “Where-used-list” and/or “Column lineage” features. Check out Krishnamoh Krishna’s wonderful blog about this topic.
- You can query the OBJECT_DEPENDENCIES system view.
As it turns out, these standard tools do not give us the detailed information that we need.
The HANA studio features are mainly useful when designing and modifying information views, but do not let us obtain an overview of all dependencies, and not in a way that we can easily use outside of HANA Studio.
The usefulness of querying the OBJECT_DEPENDENCIES system view is limited by the fact that it only reports objects – that is, base tables or information views – but not the columns contained therein.
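For example, the following query (using the same placeholder view name as the examples later in this post) returns only whole objects a view depends on – never individual columns:

```sql
select base_schema_name
,      base_object_name
,      base_object_type   -- e.g. 'TABLE' or 'VIEW'; column-level detail is not available
from   object_dependencies
where  dependent_schema_name = '_SYS_BIC'
and    dependent_object_name = 'my.package.name/CA_MY_CALCULATION_VIEW'
and    dependent_object_type = 'VIEW'
```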
It looks like we’re not the only ones struggling with this issue.
Getting the information view’s definition as XML from _SYS_REPO.ACTIVE_OBJECT
To get the kind of information we need, we’re just going to have to crack open the definition of the information view and look what’s inside.
As it turns out, HANA stores this as XML in the CDATA column of the _SYS_REPO.ACTIVE_OBJECT system table, and we can query it by package name, object name and object suffix (which is basically the extension of the file containing the definition that is stored in the repository):
SELECT CDATA
FROM   _SYS_REPO.ACTIVE_OBJECT
WHERE  PACKAGE_ID    = 'my.package.name'
AND    OBJECT_NAME   = 'CA_MY_CALCULATION_VIEW'
AND    OBJECT_SUFFIX = 'calculationview'
With some effort, _SYS_REPO.ACTIVE_OBJECT can be joined to OBJECT_DEPENDENCIES to discover the objects on which the information view depends:
SELECT od.BASE_SCHEMA_NAME
,      od.BASE_OBJECT_NAME
,      od.BASE_OBJECT_TYPE
FROM   _SYS_REPO.ACTIVE_OBJECT ao
INNER JOIN OBJECT_DEPENDENCIES od
ON     '_SYS_BIC' = od.DEPENDENT_SCHEMA_NAME
AND    ao.PACKAGE_ID||'/'||ao.OBJECT_NAME = od.DEPENDENT_OBJECT_NAME
AND    'VIEW' = od.DEPENDENT_OBJECT_TYPE
WHERE  ao.PACKAGE_ID    = 'my.package.name'
AND    ao.OBJECT_NAME   = 'CA_MY_CALCULATION_VIEW'
AND    ao.OBJECT_SUFFIX = 'calculationview'
(Note that OBJECT_DEPENDENCIES reports all dependencies, not just direct dependencies.)
Or we can query the other way around, and find the corresponding model for a dependency we found in OBJECT_DEPENDENCIES:
SELECT ao.PACKAGE_ID
,      ao.OBJECT_NAME
,      ao.OBJECT_SUFFIX
,      ao.CDATA
FROM   object_dependencies od
INNER JOIN _SYS_REPO.ACTIVE_OBJECT ao
ON     SUBSTR_BEFORE(od.base_object_name, '/') = ao.package_id
AND    SUBSTR_AFTER(od.base_object_name, '/')  = ao.object_name
AND    ao.object_suffix IN (
         'analyticview'
       , 'attributeview'
       , 'calculationview'
       )
WHERE  od.DEPENDENT_SCHEMA_NAME = '_SYS_BIC'
AND    od.DEPENDENT_OBJECT_NAME = 'my.package.name/CA_MY_CALCULATION_VIEW'
AND    od.DEPENDENT_OBJECT_TYPE = 'VIEW'
NOTE: It turns out that querying OBJECT_DEPENDENCIES fails at reporting dependencies between analytic views and the attribute views they use. To capture those dependencies, you need to query _SYS_REPO.ACTIVE_OBJECTCROSSREF instead.
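A sketch of such a query (the FROM_*/TO_* column names are quoted from memory and are an assumption – verify them against your system before relying on this):

```sql
select to_package_id     -- package of the attribute view being used
,      to_object_name    -- name of the attribute view being used
from   _SYS_REPO.ACTIVE_OBJECTCROSSREF
where  from_package_id  = 'my.package.name'
and    from_object_name = 'AN_MY_ANALYTIC_VIEW'
```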
Parsing the information view's XML definition with a stored procedure
Once we have obtained the XML that defines the information view, we still need to pull it apart so we can figure out how it is tied to our base table columns.
To do that, we first apply a general XML parser that turns the XML text into a (temporary) table or table variable, such that each row represents a distinct, atomic element inside the XML document.
For this purpose I developed a HANA stored procedure called p_parse_xml. Here is its signature:
create PROCEDURE p_parse_xml (
  -- XML string to parse
  p_xml nclob
  -- Parse tree is returned as a table variable
, out p_dom table (
    -- unique id of the node
    node_id int
    -- id of the parent node
  , parent_node_id int
    -- dom node type constant: 1=element, 2=attribute, 3=text, 4=cdata,
    -- 5=entityref, 6=entity, 7=processing instruction,
    -- 8=comment, 9=document, 10=document type,
    -- 11=document fragment, 12=notation
  , node_type tinyint
    -- dom node name: tagname for element, attribute name for attribute,
    -- target for processing instruction,
    -- document type name for document type,
    -- "#text" for text and cdata, "#comment" for comment,
    -- "#document" for document, "#document-fragment" for document fragment.
  , node_name nvarchar(64)
    -- dom node value: text for text, comment, and cdata nodes,
    -- data for processing instruction node, null otherwise.
  , node_value nclob
    -- raw token from the parser
  , token_text nclob
    -- character position of token
  , pos int
    -- length of token.
  , len int
  )
  -- flag whether to strip text nodes that only contain whitespace from the parse tree
, p_strip_empty_text tinyint default 1
)
Note that you can download the source code for the entire procedure from github.
The p_parse_xml procedure depends on another procedure (also available from the github repository), so if you want to run it yourself, be sure to install that first.
To see how you can use this, consider the following, simple example:
call p_parse_xml(
  '<parent-element attribute1="value1">
    <child-element attribute2="value2" attribute3="value3">
      text-content1
    </child-element>
    <child-element att="bla">
      text-content2
    </child-element>
  </parent-element>'
, ?
);
This gives us the following result:
+---------+----------------+-----------+----------------+---------------+----------------------------------------------------------+-----+-----+
| NODE_ID | PARENT_NODE_ID | NODE_TYPE | NODE_NAME      | NODE_VALUE    | TOKEN_TEXT                                               | POS | LEN |
+---------+----------------+-----------+----------------+---------------+----------------------------------------------------------+-----+-----+
|       0 |              ? |         9 | #document      | ?             | ?                                                        |   1 | 221 |
|       1 |              0 |         1 | parent-element | ?             | <parent-element attribute1="value1">                     |   1 |  36 |
|       2 |              1 |         2 | attribute1     | value1        | attribute1="value1"                                      |   2 |  20 |
|       3 |              1 |         1 | child-element  | ?             | <child-element attribute2="value2" attribute3="value3">  |  41 |  55 |
|       4 |              3 |         2 | attribute2     | value2        | attribute2="value2"                                      |  42 |  20 |
|       5 |              3 |         2 | attribute3     | value3        | attribute3="value3"                                      |  62 |  20 |
|       6 |              3 |         3 | #text          | text-content1 | text-content1                                            |  96 |  23 |
|       7 |              1 |         1 | child-element  | ?             | <child-element att="bla">                                | 139 |  25 |
|       8 |              7 |         2 | att            | bla           | att="bla"                                                | 140 |  10 |
|       9 |              7 |         3 | #text          | text-content2 | text-content2                                            | 164 |  23 |
+---------+----------------+-----------+----------------+---------------+----------------------------------------------------------+-----+-----+
The result is a tabular representation of the XML parse tree. Each row essentially represents a DOM Node, and the column values represent the node’s properties:
The NODE_TYPE column tells us what kind of node we're dealing with. Values in this column conform to the w3c standard document object model (DOM) enumeration of node type values.
The most important ones are 1 for element nodes (“tags”); 2 for attributes, and 3 for text. The entire parse tree is contained in a document node, which has node type 9.
NODE_ID is the unique identifier of the node, while PARENT_NODE_ID points to whatever node is considered the parent node of the current node.
The parent node is basically the container of the node.
As you can see, the element with NODE_ID=3 has the element node with NODE_ID=1 as its parent. These correspond to the first <child-element> and the <parent-element> elements in the document.
Attribute nodes are also marked as children of the element to which they belong. The DOM standard does not consider attributes children of their respective element node, but p_parse_xml does, mainly to keep the result table as simple as possible.
The NODE_NAME column is a further characterization of what kind of node we're dealing with. For most node types, the node name is a constant value which is essentially a friendly name for the node type. For example, document nodes (NODE_TYPE=9) always have "#document" as NODE_NAME, and text nodes (NODE_TYPE=3) always have "#text" as NODE_NAME.
For element nodes and attribute nodes (NODE_TYPE=1 and NODE_TYPE=2 respectively), the NODE_NAME is not constant. Rather, their node name conveys information about the meaning of the node and its contents. In other words, element and attribute names are metadata.
The NODE_VALUE column contains actual data. For element and document nodes, it is always NULL. For attributes, the NODE_VALUE column contains the attribute value, and for text nodes, it is the text content.
POS lists the position where the current element was found; the LEN column keeps track of the length of the current item as it appears in the document. Typically you won't need these columns, except maybe for debugging purposes. The TOKEN_TEXT column is also here mostly for debugging purposes.
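Since the parse tree is an ordinary table variable, you can explore it with plain SQL. For example, counting the nodes per node type in the example document above:

```sql
select node_type
,      count(*) as node_count
from   :tab_dom
group by node_type
order by node_type
-- for the example above: 3 elements (type 1), 4 attributes (type 2),
-- 2 text nodes (type 3), and 1 document node (type 9)
```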
Extracting Base Columns from Analytic and Attribute views
If you examine the XML definition of Analytic and/or Attribute views, you'll notice that base table columns are referenced using <keyMapping>-elements like this:
<keyMapping schemaName="...database schema..." columnObjectName="...table name..." columnName="...column name..."/>
So, assuming we already parsed the model of an analytic or attribute view using p_parse_xml and captured its result in a table variable called tab_dom, we can run a query like this to obtain all <keyMapping>-elements:
select keyMapping.*
from   :tab_dom keyMapping
where  keyMapping.node_type = 1            -- get us all elements
and    keyMapping.node_name = 'keyMapping' -- with the tagname 'keyMapping'
While this gives us the actual elements, the data we're interested in is buried in the attributes of the <keyMapping> elements. You might recall that in the p_parse_xml result, attribute nodes have NODE_TYPE=2 and appear as child nodes of their respective element. So, we can extract all attributes of all <keyMapping> elements with a self-join like this:
select keyMapping_attributes.*
from   :tab_dom keyMapping
inner join :tab_dom keyMapping_attributes
-- find all nodes that have the keyMapping element node as parent
on     keyMapping.node_id = keyMapping_attributes.parent_node_id
-- but only if their node type indicates they are attribute nodes
and    2 = keyMapping_attributes.node_type
where  keyMapping.node_type = 1
and    keyMapping.node_name = 'keyMapping'
Since we are interested in not just any attribute node, but attribute nodes having specific names like columnName, we should put a further restriction on the NODE_NAME of these attribute nodes. Also note that this query will potentially give us multiple rows per <keyMapping>-element (in fact, just as many as there are attributes). Since we'd like to have just one row for each <keyMapping>-element, having the values of its schemaName, columnObjectName and columnName attributes in separate columns, we should rewrite this query so that each attribute gets its own self-join.
Thus, the final query becomes:
select keyMapping_schemaName.node_value       as schema_name
,      keyMapping_columnObjectName.node_value as table_name
,      keyMapping_columnName.node_value       as column_name
from   :tab_dom keyMapping
-- get the attribute called 'schemaName'
inner join :tab_dom keyMapping_schemaName
on     keyMapping.node_id = keyMapping_schemaName.parent_node_id
and    2                  = keyMapping_schemaName.node_type
and    'schemaName'       = keyMapping_schemaName.node_name
-- get the attribute called 'columnObjectName'
inner join :tab_dom keyMapping_columnObjectName
on     keyMapping.node_id = keyMapping_columnObjectName.parent_node_id
and    2                  = keyMapping_columnObjectName.node_type
and    'columnObjectName' = keyMapping_columnObjectName.node_name
-- get the attribute called 'columnName'
inner join :tab_dom keyMapping_columnName
on     keyMapping.node_id = keyMapping_columnName.parent_node_id
and    2                  = keyMapping_columnName.node_type
and    'columnName'       = keyMapping_columnName.node_name
where  keyMapping.node_type = 1
and    keyMapping.node_name = 'keyMapping'
Extracting base columns from Calculation views
Getting the base columns used in calculation views is a bit more work.
However, the good news is that in terms of the queries we need to write, it does not get much more complicated than what we witnessed for analytic and attribute views in the previous section.
Querying the xml parse tree almost always boils down to finding elements and finding their attributes, and then doing something with their values.
The reason why it is more work to write queries against the model underlying calculation views is that the XML documents that define calculation views use an extra level of mapping between the objects that represent the source of the columns and the way these columns are used inside the view.
The following snippet might illustrate this:
<Calculation:scenario ...>
  ...
  <dataSources>
    ...
    <DataSource id="...some id used to refer to this datasource..." type="DATA_BASE_TABLE">
      ...
      <columnObject schemaName="...db schema name..." columnObjectName="...table name..."/>
      ...
    </DataSource>
    ...
  </dataSources>
  ...
  <calculationViews>
    ...
    <calculationView>
      ...
      <input node="#...id of a DataSource element...">
        ...
        <mapping source="...name of a column used as input..." ... />
        ...
      </input>
      ...
    </calculationView>
    ...
  </calculationViews>
  ...
</Calculation:scenario>
The method for finding the base columns can be summarized as follows:
- Get all <DataSource>-elements having a type-attribute with the value DATA_BASE_TABLE. These elements represent all base tables used by this view. Other types of objects used by this view will have another value for the type-attribute. To obtain the schema and table name of the base table, find the <columnObject>-child element of the <DataSource>-element: its schemaName- and columnObjectName-attributes respectively contain the database schema and table name of the base table. The <DataSource>-elements have an id-attribute, and its value is used as a unique identifier to refer to this data source.
- Find all instances where the base table datasources are used. A calculation view is essentially a graph of data transformation steps, each of which takes one or more streams of data as input, turning it into a stream of output data. In the XML document that defines the calculation view, these transformation steps are represented by <calculationView>-elements. The <calculationView>-elements contain one or more <input>-child elements, each of which represents a data stream that is used as input for the transformation step. The <input>-elements have a node-attribute. The value of the node-attribute is the value of the id-attribute of whatever element it refers to, prefixed by a hash-sign (#). Note that this is a general technique to reference elements within the same XML document. So, in order to find where a <DataSource>-element is used, it is enough to find all elements in the same XML document that reference the value of the <DataSource>-element's id-attribute in the value of their node-attribute.
- Once we have the elements that refer to our <DataSource>-element, we can find out which columns from the data source are used by looking for their <mapping>-child elements. The <mapping>-elements have a source-attribute, which holds the column name.
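The id/node reference resolution described above can be tried in isolation. The following sketch pairs every id-attribute in the parse tree with the node-attributes that reference it (the cast mirrors the one in the full query that follows, since NCLOB values cannot be compared directly):

```sql
select id.parent_node_id  as defining_element   -- the element carrying the id
,      ref.parent_node_id as referencing_element -- the element whose node-attribute points at it
from   :tab_dom id
inner join :tab_dom ref
on     2      = ref.node_type
and    'node' = ref.node_name
and    '#'||id.node_value = cast(ref.node_value as nvarchar(128))
where  2    = id.node_type
and    'id' = id.node_name
```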
With these steps in mind, the SQL query we need to do on the calculation view parse tree becomes:
select distinct
       ds_co_schemaName.node_value        schema_name
,      ds_co_columnObjectName.node_value  table_name
,      ds_usage_mapping_source.node_value column_name
--
-- ds: DataSource elements (Note the WHERE clause)
--
from   :tab_dom ds
--
-- ds_type: demand that the value of the type-attribute of the DataSource elements
-- equals 'DATA_BASE_TABLE'; this ensures we're only looking at base tables.
--
inner join :tab_dom ds_type
on     ds.node_id = ds_type.parent_node_id
and    2          = ds_type.node_type
and    'type'     = ds_type.node_name
and    'DATA_BASE_TABLE' = cast(ds_type.node_value as varchar(128))
--
-- ds_co: get the columnObject child element of the DataSource element.
-- Also, get the schemaName and columnObjectName attributes of that columnObject-element.
--
inner join :tab_dom ds_co
on     ds.node_id     = ds_co.parent_node_id
and    1              = ds_co.node_type
and    'columnObject' = ds_co.node_name
inner join :tab_dom ds_co_schemaName
on     ds_co.node_id  = ds_co_schemaName.parent_node_id
and    2              = ds_co_schemaName.node_type
and    'schemaName'   = ds_co_schemaName.node_name
inner join :tab_dom ds_co_columnObjectName
on     ds_co.node_id      = ds_co_columnObjectName.parent_node_id
and    2                  = ds_co_columnObjectName.node_type
and    'columnObjectName' = ds_co_columnObjectName.node_name
--
-- ds_id: get the id-attribute of the DataSource element.
--
inner join :tab_dom ds_id
on     ds.node_id = ds_id.parent_node_id
and    2          = ds_id.node_type
and    'id'       = ds_id.node_name
--
-- ds_usage: find any attributes that refer to the id of the DataSource
--
inner join :tab_dom ds_usage
on     'node' = ds_usage.node_name
and    2      = ds_usage.node_type
and    '#'||ds_id.node_value = cast(ds_usage.node_value as nvarchar(128))
--
-- ds_usage_mapping: find any mapping child elements of the node that references the DataSource
--
inner join :tab_dom ds_usage_mapping
on     'mapping' = ds_usage_mapping.node_name
and    1         = ds_usage_mapping.node_type
and    ds_usage.node_id = ds_usage_mapping.parent_node_id
--
-- ds_usage_mapping_source: get the source of the mapping elements.
-- These are our base column names.
--
inner join :tab_dom ds_usage_mapping_source
on     'source' = ds_usage_mapping_source.node_name
and    2        = ds_usage_mapping_source.node_type
and    ds_usage_mapping.node_id = ds_usage_mapping_source.parent_node_id
where  ds.node_type = 1
and    ds.node_name = 'DataSource'
Putting it all together
To recapitulate, we discussed:
- How to do general queries for dependencies using OBJECT_DEPENDENCIES, and that you need to query _SYS_REPO.ACTIVE_OBJECTCROSSREF to find out which attribute views are used by analytic views.
- How to find the model XML code underlying our information views in the _SYS_REPO.ACTIVE_OBJECT table.
- How to parse XML, and how to query the parse tree for elements and attributes
- How the XML documents for information views are structured, and how to find base columns used in their models
With all these bits and pieces of information, we can finally create a procedure that fulfills the original requirement to obtain the base columns used by our information views.
This is available as the p_get_view_basecols stored procedure.
Here is its signature:
create PROCEDURE p_get_view_basecols (
  -- package name pattern. Used to match packages containing analytic, attribute
  -- or calculation views. Can contain LIKE wildcards.
  p_package_id nvarchar(255)
  -- object name pattern. Used to match name of analytic, attribute or calculation views.
  -- Can contain LIKE wildcards.
, p_object_name nvarchar(255) default '%'
  -- object suffix pattern. Can be used to specify the type of view. Can contain LIKE wildcards.
, p_object_suffix nvarchar(255) default '%'
  -- flag to indicate whether to recursively analyze analytic, attribute or calculation views
  -- on which the view to be analyzed depends.
  -- 0 means only look at the given view, 1 means also look at underlying views.
, p_recursive tinyint default 1
  -- result table: base columns on which the specified view(s) depends.
, out p_cols table (
    -- schema name of the referenced base column
    schema_name nvarchar(128)
    -- table name of the referenced base column
  , table_name nvarchar(128)
    -- column name of the referenced base column
  , column_name nvarchar(128)
    -- list of view names that depend on the base column
  , views nclob
  )
)
Obtaining the list of base columns on which our application depends is now as simple as calling the procedure, like so:
call p_get_view_basecols(
  -- look in our application package (and its subpackages)
  'our.application.package.%'
  -- consider all information views
, '%'
  -- consider all types of information views
, '%'
  -- consider also information views upon which our information views depend
, 1
  -- put the results into our output table
, ?
);
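From here on, the output is ordinary tabular data, so it can be post-processed at will. For example, the following sketch (inside an anonymous SQLScript block; this assumes your HANA version supports DO BEGIN ... END) summarizes how many distinct base columns each table contributes:

```sql
do begin
  -- capture the procedure output in a table variable
  call p_get_view_basecols(
    'our.application.package.%'
  , '%'
  , '%'
  , 1
  , cols
  );
  -- count the base columns used per table
  select   schema_name
  ,        table_name
  ,        count(*) as used_columns
  from     :cols
  group by schema_name
  ,        table_name
  order by used_columns desc;
end;
```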
I hope you enjoyed this post!
Feel free to leave a comment to share your insights or to give feedback.
Please note that all source code for this topic is freely available as open source software in our just-bi/hades github repository.
You are free to use, modify and distribute it, as long as you respect the copyright notice.
We welcome contributions! You can contribute in many ways:
- Simply use the procedures. Give us feedback. You can do so by leaving a comment on this blog.
- Spread the word: tell your colleagues, and maybe tweet or write a blog post about it. Please use hashtag #justbihades
- Share your requirements. Create an issue to ask for more features so we can improve our software.
- Fork it! Send us pull requests. We welcome your contribution and we will fully attribute you!
That's fantastic! Thank you for sharing how to deal with the CDATA column.
Hi Jason, thanks for the kind words! Glad this was useful to you 🙂
Note - since initial release, a few bugs have been fixed, and performance has been improved. Get the latest from github.
I was working on exactly the same thing without knowing you already had this done. The difference being I found the Repository XML rather cryptic. So I picked up the View XML to parse. I had some success as well. Would you happen to know where (in which table) the View XML is stored?
I’m not sure whether it is stored in a table. But if it is, I suppose you should be able to find out by querying the M_SQL_PLAN_CACHE table whilst opening the information view with hana studio.
Alternatively, you might figure it out by monitoring the requests that the web ide sends when you open a view, and then try to figure out how it gets it by cracking open the xsjs/xsjslib sources of the web ide.
Hi Roland - Thanks for the fantastic blog! We managed to parse the View xml and get column level details for all the Calculation views and their dependencies.
Now with option for using WebIDE, I have the same question where the XML can be found with the calculation views owned by the HDI container schema and design metadata probably stored in Git.
You have suggested M_SQL_PLAN_CACHE. I have not tried this, but M_ views being dynamic views, they can give only run-time info (with my limited knowledge, I presume data might be available only when the actual Calculation view in question is executed - please correct my understanding).
Any update or help here will be much appreciated!
Thanks for the kind words and your interest in HADES.
I'm not sure - this may sound silly, but what is the "HDI Container"? Is this about HANA 2.0? I haven't had a chance to play with that.
Yeah the plan cache will only contain traces of what is or has been executed, and frankly I don't think we'll ever see any XML in there.
Perhaps if you can provide some links about this HDI container, or describe it in a way I can reproduce your use case exactly (point for point - click here, click there type of instructions) I can answer your question better.
Thanks and kind regards,
PS: I think HANA 2.0 has functionality to parse XML directly using an "XML Table" feature. I think it would be a good idea to rewrite the dom parser with that. I can't try though since I don't have access to HANA 2.0.
Yes I am referring to SAP HANA 2.0. Please refer to this blog on containers (https://blogs.sap.com/2015/12/08/sap-hana-sps-11-new-developer-features-hdi/)
and specifically the statement "Database objects are now owned by a container-specific technical object owner. There is no longer a single all-powerful technical user (_SYS_REPO). Each technical user only has access to its local container objects. Any foreign objects must be accessed via Synonym and granted access by the foreign technical user.".
Access to _SYS_REPO from the container is restricted and I am unable to access _SYS_REPO.ACTIVE_OBJECT, which means the access to the XML for the Info Views is not there any more. In one of the threads, I had this confirmation ( https://answers.sap.com/questions/628829/does-sys-repoactive-object-still-valid-when-done-u.html ).
Still I am trying to dig more. In case you get a chance, please try from your end. I will see if I can send you a step wise approach.
Yes I would really appreciate if you could let me know when you find out something more.
Hi Roland..hope you are doing well! Back to this topic after 2 years!
Is the above method of parsing the View xml (from _SYS_REPO.ACTIVE_OBJECT) and getting column level details for all the Calculation views and their dependencies valid for CDS views?
I have tried these methods primarily for HANA Information views and not for CDS views.
Need your inputs..thanks!
Thanks for the blog.
We were working on exactly the same requirement. I tried to execute the steps you mentioned in the blog, but I'm getting the error message "wrong number or types of parameters in call: P_ERROR is not bound" when calling the p_parse_xml procedure.
Any idea how to resolve it?
After writing the blog, the interface of the stored proc changed: the p_parse_xml procedure now takes an additional output parameter for reporting errors.
So, the first argument is a table to catch the parse tree, the second argument is a table to report any errors. The last argument is the flag to control whitespace behavior.
After calling p_parse_xml, the caller should check the contents of the p_error table - it will contain 0 rows if the parse was successful. If the p_error table contains 1 or more rows, then the caller can examine these to see what the problem is. In that case, the p_dom table may contain a partial parse tree, but even that partial tree may not be accurate.
Here is a snippet that illustrates how that works (taken from p_get_view_basecols):
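In sketch form, the pattern looks like this (illustrative rather than the literal code; the exact signature and the error table's columns may differ, so check the github sources):

```sql
do begin
  declare v_xml nclob = '<parent-element><child-element/></parent-element>';
  -- second argument catches the parse tree, third catches any errors
  call p_parse_xml(:v_xml, tab_dom, tab_error, 1);
  if record_count(:tab_error) > 0 then
    -- inspect the errors; the parse tree may be partial or inaccurate
    select * from :tab_error;
  else
    -- parse was successful: use the parse tree
    select * from :tab_dom;
  end if;
end;
```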
Hi Roland Bouman,
Thanks a ton for the blog and providing valuable information for column level dependency view.
Thanks Rutvik! Much appreciated 🙂
A small query from my side : How can I restrict the result for specific schema ? Is there any way that gives the list for dependent object based on specific schema table?
> How can I restrict the result for specific schema ?
I understand what you're asking. However, this is about the dependencies of information views. Information views do not reside in a HANA database schema; rather they are HANA content, which sits in a repository package. The information views themselves can use database schema objects (such as base tables, database views etc), but they are not themselves schema objects.
The signature for all procedures described here have the LIKE pattern for the package identifier as first argument.
Thank you for the answer. I have modified the procedure to restrict the information views based on the name text and it works. However, I would like to know whether we can retrieve such column dependency for script based calculation views. For example, in the view SQL script, if query is defined on column tables, is it possible to find the dependency of such tables?
> ” I have modified the procedure to restrict the information views based on the name text and it works. ”
I’m not quite sure what you mean. Can you explain? Do you think your modification could benefit others? Maybe consider creating a new procedure with your modification and submitting a pull request to our github repository so the rest of the world can benefit?
> “However, I would like to know whether we can retrieve such column dependency for script based calculation views. For example, in the view SQL script, if query is defined on column tables, is it possible to find the dependency of such tables?”
Given enough time and effort, anything and everything can be done. Question is, what’s it worth to you? You can either build it yourself, and hopefully, send a pull request to our github repository. Or you can file an issue on our github tracker, and maybe we’ll pick it up.
This is an incredible and huge amount of work you've done with this xml_parser for CDATA : Bravo and a BIG Thanks
I was wondering if you had a chance to test the XMLTABLE table function with the column "_SYS_REPO". "ACTIVE_OBJECT". "CDATA" ?
This is a HANA 2.0 "new thing" but you may have upgraded your database a while ago.
I'm struggling with XMLTABLE right now without getting anything out of it, the documentation is thin regarding XMLTABLE and the real world examples are few.
thanks for the kind words. It's been a while since we built this.
I heard about this, but I haven't taken the time yet to check this out. I should, but there's just been so much other work going on.
I understand. I'm sorry, I'm afraid I can't help you with this right now. I would say it seems like the better approach.
That said, we also had good results parsing information view XML sources in xsjs using the built-in SAX parser. We built a tool around it that lets you explore dependencies in a graphical way:
Thanks for the quick answer !
Is it possible to try the app ?
unfortunately - no online trial.
But you can contact the backoffice and arrange for a demo.
Just mention TACL and include the link in the contact form:
If you have some very concrete requirements, please state those in the contact form as well.
Cheers and best regards,