The idea behind this blog is to share my exploration of SAP HANA Text Analysis. In the process, I used a custom configuration and a custom dictionary to build my full text analysis index tables, and a dynamic query in a procedure to fetch a variable number of columns from a table.

If there are no entries in the text analysis table, the query returns the fields maintained in the custom configuration table; otherwise, it adds the newly requested column, stored in the text table, to the query. The idea is that the same procedure should return data with X or Y columns without any change to the procedure code.


STEP 1: Creating a Custom Configuration .hdbtextconfig:

A custom dictionary is used to define our own tokens and variants. Variants are categorized by TA_TYPE.

In this case, I have built a custom dictionary based on database tables and fields.

  1. In order to do this, I created a new file, customconfig.hdbtextconfig
  2. I copied the contents of EXTRACTION_CORE_VOICEOFCUSTOMER.hdbtextconfig into it
  3. I added the dictionary reference line to my custom configuration file, customconfig.hdbtextconfig
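For reference, registering a custom dictionary in an .hdbtextconfig is typically done by listing it in the Dictionaries property of the extraction configuration. The snippet below is a sketch; the package path and the dictionary file name mydict.hdbtextdict are assumptions for illustration, not taken from this project:

```xml
<property name="Dictionaries" type="string-list">
    <!-- hypothetical package path and dictionary file name -->
    <string-list-value>amohas97.session.textproj::mydict.hdbtextdict</string-list-value>
</property>
```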


Path for TA:

[Image: Path for TA.jpg]

Code in Custom Configuration:

[Image: Code - VoC.jpg]

STEP 2: Creating a Custom Dictionary .hdbtextdict:

I added the following lines to the file to create a custom dictionary. When entries are populated in a table that has a text analysis index created with customconfig.hdbtextconfig, the dictionary below is read to identify the TA_TOKEN and TA_TYPE. I have maintained the table name “” and the field name “status” in the custom dictionary under different categories.

<?xml version="1.0" encoding="utf-8" ?>
<dictionary xmlns="">
    <entity_category name="querytable">
        <entity_name standard_form="">
            <variant name="serial numbers"/>
        </entity_name>
    </entity_category>
    <entity_category name="queryfields">
        <entity_name standard_form="queryfields">
            <variant name="status"/>
        </entity_name>
    </entity_category>
</dictionary>

STEP 3: Tables & Text Analysis full text index:

  1. Table for default selection ““.

     This table is created to maintain the default query. It has 3 fields: schema name, table name, and the default fields. This query gets executed when there are no entries in the text analysis table.

  2. Table to fetch the data from. Maintain some sample records. In this case “

  3. Table to store the requests for the selection of fields from the above table. In this case “WRK_SCH”.“”

  4. Create a full text index on the table “WRK_SCH”.“”. This table will hold the newly requested fields. Refer to the custom configuration file when creating the index.

CONFIGURATION 'amohas97.session.textproj::customconfig'
TOKEN SEPARATORS '\/;,.:-_()[]<>!?*@+{}="&'
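Assembled into a complete statement, the index creation would look roughly as follows. This is a sketch: the index name TXTIDX2 is inferred from the $TA_TXTIDX2 table queried in the procedure, and the table and column names are placeholders, since the original text elides them:

```sql
CREATE FULLTEXT INDEX "TXTIDX2" ON "WRK_SCH"."<request_table>" ("<text_column>")
CONFIGURATION 'amohas97.session.textproj::customconfig'
TOKEN SEPARATORS '\/;,.:-_()[]<>!?*@+{}="&'
TEXT ANALYSIS ON;
```

TEXT ANALYSIS ON is what makes HANA create and fill the $TA_TXTIDX2 table used by the procedure.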

STEP 4: Procedure:

  1. Get the default query maintained in the “” to fetch the data from “”
  2. Check whether there are entries in the text analysis table
  3. If entries exist, get the values of TA_TOKEN & TA_NORMALIZED. These will be the table and field values.
  4. Build a new query using the step 3 information
  5. Execute the query which was built in step 4

                LANGUAGE SQLSCRIPT as
                --SQL SECURITY INVOKER
                --DEFAULT SCHEMA <default_schema_name>
                --READS SQL DATA AS
begin
declare lv_query nvarchar(100) := null;
declare lv_new_q nvarchar(100) := null;
declare lv_old_q nvarchar(100) := null;
declare lv_q_fin nvarchar(1000) := null;
declare lv_table nvarchar(100) := null;
declare lv_fields nvarchar(100) := null;
declare lv_count integer := 0;
-- Configurable table where the fields are maintained for default selection
select <default fields> into lv_old_q from "WRK_SCH".""
where schema_name = 'WRK_SCH';
-- Index table
select count(*) into lv_count from "WRK_SCH"."$TA_TXTIDX2" where ta_type = 'querytable';
if lv_count > 0 then
    -- To get the table name: QUERYTABLE is the category name, and the text "serial numbers" is a token that identifies the table
    select top 1 ta_normalized into lv_table from "WRK_SCH"."$TA_TXTIDX2" where ta_type = 'querytable' order by time desc;
    -- To get the field name: QUERYFIELDS is the category name, and the text "status" is identified as the field
    select top 1 ta_token into lv_fields from "WRK_SCH"."$TA_TXTIDX2" where ta_type = 'queryfields' order by time desc;
    -- To get the COLUMN name, i.e. to check whether the column exists
    select column_name into lv_new_q from table_columns where schema_name = 'WRK_SCH' and table_name = lv_table and upper(comments) = upper(lv_fields);
    -- Concatenate the new field name with the existing query
    lv_query := lv_old_q || ',' || lv_new_q;
    -- Prepare the final query
    lv_q_fin := ' select ' || lv_query || ' from "WRK_SCH"."" ';
else
    -- Existing query maintained in the table
    lv_q_fin := ' select ' || lv_old_q || ' from "WRK_SCH"."" ';
end if;
-- Execute the QUERY
execute immediate :lv_q_fin;
end;

STEP 5: Execution:

Run the procedure in the SQL console: CALL “amohas97.session.ta::dquery”();

When we execute the procedure for the first time, the query returns the data with three columns from the table “”, as the default selection of fields is maintained in the “” table. I have maintained only 3 columns in that table, so the query returns only 3 columns.


1.      Now, in the SQL console, I will insert a new entry with a statement in English into the “” table. I have created a full text index on this table, so whenever an entry is created in it, my text analysis table gets an entry with categorized TA_TOKEN and TA_NORMALIZED values.

insert into "WRK_SCH"."" values('1','shahid','give me the status of serial numbers');

This is how the text analysis table looks:


2.      Now run the procedure again; this time the procedure returns the data with 4 columns.
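To make the effect concrete, with the sample sentence above the procedure would assemble a query of the following shape. The default column names ID, NAME, SERIAL_NO and the table name SERIAL_NUMBERS are hypothetical illustrations; the real values come from the default-selection table and TABLE_COLUMNS:

```sql
-- lv_old_q (hypothetical default selection): 'ID, NAME, SERIAL_NO'
-- lv_new_q resolved from the "status" token:  'STATUS'
select ID, NAME, SERIAL_NO, STATUS from "WRK_SCH"."SERIAL_NUMBERS";
```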


Thank you for reading.




  1. Franziskus Heep

    Is it right that sentiment analysis with SAP HANA is based on a dictionary, and that conducting certain machine learning scenarios is not possible within this medium?


