Skip to Content
Author's profile photo Esha Rajpal

Custom Rule Set – SAP HANA Text Search

In this blog, I’ll discuss how to create custom rule set in SAP HANA.  To implement certain custom use cases, customers have to implement their own rule set for performing Text Search Operations.

Search Rule Set

Figure1 below shows the structure of Rule Sets stored in XML/Tree Like Formation.

Figure1: Rule Set Structure

Figure2 below shows the steps to configure and use a Search Rule Set.

*****************************************************************************************

Step1: Add View

First step while configuring the rule set is to define a view. Search operation can be performed on Attribute Information views, Column views of type Join, and SQL views. Other database objects, such as row store tables, column store tables, calculation views, or analytic views, are not supported.

1.1 Create a column table and insert the records in column table as specified below:

CREATE COLUMN TABLE employee
(
  id            INTEGER          PRIMARY KEY,
  firstname     SHORTTEXT(100)   FUZZY SEARCH INDEX ON,
  lastname      SHORTTEXT(100)   FUZZY SEARCH INDEX ON,
  address    NVARCHAR(100)    FUZZY SEARCH INDEX ON,
  postcode      NVARCHAR(20)     ,
  cityname      NVARCHAR(100)    ,
  countrycode   NVARCHAR(2),
);

INSERT INTO employee VALUES(1, 'Michael', 'Milliken', '3999 WEST CHESTER PIKE', '001', 'NEWTON SQUARE', 'PA');

1.2 Create and Activate an Attribute View by selecting the customer table and projecting all the columns in output of attribute view as shown in Figure3.

Figure 3: Attribute View

*****************************************************************************************

Step2: Add Stop Words and Term Mappings

Second step is to configure the nodes “Stopwords (table-based)” and “Term Mappings (table-based)” by creating two tables Stopwords and Termmapping.

2.1 Stopwords

Stopwords are the terms which are less significant terms, therefore these terms are not used to generate the result set. However, these terms do influence score calculations.  A record with stopwords identical to the user input gets a higher score than a record with differing or missing stopwords.

Stopwords can be defined either as single terms or as stopword phrases consisting of multiple terms.

Syntax for creating a term mapping table via SQL:

CREATE COLUMN TABLE stopwords
(
stopword_id VARCHAR(32) PRIMARY KEY,
list_id VARCHAR(32) NOT NULL,
language_code CHAR(2) NOT NULL,
term NVARCHAR(200) NOT NULL
);

Stopwords are stored in a column-store table with the following format:

-- to be able to use stopwords, a stopword table is needed:
CREATE COLUMN TABLE stopwords
(
  stopword_id    VARCHAR(32)    PRIMARY KEY,
  list_id        VARCHAR(32)    NOT NULL,
  language_code  VARCHAR(2),
  term           NVARCHAR(200)  NOT NULL
);

INSERT INTO stopwords VALUES('1', 'firstname', 'en', 'Dr');
INSERT INTO stopwords VALUES('2', 'firstname', 'en', 'Mr');
INSERT INTO stopwords VALUES('3', 'firstname', 'en', 'Mrs');
INSERT INTO stopwords VALUES('4', 'firstname', 'en', 'Sir');
INSERT INTO stopwords VALUES('5', 'firstname', 'en', 'Prof');

2.2 Term Mappings

Term mappings can be used to extend the search by adding additional search terms to the user input. When the user enters a search term, the search term is expanded, and synonyms, hypernyms, hyponyms, and so on are added. Term mappings are defined in a column table and can be changed at any time.

Syntax for creating a term mapping table via SQL:

CREATE COLUMN TABLE termmappings
(
    mapping_id    VARCHAR(32)   PRIMARY KEY,
    list_id       VARCHAR(32)   NOT NULL,
    language_code VARCHAR(2),
    term_1        NVARCHAR(200) NOT NULL,
    term_2        NVARCHAR(200) NOT NULL,
    weight        DECIMAL       NOT NULL
);

Term mappings are defined as a unidirectional replacement. For a term mapping definition of ‘term1’ -> ‘term2’, ‘term1’ is replaced with ‘term2’, but ‘term2’ is not replaced with ‘term1’. Term mappings are language-dependent.

Term mappings are stored in a column-store table with the following format:

-- and for term mappings another table:
CREATE COLUMN TABLE termmappings
(
  mapping_id    VARCHAR(32)   PRIMARY KEY,
  list_id       VARCHAR(32)   NOT NULL,
  language_code VARCHAR(2),
  term_1        NVARCHAR(255) NOT NULL,
  term_2        NVARCHAR(255) NOT NULL,
  weight        DECIMAL       NOT NULL
);

INSERT INTO termmappings VALUES('1', 'firstname', 'en', 'Michael', 'Mike', '0.9');
INSERT INTO termmappings VALUES('2', 'firstname', 'en', 'Mike', 'Michael',  '0.9');
INSERT INTO termmappings VALUES('3', 'firstname', 'en', 'Michael', 'Miky',  '0.8');
INSERT INTO termmappings VALUES('4', 'firstname', 'en', 'Miky', 'Michael',  '0.8');
INSERT INTO termmappings VALUES('5', 'lastname', 'en', 'Milly', 'Milliken', '0.9');
INSERT INTO termmappings VALUES('6', 'lastname', 'en', 'Mille', 'Milliken',  '0.9');

*****************************************************************************************

3 Step3: Add Rule

Next Step is to create a General Project in ABAP perspective. Under that Project create a search rule set file as shown in Figure 4. After creation of search rule set, validate and activate this search rule.

Figure 4: Create a Search Rule Set

**********************************************************************************

4 Step4: Final Step is to perform the search operation.

New Built in function SYS.EXECUTE_SEARCH_RULE_SET is available in HANA to execute defined rule set.

Execute the custom created search rule set with below mentioned command.

CALL SYS.EXECUTE_SEARCH_RULE_SET('
<query>
 <ruleset name="ZSearch_RuleProject:Search_Rule.searchruleset" />
 <column name="FIRSTNAME">Prof. Mike</column>
 <column name="LASTNAME">Milly</column>
 <resultsetcolumn name="_SCORE" />
 <resultsetcolumn name="_RULE_ID" />
 <resultsetcolumn name="_RULE_NUMBER" />
 <resultsetcolumn name="FIRSTNAME" />
 <resultsetcolumn name="LASTNAME" />
</query>
');

This command will give user-defined result set by specifying the result set columns name in call procedure command. Figure 5 below shows the Output of call procedure.

Figure 5: Result Set

While executing the below procedure, in case the below error messages appears in SQL console:

Could not execute ‘CALL SYS.EXECUTE_SEARCH_RULE_SET(‘ <query> <ruleset …’

SAP DBTech JDBC: [258]: insufficient privilege: Not authorized

Recommendation would be to provide authorization of SYS.EXECUTE_SEARCH_RULE_SET to _SYS_REPO.

An alternate way to hold the result set is by transferring the records in result table by using below mentioned command.

CREATE COLUMN TABLE RESULT_STORE (
_SCORE FLOAT,
_RULE_ID VARCHAR(255),
"FIRSTNAME" TEXT,
"LASTNAME" TEXT,
"ADDRESS" NVARCHAR(100),
"POSTCODE" NVARCHAR(20),
"CITYNAME" NVARCHAR(100),
"COUNTRYCODE" NVARCHAR(2)
);

CALL SYS.EXECUTE_SEARCH_RULE_SET('
<query>
 <ruleset name="ZSearch_RuleProject:Search_Rule.searchruleset" />
 <resulttablename name="RESULT_STORE"/>
 <column name="FIRSTNAME">Prof. Mike</column>
 <column name="LASTNAME">Milly</column>
</query>
');
	 
Select * from RESULT_STORE;

Figure 6 below shows the output by querying the result set table RESULT_STORE.

 

Figure 6: Result Set

******************************************************************************************

5 Dynamic Rule Set

There is a possibility to create a dynamic rule set by specifying the XML code in call procedure statement.

CALL SYS.EXECUTE_SEARCH_RULE_SET('
<query>
   <ruleset scoreSelection="firstRule">
      <attributeView name="TRAINING::ZAT_SEARCH_RULESET">
        <keyColumn name="ID"/>
      </attributeView>
      <termMappingsTableBased schema="SYSTEM" table="TERMMAPPINGS">
        <column name="FIRSTNAME">
          <list id="FIRSTNAME"/>
        </column>
        <column name="LASTNAME">
          <list id="LASTNAME"/>
        </column>
      </termMappingsTableBased>
      <rule name="Rule 1">
        <column minFuzziness="0.8"  name="FIRSTNAME">
          <ifMissing action="skipColumn"/>
        </column>
        <column minFuzziness="0.8"  name="LASTNAME">
          <ifMissing action="skipColumn"/>
        </column>
      </rule>
  </ruleset>
 <column name="FIRSTNAME">Mike</column>
 <column name="LASTNAME">Milly</column>
<resultsetcolumn name="_SCORE" />
 <resultsetcolumn name="_RULE_ID" />
 <resultsetcolumn name="_RULE_NUMBER" />
 <resultsetcolumn name="FIRSTNAME" />
 <resultsetcolumn name="LASTNAME" />
</query>
');

Figure 7 shows the output while specifying the dynamic rule set while calling the procedure.

Figure 7: Result Set

To read more about SAP HANA text capabilities refer to blog [ https://blogs.sap.com/2018/02/01/sap-hana-text-capabilities/].

 

 

 

Assigned Tags

      1 Comment
      You must be Logged on to comment or reply to a post.
      Author's profile photo Andre Sousa
      Andre Sousa

      Hi Esha,

       

      Thank you for the post, it's very usefull.

       

      I'm implementing a rule set for the BP search, the problem is that I need to lookup in to the fields NAME_ORG1, NAME_ORG2, NAME_ORG3 and NAME_ORG4 as a group and not field by field. Do you know if it is possible to doing that using the RuleSets?

       

      Thank you.

      BR,

      André Sousa