Skip to Content
Technical Articles

ABAP Text Search with ADBC

Implementing a plain text search in ABAP is more challenging than mere querying table attributes. A user-specified search pattern cannot be matched using a simple test for equality. Some sophisticated plain text matching algorithms are needed for attractive Google-like search:

  • the Full Text Search (FTS) matches a pattern exactly.
  • the Fuzzy search performs approximate string matching according to a distance algorithm tuned to a problem domain that finds both exact matches and deviations.
  • the Acoustic search is an approximate String matching according to a phonetic algorithm.

Those algorithms make the difference between searching and finding, but the challenge is to create an efficient implementation.

An approach used by search engines like TREX or the SAP Enterprise Search tool is to pre-process the text and store results in index tables, enabling fast access

https://en.wikipedia.org/wiki/String-searching_algorithm

Without TREX, generic text search might not be high on your agenda, but let us consider other options…

ABAP SQL

Fuzzy search is not supported in ABAP SQL (formerly known as Open SQL). A simple pattern matching is enabled by the LIKE statement of the WHERE clause

SELECT product FROM catalog WHERE name LIKE '%_agento%'.

Long texts saved in tables STXH and STXL are accessed with function modules like READ_TEXT that can also read the text buffer and the archive. This approach does not allow searching for long text.

ABAP Database Connectivity

ABAP Database Connectivity  (ADBC) can be used as a powerful alternative to ABAP SQL to

  • create and manage connections to all database maintained in transaction DBACOCKPIT (table DBCON).
  • execute DB specific statements
  • process the query results

Known Problems:

  • No automatic client handling
  • Security risk: SQL Injection.

Important ADBC Classes

  • CL_SQL_CONNECTION
  • CL_SQL_STATEMENT
    • execute_ddl( ) – CREATE TABLE, DROP TABLE
    • execute_query( ) – SELECT FROM WHERE
    • execute_update( ) – INSERT INTO VALUES ( )
  • CL_SQL_PREPARED_STATEMENT
    • prepare_statement( )
  • CL_SQL_RESULT_SET – Result set of an SQL query
    • set_param_table( )
    • next_package( )
    • close( )
  • CX_SQL_EXCEPTION
  • CL_SHDB_SELTAB method sql_where_condition( ) and the static method COMBINE_SELTABS( ) shall be used to convert SELECT-OPTIONS into Native SQL.
  • Demo Report DEMO_ADBC_DDL_DML

HANA DB

Queries with the CONTAINS( ) predicate can be created via ADBC on HANA DB (HDB) to implement FTS, Fuzzy and Acoustic search (cf. search, analysis and mining).

As a prerequisite, a full-text index must be created for the column in the dictionary (this is only possible for HDB) or the column must be defined as TEXT at HDB table level.

In my scenario, I replicated available long text content into a HDB TEXT column. This in effect creates an index, enabling queries using CONTAINS( ). 

Sample Code

generates SQL statement for fuzzy text search for the pattern TEST OR PALLET

SELECT TOP 200 DISTINCT SCORE() AS FUZZY_SCORE, K.EBELN, P.EBELP    
FROM EKKO AS K INNER JOIN EKPO AS P ON K.EBELN = P.EBELN 
AND K.MANDT = P.MANDT    
INNER JOIN LFA1 AS L ON K.LIFNR = L.LIFNR AND K.MANDT = L.MANDT    
LEFT OUTER JOIN ADRC AS A ON L.ADRNR = A.ADDRNUMBER 
AND L.MANDT = A.CLIENT    
WHERE K.MANDT = '140' AND A.NATION = ''      
AND CONTAINS(P.TXZ01, 'TEST OR PALLET', 
     FUZZY(0.8, 'textSearch=compare,similarCalculationMode=search' )) 
AND K.EKORG = 'GLOB'     
AND K.KDATB <= '20190213' AND K.KDATE >= '20190424'     
AND K.LOEKZ = '' AND P.LOEKZ = ''     
AND SCORE() >= 0.20000000     
ORDER BY FUZZY_SCORE DESC

It works! (and yes, you could also use AMDP).

After this successful endeavor, I searched the web and realized the CONTAINS( ) predicate is also available on other platforms.

Sybase

The SAP Netweaver demo system NPL runs on Sybase (SYB or SAP ASE). ABAP does not support full text index for other DBs.

Index ZTOAD-Y01 is of type Full Text. This is not implemented for DB SYB

Message no. DT558

But Sybase also support the CONTAINS( ) statement for exact and fuzzy text search.

So how to enable the indexing? Sybase’s help says NGRAM TEXT index search is mainly useful when words are misspelled. Sybase IQ does not support searches like synonyms and antonyms.

Oracle

Oracle supports fuzzy text search with variants of the CONTAINS( ) statement.

SQL Server

MS SQL Server supports full-text searching with the CONTAINS( ) statement.

Other DBs

You could try to use a custom of the string distance functions as a DB procedure if not already available on a given platform.

  • Here is a T-SQL (MS SQL server) implementation of the Levenshtein distance.
  • MySQL has Full Text Search WHERE MATCH(column) AGAINST('Rose', 'Crown') and the SOUNDEX( ) function for acoustic search
SELECT * FROM `author_bios` 
  WHERE SOUNDEX(`auth_last_name`) = SOUNDEX('Williams')

Summary

High performance full-text search is a feature implemented by many database in non standard ways. Using ABAP DB Connectivity opens up new vistas, like the helpful CONTAINS( ) statement supported by SAP HANA DB an other databases.

I would call Fuzzy Text search the poster child of ADBC.

 


Content reposted from www.informatik-dv.com

8 Comments
You must be Logged on to comment or reply to a post.