ABAP Text Search with ADBC
Implementing a plain text search in ABAP is more challenging than mere querying table attributes. A user-specified search pattern cannot be matched using a simple test for equality. Some sophisticated plain text matching algorithms are needed for attractive Google-like search:
- the Full Text Search (FTS) matches a pattern exactly.
- the Fuzzy search performs approximate string matching according to a distance algorithm tuned to a problem domain that finds both exact matches and deviations.
- the Acoustic search is an approximate String matching according to a phonetic algorithm.
Those algorithms make the difference between searching and finding, but the challenge is to create an efficient implementation.
An approach used by search engines like TREX or the SAP Enterprise Search tool is to pre-process the text and store results in index tables, enabling fast access
Without TREX, generic text search might not be high on your agenda, but let us consider other options…
Fuzzy search is not supported in ABAP SQL (formerly known as Open SQL). A simple pattern matching is enabled by the LIKE statement of the WHERE clause
SELECT product FROM catalog WHERE name LIKE '%_agento%'.
Long texts saved in tables STXH and STXL are accessed with function modules like READ_TEXT that can also read the text buffer and the archive. This approach does not allow searching for long text.
ABAP Database Connectivity
ABAP Database Connectivity (ADBC) can be used as a powerful alternative to ABAP SQL to
- create and manage connections to all database maintained in transaction DBACOCKPIT (table DBCON).
- execute DB specific statements
- process the query results
- No automatic client handling
- Security risk: SQL Injection.
Important ADBC Classes
- execute_ddl( ) – CREATE TABLE, DROP TABLE
- execute_query( ) – SELECT FROM WHERE
- execute_update( ) – INSERT INTO VALUES ( )
- prepare_statement( )
- CL_SQL_RESULT_SET – Result set of an SQL query
- set_param_table( )
- next_package( )
- close( )
- CL_SHDB_SELTAB method sql_where_condition( ) and the static method COMBINE_SELTABS( ) shall be used to convert SELECT-OPTIONS into Native SQL.
- Demo Report DEMO_ADBC_DDL_DML
As a prerequisite, a full-text index must be created for the column in the dictionary (this is only possible for HDB) or the column must be defined as TEXT at HDB table level.
In my scenario, I replicated available long text content into a HDB TEXT column. This in effect creates an index, enabling queries using CONTAINS( ).
generates SQL statement for fuzzy text search for the pattern TEST OR PALLET
SELECT TOP 200 DISTINCT SCORE() AS FUZZY_SCORE, K.EBELN, P.EBELP FROM EKKO AS K INNER JOIN EKPO AS P ON K.EBELN = P.EBELN AND K.MANDT = P.MANDT INNER JOIN LFA1 AS L ON K.LIFNR = L.LIFNR AND K.MANDT = L.MANDT LEFT OUTER JOIN ADRC AS A ON L.ADRNR = A.ADDRNUMBER AND L.MANDT = A.CLIENT WHERE K.MANDT = '140' AND A.NATION = '' AND CONTAINS(P.TXZ01, 'TEST OR PALLET', FUZZY(0.8, 'textSearch=compare,similarCalculationMode=search' )) AND K.EKORG = 'GLOB' AND K.KDATB <= '20190213' AND K.KDATE >= '20190424' AND K.LOEKZ = '' AND P.LOEKZ = '' AND SCORE() >= 0.20000000 ORDER BY FUZZY_SCORE DESC
It works! (and yes, you could also use AMDP).
After this successful endeavor, I searched the web and realized the CONTAINS( ) predicate is also available on other platforms.
The SAP Netweaver demo system NPL runs on Sybase (SYB or SAP ASE). ABAP does not support full text index for other DBs.
Index ZTOAD-Y01 is of type Full Text. This is not implemented for DB SYB
Message no. DT558
But Sybase also support the CONTAINS( ) statement for exact and fuzzy text search.
So how to enable the indexing? Sybase’s help says NGRAM TEXT index search is mainly useful when words are misspelled. Sybase IQ does not support searches like synonyms and antonyms.
Oracle supports fuzzy text search with variants of the CONTAINS( ) statement.
MS SQL Server supports full-text searching with the CONTAINS( ) statement.
You could try to use a custom of the string distance functions as a DB procedure if not already available on a given platform.
- Here is a T-SQL (MS SQL server) implementation of the Levenshtein distance.
- MySQL has Full Text Search
AGAINST('Rose', 'Crown')and the SOUNDEX( ) function for acoustic search
SELECT * FROM `author_bios` WHERE SOUNDEX(`auth_last_name`) = SOUNDEX('Williams')
High performance full-text search is a feature implemented by many database in non standard ways. Using ABAP DB Connectivity opens up new vistas, like the helpful CONTAINS( ) statement supported by SAP HANA DB an other databases.
I would call Fuzzy Text search the poster child of ADBC.
Content reposted from www.informatik-dv.com