Application Development Blog Posts
Learn and share on deeper, cross technology development topics such as integration and connectivity, automation, cloud extensibility, developing at scale, and security.
cancel
Showing results for 
Search instead for 
Did you mean: 
nomssi
Active Contributor
Implementing a plain text search in ABAP is more challenging than mere querying table attributes. A user-specified search pattern cannot be matched using a simple test for equality. Some sophisticated plain text matching algorithms are needed for attractive Google-like search:

  • the Full Text Search (FTS) matches a pattern exactly.

  • the Fuzzy search performs approximate string matching according to a distance algorithm tuned to a problem domain that finds both exact matches and deviations.

  • the Acoustic search is an approximate String matching according to a phonetic algorithm.


Those algorithms make the difference between searching and finding, but the challenge is to create an efficient implementation.

An approach used by search engines like TREX or the SAP Enterprise Search tool is to pre-process the text and store results in index tables, enabling fast access

https://en.wikipedia.org/wiki/String-searching_algorithm

Without TREX, generic text search might not be high on your agenda, but let us consider other options...

ABAP SQL


Fuzzy search is not supported in ABAP SQL (formerly known as Open SQL). A simple pattern matching is enabled by the LIKE statement of the WHERE clause
SELECT product FROM catalog WHERE name LIKE '%_agento%'.

Long texts saved in tables STXH and STXL are accessed with function modules like READ_TEXT that can also read the text buffer and the archive. This approach does not allow searching for long text.

ABAP Database Connectivity


ABAP Database Connectivity  (ADBC) can be used as a powerful alternative to ABAP SQL to

  • create and manage connections to all database maintained in transaction DBACOCKPIT (table DBCON).

  • execute DB specific statements

  • process the query results


Known Problems:

  • No automatic client handling

  • Security risk: SQL Injection.


Important ADBC Classes



  • CL_SQL_CONNECTION

  • CL_SQL_STATEMENT

    • execute_ddl( ) - CREATE TABLE, DROP TABLE

    • execute_query( ) - SELECT FROM WHERE

    • execute_update( ) - INSERT INTO VALUES ( )



  • CL_SQL_PREPARED_STATEMENT

    • prepare_statement( )



  • CL_SQL_RESULT_SET - Result set of an SQL query

    • set_param_table( )

    • next_package( )

    • close( )



  • CX_SQL_EXCEPTION

  • CL_SHDB_SELTAB method sql_where_condition( ) and the static method COMBINE_SELTABS( ) shall be used to convert SELECT-OPTIONS into Native SQL.

  • Demo Report DEMO_ADBC_DDL_DML


HANA DB


Queries with the CONTAINS( ) predicate can be created via ADBC on HANA DB (HDB) to implement FTS, Fuzzy and Acoustic search (cf. search, analysis and mining).

As a prerequisite, a full-text index must be created for the column in the dictionary (this is only possible for HDB) or the column must be defined as TEXT at HDB table level.

In my scenario, I replicated available long text content into a HDB TEXT column. This in effect creates an index, enabling queries using CONTAINS( ). 

Sample Code




generates SQL statement for fuzzy text search for the pattern TEST OR PALLET
SELECT TOP 200 DISTINCT SCORE() AS FUZZY_SCORE, K.EBELN, P.EBELP    
FROM EKKO AS K INNER JOIN EKPO AS P ON K.EBELN = P.EBELN
AND K.MANDT = P.MANDT
INNER JOIN LFA1 AS L ON K.LIFNR = L.LIFNR AND K.MANDT = L.MANDT
LEFT OUTER JOIN ADRC AS A ON L.ADRNR = A.ADDRNUMBER
AND L.MANDT = A.CLIENT
WHERE K.MANDT = '140' AND A.NATION = ''
AND CONTAINS(P.TXZ01, 'TEST OR PALLET',
FUZZY(0.8, 'textSearch=compare,similarCalculationMode=search' ))
AND K.EKORG = 'GLOB'
AND K.KDATB <= '20190213' AND K.KDATE >= '20190424'
AND K.LOEKZ = '' AND P.LOEKZ = ''
AND SCORE() >= 0.20000000
ORDER BY FUZZY_SCORE DESC

It works! (and yes, you could also use AMDP).

After this successful endeavor, I searched the web and realized the CONTAINS( ) predicate is also available on other platforms.

Sybase


The SAP Netweaver demo system NPL runs on Sybase (SYB or SAP ASE). ABAP does not support full text index for other DBs.

Index ZTOAD-Y01 is of type Full Text. This is not implemented for DB SYB


Message no. DT558


But Sybase also support the CONTAINS( ) statement for exact and fuzzy text search.

So how to enable the indexing? Sybase's help says NGRAM TEXT index search is mainly useful when words are misspelled. Sybase IQ does not support searches like synonyms and antonyms.

Oracle


Oracle supports fuzzy text search with variants of the CONTAINS( ) statement.

SQL Server


MS SQL Server supports full-text searching with the CONTAINS( ) statement.

Other DBs


You could try to use a custom of the string distance functions as a DB procedure if not already available on a given platform.

  • Here is a T-SQL (MS SQL server) implementation of the Levenshtein distance.

  • MySQL has Full Text Search WHERE MATCH(column) AGAINST('Rose', 'Crown') and the SOUNDEX( ) function for acoustic search


SELECT * FROM `author_bios` 
WHERE SOUNDEX(`auth_last_name`) = SOUNDEX('Williams')

Summary


High performance full-text search is a feature implemented by many database in non standard ways. Using ABAP DB Connectivity opens up new vistas, like the helpful CONTAINS( ) statement supported by SAP HANA DB an other databases.

I would call Fuzzy Text search the poster child of ADBC.

 


Content reposted from www.informatik-dv.com
8 Comments