
Text Mining: A distinctive aspect of HANA

Hello Members,


This blog shows how we can use the Text Mining feature of SAP HANA. As we know, most data today is unstructured, and the need of the day is to extract meaningful and useful information from it. Text Mining gives us a way to extract that information from unstructured data.


More information on Text Mining can be found at the link below:

http://help.sap.com/hana/SAP_HANA_Text_Mining_Developer_Guide_en.pdf


Let us see how we can use the power of text mining in the scenario below:


I fetched unstructured data from the Samsung Facebook page into a HANA database for text mining. The steps required are as follows:

  1. Create a table in HANA to store the unstructured data.
  2. Create a Full Text Index with Text Mining ON for the column in which the text is stored. (The syntax for creating a Full Text Index can be found in the Reference Guide.)
  3. Call the text mining functions. Up to SAP HANA SPS09, Text Mining was supported only through the SAP HANA XS API. From SPS10 onwards, we can use either the SAP HANA XS API or SQL.


I used the SAP HANA XS API to call the text mining functions and displayed the output as an HTML table.

1. SUGGESTED TERMS

This function suggests terms based on an input substring. I passed 'Sams' as the input, and it returned suggested terms from the data loaded into HANA. I restricted the output to the top 16 rows.

Code Snippet:

SuggestedTerms.JPG


Output:

TMSuggestedTerms.jpg
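
To make the idea behind suggested terms concrete, here is a minimal conceptual sketch in plain JavaScript. It is not the SAP HANA XS API call from the screenshot above, and it does not claim to reproduce how HANA ranks terms. It assumes that a suggestion is a term starting with the input string and that more frequent terms rank first. The function name suggestTerms and the sample comments are made up for illustration.

```javascript
// Conceptual sketch only: plain JavaScript, NOT the SAP HANA XS text mining API.
// Assumption: a "suggested term" is any term that starts with the input string,
// ranked by how often it occurs in the loaded comments.
function suggestTerms(documents, prefix, top) {
  var needle = prefix.toLowerCase();
  var counts = Object.create(null); // no inherited keys such as "constructor"
  documents.forEach(function (text) {
    // Split on anything that is not a letter or digit to get rough terms.
    text.toLowerCase().split(/[^a-z0-9]+/).forEach(function (term) {
      if (term && term.indexOf(needle) === 0) {
        counts[term] = (counts[term] || 0) + 1;
      }
    });
  });
  return Object.keys(counts)
    .map(function (term) { return { term: term, frequency: counts[term] }; })
    .sort(function (a, b) {
      // Most frequent first; ties broken alphabetically.
      return b.frequency - a.frequency || a.term.localeCompare(b.term);
    })
    .slice(0, top); // same idea as restricting the output to the top 16 rows
}

// Hypothetical sample data standing in for the Facebook comments.
var sampleComments = [
  'Samsung S6 Edge looks great',
  'My Samsung battery drains fast',
  'Sams club sells the same phone'
];
var suggestions = suggestTerms(sampleComments, 'Sams', 16);
console.log(suggestions);
```

In HANA itself this work is done by the text mining engine on top of the Full Text Index, so no term counting is needed in application code.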

2. RELATED DOCUMENTS

This function returns the comments related to an input string. I passed 'Samsung S6 Edge' and displayed the Comment ID, Post ID and Message, i.e. the Facebook comments related to the input string. Using this function, we fetched the comments posted by Facebook users about a specific product. I restricted the output to the top 16 rows.


Code Snippet:

RelatedDocument.JPG


Output:

TMRelatedDocuments.jpg
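
As a rough illustration of what "related" can mean here, the sketch below ranks comments by cosine similarity between simple term-count vectors. This is plain JavaScript, not the SAP HANA XS API call from the screenshot above. The similarity measure is an assumption chosen for illustration, since this post does not state which measure HANA applies. The field names commentId, postId and message are hypothetical stand-ins for the Comment ID, Post ID and Message columns, and the sample rows are made up.

```javascript
// Conceptual sketch only: plain JavaScript, NOT the SAP HANA XS text mining API.
// Assumption: relatedness is cosine similarity over raw term counts.
function termFrequencies(text) {
  var freq = Object.create(null);
  text.toLowerCase().split(/[^a-z0-9]+/).forEach(function (term) {
    if (term) { freq[term] = (freq[term] || 0) + 1; }
  });
  return freq;
}

function cosineSimilarity(a, b) {
  var dot = 0, normA = 0, normB = 0;
  Object.keys(a).forEach(function (term) {
    normA += a[term] * a[term];
    if (b[term]) { dot += a[term] * b[term]; }
  });
  Object.keys(b).forEach(function (term) { normB += b[term] * b[term]; });
  return (normA && normB) ? dot / Math.sqrt(normA * normB) : 0;
}

function relatedDocuments(rows, inputText, top) {
  var query = termFrequencies(inputText);
  return rows
    .map(function (row) {
      return {
        commentId: row.commentId,
        postId: row.postId,
        message: row.message,
        score: cosineSimilarity(query, termFrequencies(row.message))
      };
    })
    .filter(function (row) { return row.score > 0; }) // drop unrelated comments
    .sort(function (x, y) { return y.score - x.score; }) // most similar first
    .slice(0, top);
}

// Hypothetical rows standing in for the Facebook comments table.
var sampleRows = [
  { commentId: 'c1', postId: 'p1', message: 'The Samsung S6 Edge screen is stunning' },
  { commentId: 'c2', postId: 'p1', message: 'When is the next sale?' },
  { commentId: 'c3', postId: 'p2', message: 'Samsung support was helpful' }
];
var related = relatedDocuments(sampleRows, 'Samsung S6 Edge', 16);
console.log(related);
```

The comment that shares all three input terms ranks above the one that shares only 'Samsung', and the comment with no shared terms is dropped.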


This is just a glimpse of how the text mining feature of HANA can be used in websites and applications to extract only the data that is useful to the organization from a huge amount of unstructured data.


I hope you find my first blog interesting and useful! 🙂



Cheers,

Deepak Varandani
