Former Member

Text Analysis of IPL Match using Twitter Data (Part 2)

Hi Friends,

I am back with a continuation of my earlier blog post:

http://scn.sap.com/community/hana-in-memory/blog/2014/05/16/text-analysis-of-ipl-match-using-twitter-data

In this part, we will focus on custom dictionaries.

Recap:

As the screenshot below shows, when the SQL query is executed for TA_TYPE = 'PERSON', Virat Kohli and Ashwin each appear several times, in separate rows.

[Screenshot: Snap 16 - SQL commands_3.JPG]
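To make the recap concrete without the screenshot, here is a minimal sketch of the effect. It is not HANA code: it uses Python's built-in sqlite3 module as a stand-in, the table name ta_results is hypothetical, and the rows are invented for illustration. Only the column names TA_TOKEN and TA_TYPE follow the text analysis result table.

```python
import sqlite3

# Invented sample rows standing in for a text analysis result table.
# These are NOT real tweets or real query results.
SAMPLE_ROWS = [
    ("Virat Kohli", "PERSON"),
    ("Kohli", "PERSON"),
    ("virat kohli", "PERSON"),
    ("Ashwin", "PERSON"),
    ("R Ashwin", "PERSON"),
    ("Kohli", "PERSON"),
    ("Chennai", "LOCALITY"),
]


def person_counts(rows):
    """Count mentions per extracted token for TA_TYPE = 'PERSON'."""
    conn = sqlite3.connect(":memory:")
    # 'ta_results' is a hypothetical table name used only in this sketch.
    conn.execute("CREATE TABLE ta_results (TA_TOKEN TEXT, TA_TYPE TEXT)")
    conn.executemany("INSERT INTO ta_results VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT TA_TOKEN, COUNT(*) AS mentions "
        "FROM ta_results "
        "WHERE TA_TYPE = 'PERSON' "
        "GROUP BY TA_TOKEN "
        "ORDER BY TA_TOKEN"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for token, mentions in person_counts(SAMPLE_ROWS):
        print(token, mentions)
```

Because the grouping is done on the raw token, every spelling of a player's name ends up in its own row, which is the repetition described above.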

Why this happens:

Different users enter their comments on Twitter in their own way, so it is common for the same cricketer's name to be entered in several different forms.

To standardize the names and make analysis easier, we need to create custom dictionaries so that the system returns one uniform name when the SQL query is executed.
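The idea behind a custom dictionary can be sketched in a few lines of Python. This is only an illustration of mapping name variants to one standard form; it is not how HANA implements dictionaries, and the dictionary contents, variant spellings, and function names below are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical dictionary: one standard form per player plus known variants.
PLAYER_DICTIONARY = {
    "Virat Kohli": ["Kohli", "V Kohli", "Virat"],
    "Ravichandran Ashwin": ["Ashwin", "R Ashwin"],
}


def build_lookup(dictionary):
    """Map every spelling (case-insensitively) to its standard form."""
    lookup = {}
    for standard, variants in dictionary.items():
        for name in [standard, *variants]:
            lookup[name.casefold()] = standard
    return lookup


def standardize(token, lookup):
    """Return the standard form, or the token unchanged if it is unknown."""
    return lookup.get(token.strip().casefold(), token)


def count_by_standard_name(tokens, dictionary):
    """Count mentions after collapsing variants onto their standard form."""
    lookup = build_lookup(dictionary)
    return Counter(standardize(token, lookup) for token in tokens)


if __name__ == "__main__":
    # Invented tokens, as they might be extracted from tweets.
    tokens = ["Virat Kohli", "Kohli", "virat kohli", "Ashwin", "R Ashwin", "Kohli"]
    for name, mentions in count_by_standard_name(tokens, PLAYER_DICTIONARY).items():
        print(name, mentions)
```

With such a mapping in place, all spellings of a player collapse into a single row, which is the result we want the custom dictionary to produce inside HANA.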

Now let’s see how this can be achieved:

Create a custom HANA Text Analysis configuration file

In HANA studio, create a workspace, then create and share a project.

Under this project, create a new file with the extension “hdbtextconfig”.

Copy into this file the entire contents of one of the predefined configurations delivered by SAP. They are located in the HANA repository package “sap.hana.ta.config”.

For this exercise, let’s copy the contents of the configuration file “EXTRACTION_CORE_VOICEOFCUSTOMER”.

Reference: “Creating a Text Analysis Configuration”, Section 10.1.3.2.1 of the HANA Developer Guide SPS07: http://help.sap.com/hana/SAP_HANA_Developer_Guide_en.pdf

In the next document, I will show how to create a custom dictionary and add it to the custom configuration we just created, so that the analysis of the Twitter data no longer returns repeated names when the SQL query is run.


      2 Comments
      Former Member

      Please let me know if you have any open questions; I will be happy to answer them.

      Former Member

      Hi Rahul

      Your post is really helpful. I'm a beginner at HANA. I have a URL that returns JSON containing all my LinkedIn connections, and I want to load this data into my HANA tables. I noticed that you imported an Excel file, but that is not possible for me. Can you suggest any steps?