
It’s that time of year again when great things happen. During SAP TechEd in Las Vegas, SAP introduced free HANA access for developers. See Rudi Leibbrandt introducing it here: Free access to SAP HANA, with SAP HANA, express edition.

In short, the HANA Express Edition gives easy access to the most powerful platform on the market (HANA naturally 😉 ) on your own system of choice. In my case, an Intel NUC.

I have been test driving the HXE for a couple of hours now and would like to give you some idea of how to get content onto it and do some wicked analysis. I have reused parts of previous blogs I’ve written to see how the HXE compares to my previous AWS system, and I can say it works beautifully!

Some of my blogs I used as a reference:

Introducing the very first web to HANA extractor using import.io

The not so fuzzy “Fuzzy Search”

Importing data

I am a huge fan of import.io. You can basically use it to mine the web and load the results into HANA. For this blog I decided to scrape the UI5 forum on stackoverflow to see what kind of questions are asked most often or, to be more specific, which words are used most often in the questions. Just a small test to check the import possibilities of HXE and its text analysis capabilities.

Knipsel.PNG

First I created an extractor and told it to go 47 pages deep:

Knipsel.PNG

The great part of import.io is that you can even get the results back as OData which in turn you can use to load HANA with. For this blog I am going for a quick load using the import part via Eclipse:

Knipsel.PNG

After a split second the data is loaded:

Knipsel.PNG
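If you would rather script the load than click through the Eclipse import wizard, something along these lines could work. This is only a minimal sketch: it assumes the import.io result was exported as a CSV file whose header contains an "Excerpt description" column, and the helper names rows_from_csv and insert_statement are made up for this example. The table and column names are the ones used later in this blog. The resulting statement and rows could then be passed to the executemany call of a Python DB-API driver for HANA, assuming that driver accepts ? placeholders.

```python
import csv
import io

# Table and column names come from this blog; the CSV layout is an assumption.
TABLE = '"SYSTEM"."Stack2"'
COLUMN = "Excerpt description"


def rows_from_csv(csv_text, column=COLUMN):
    """Return one 1-tuple per non-empty value found in `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        value = (record.get(column) or "").strip()
        if value:
            rows.append((value,))
    return rows


def insert_statement(table=TABLE, column=COLUMN):
    """Build a parameterized INSERT suitable for a DB-API executemany call."""
    return f'INSERT INTO {table} ("{column}") VALUES (?)'


if __name__ == "__main__":
    # Made-up sample text for illustration only; not data from the forum.
    sample = 'Excerpt description\n"How do I bind a table?"\n""\n"Routing problem"\n'
    print(insert_statement())
    print(rows_from_csv(sample))
```

Empty excerpts are skipped so that the text analysis does not have to deal with blank rows.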

Doing some analysis

Using HXE’s text analysis capabilities will give insight into the type of questions that are asked in the UI5 forum.

CREATE FULLTEXT INDEX "nameofindex" ON "SYSTEM"."Stack2"("Excerpt description")
TEXT ANALYSIS ON
CONFIGURATION 'LINGANALYSIS_FULL';

LINGANALYSIS_FULL will go through the questions posted in the forum and break them up into types of words (noun, pronoun, verb, etc.).

The above command will create an index on my loaded table and create a shadow table with the analysis:

Knipsel.PNG

So what are the most frequently used words?

SELECT TOP 20 TA_TOKEN, COUNT(*) AS COUNT
FROM "SYSTEM"."$TA_nameofindex"
WHERE TA_TYPE LIKE 'noun'
GROUP BY TA_TOKEN
ORDER BY COUNT DESC;

Knipsel.PNG
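For readers without a HANA system at hand, the logic of that query can be sketched in a few lines of Python. This is an illustration only, not how HANA executes it: the function name top_tokens is made up, and the rows are assumed to be (TA_TOKEN, TA_TYPE) pairs as they would come out of the $TA_ table.

```python
from collections import Counter


def top_tokens(rows, ta_type="noun", limit=20):
    """Count tokens of one type and return the `limit` most frequent.

    `rows` is an iterable of (TA_TOKEN, TA_TYPE) pairs.
    """
    counts = Counter(token for token, kind in rows if kind == ta_type)
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up rows for illustration only; not data from the forum.
    sample = [
        ("table", "noun"),
        ("bind", "verb"),
        ("table", "noun"),
        ("model", "noun"),
    ]
    print(top_tokens(sample, limit=2))
```

One difference worth knowing: Counter.most_common keeps tied tokens in the order they were first seen, while the SQL query leaves the order of ties undefined.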

Impressive performance on the Intel NUC Skull Canyon!


So based on this quick analysis, I guess SAP will need to put some extra effort into the UI5 documentation to make sure the awesome guys answering the questions will have an easier time ;-).

I’m looking at you, UI5 rockstars:

Knipsel.PNG

See you at UI5CON in Eindhoven 🙂 !

Stay tuned for more, and in case you did not catch the vibe from this blog: HXE rocks!

Tx for reading!

Ronald.


3 Comments


  1. Lars Breddemann

    Nice to see the “skull” becoming more popular 😎 !

    I’m also still very happy with the performance and the level of direct access compared to a “proper HANA server” you get from it.

    One thing though: I believe you meant to write “web scraping” and not “web scaping”. (hmm… thinking about it, it sounds like a portmanteau of web and escape, so maybe that is what you wanted to write after all… 😀 )

