Using Data Services Text Data Processing with SAP Lumira to Analyze Tweets

Time for some introspection: what have I been tweeting about for the last 3+ years?  At first I suspected that the ASUG hashtag would be the one that I tweeted most.  I was wrong.

I first downloaded an archive of my tweets.  Then I used Data Services to read the 140 characters of unstructured text in each tweet and find out which hashtags I have tweeted the most.

/wp-content/uploads/2013/08/1textdataprocessing_256725.jpg

You can see on the left I am reading in my tweets, and using the Base Entity Transform from Data Services.

/wp-content/uploads/2013/08/2filterbysocialmedia_256726.jpg

There are several options in the Text Data Processing transform: person, product, organization, etc.  For now I am just selecting social media.

After the Data Services batch job runs, it writes the output to a file, which I take to SAP Lumira.
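
If you don't have Data Services handy, the hashtag extraction step can be roughly approximated in plain Python.  This is only a sketch and not what the Text Data Processing transform does internally: the regular expression is a simplified hashtag pattern, and the sample tweets and the function name count_hashtags are made up for illustration.

```python
import re
from collections import Counter

# Simplified hashtag pattern (an assumption, not the Data Services rule set):
# a '#' followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags(tweets):
    """Count each hashtag exactly as written, so case variants stay separate."""
    counts = Counter()
    for text in tweets:
        counts.update(HASHTAG_RE.findall(text))
    return counts


# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Great webcast today #SCN #SAP",
    "Register for the conference #ASUG #SAP",
    "New blog posted #SCN",
]

hashtag_counts = count_hashtags(sample_tweets)

# Print hashtags from most to least frequent.
for tag, n in hashtag_counts.most_common():
    print(tag, n)
```

The resulting counts could be written to a CSV file and loaded into SAP Lumira in the same way as the batch job output.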

/wp-content/uploads/2013/08/3heatmap_256727.jpg

Above is the initial heat map.  It’s hard to read, so next I filter down to the hashtags with the highest number of tweets.

/wp-content/uploads/2013/08/4filteringheatmap_256728.jpg

So you can see that ASUG is not my most tweeted hashtag, as I had thought.  It’s SCN, then SAP, then ASUG.

/wp-content/uploads/2013/08/5casematters_256729.jpg

When I look at the grid view, you can see that case matters with the hashtags: note #SAPTechEd, SAPTechED (wrong), and sapteched.
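
One way to deal with the case issue before charting is to merge variants that differ only by letter case.  The sketch below is a plain Python illustration, not a Data Services or Lumira feature; the function name merge_case_variants and the counts in the sample are hypothetical.

```python
from collections import Counter


def merge_case_variants(counts):
    """Merge hashtag counts that differ only by case or a leading '#'.

    Each merged group is labeled with its most frequent original spelling.
    """
    totals = Counter()
    spellings = {}
    for tag, n in counts.items():
        key = tag.lstrip("#").casefold()  # case-insensitive grouping key
        totals[key] += n
        spellings.setdefault(key, Counter())[tag] += n
    return {
        spellings[key].most_common(1)[0][0]: total
        for key, total in totals.items()
    }


# Hypothetical counts for the three spellings seen in the grid view.
raw_counts = {"#SAPTechEd": 5, "SAPTechED": 2, "sapteched": 1}

merged_counts = merge_case_variants(raw_counts)
print(merged_counts)
```

Labeling each group with its most common spelling keeps the chart readable while still folding the stray variants into one total.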

/wp-content/uploads/2013/08/6tagcloud_256733.jpg

Finally, what’s text data processing without a tag cloud?

Related links:

Get Your Data Geek Badges Today! by Anita Yuen

Sentiment Analysis using SAP BusinessObjects Data Services by Louis de Gouveia

SAP TechEd Las Vegas | October 21–25, 2013 | ASUG Pre-Conference Seminars – BI4.1 hands-on includes SAP BusinessObjects Lumira

Text Data Processing Transform – Going from Unstructured to Structured

