Former Member

SAP PA and Twitter – Building Wordcloud

While doing some research on sentiment and text analysis for one of my projects, I came across a really nice presentation.

http://www.slideshare.net/jeffreybreen/r-by-example-mining-twitter-for

Inspired by the above, I decided to do some text mining and sentiment analysis in SAP PA on Twitter tweets, using the twitteR package of R.

I have created a multi-part blog series that shows the different things we can do using SAP PA, R and Twitter.

This first blog explains how to get the Twitter data into SAP PA and build a word cloud from a text corpus.

Scenario:

I downloaded some public opinion data on car manufacturers from the NCSI-UK website.

http://ncsiuk.com/index.php?option=com_content&task=view&id=18&Itemid=33

The data covers 2009-2013. My intention was simply to see what the public sentiment towards these manufacturers is on the social networking site Twitter, and to build a probable score for 2014 based on a sample of tweets. I loaded the data into SAP PA. First I build a word cloud for some of the car hashtags and plot a graph of the number of retweets. In the next blog posts I will do sentiment analysis and emotion classification on this data.

Before I start, let me make it clear that this is only sample data, analyzed purely for the purpose of learning. It is not meant to target or influence any brand. The outputs and analysis shown here are based on opinion and should not be considered facts.


Step 1: Setting up the Twitter account and API for the handshake with R

Please refer to this step-by-step document to set up the Twitter API and the settings required to call the API and get tweet data into R.

Setting up Twitter API to work with R

Step 2: Getting the tweet data into SAP PA and building a word cloud

Now we need to create a custom R component that gets the data into SAP PA, creates a text corpus and displays it as a word cloud. I have used the tm_map function that comes with the tm package to prepare the text corpus for the word cloud. The individual commands are explained in the comments. I have used the wordcloud package to generate the word cloud.

The code below lists the steps needed to get the desired output. The configuration settings are shown in the screenshots below.

mymain <- function(mydata, mytweet, mytweetnum)
{
  ## Load the necessary packages
  library(twitteR)
  library(RJSONIO)
  library(bitops)
  library(RCurl)
  library(wordcloud)
  library(tm)
  library(SnowballC)

  ## Enable Internet access (Windows only; needed on older R versions)
  setInternet2(TRUE)

  ## Load the environment containing the Twitter credential data (saved in Step 1)
  load('C:/Users/bimehta/Documents/twitter authentication.Rdata')

  ## Establish the handshake with Twitter
  ## (newer twitteR releases replace registerTwitterOAuth() with setup_twitter_oauth())
  registerTwitterOAuth(credential)
  options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

  ## Get the tweet list from Twitter (based on the parameters entered by the user)
  tweetList <- searchTwitter(mytweet, n = mytweetnum)

  ## Create the text corpus
  r_stats_text <- sapply(tweetList, function(x) x$getText())
  r_stats_text_corpus <- Corpus(VectorSource(r_stats_text))

  ## Clean up the Twitter text: lowercase it, remove punctuation and English
  ## stop words like "the" and "an", then stem the remaining words.
  ## content_transformer() is required around base functions in tm 0.6 and later.
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, content_transformer(tolower))
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, removePunctuation)
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, removeWords, stopwords("english"))
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, stemDocument)

  ## Build and print the word cloud
  out2 <- wordcloud(r_stats_text_corpus, scale = c(10, 1), random.order = FALSE, rot.per = 0.35, use.r.layout = FALSE, colors = "blue")
  print(out2)

  ## Convert the tweet list to a data frame and return the Twitter data in a table
  tweets.df <- twListToDF(tweetList)
  result <- as.data.frame(cbind(tweets.df$text, tweets.df$created, tweets.df$statusSource, tweets.df$retweetCount))
  return(list(out = result))
}
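
To make the clean-up steps more concrete, here is a minimal sketch in base R of the word counts that a word cloud is drawn from. It is only an illustration: the sample tweets and the short stop-word list are made up, stemming is left out, and the component above uses tm and wordcloud rather than this code.

```r
# Hypothetical sample tweets standing in for the searchTwitter() output
tweets <- c("Love the new Audi, great drive!",
            "The Audi service was great",
            "Not a great day for the drive")

# A tiny illustrative stop-word list; tm's stopwords("english") is much longer
stop_words <- c("the", "a", "an", "for", "was", "not")

clean <- tolower(tweets)                   # lowercase everything
clean <- gsub("[[:punct:]]", "", clean)    # strip punctuation
words <- unlist(strsplit(clean, "\\s+"))   # split into words on whitespace
words <- words[nchar(words) > 0 & !(words %in% stop_words)]

# Word frequencies, most frequent first; these drive the word sizes in a cloud
freq <- sort(table(words), decreasing = TRUE)
print(head(freq, 3))
```

The more often a word appears across the tweets, the larger it is drawn in the word cloud.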

Configuration Setting:

Pic6.PNG

Pic7.PNG

Running the Algorithm and getting the output:

Pic8.PNG

The output table (the "created" column comes back as character):

Pic9.PNG
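
If you need the creation time as a real date-time again, a sketch like the one below may help. It assumes the character column holds seconds since 1970-01-01, which is what cbind() produces in plain R when it drops the POSIXct class; the sample data is made up, and the values displayed in SAP PA may look different.

```r
# Hypothetical timestamps standing in for the tweets' "created" field
created <- as.POSIXct(c("2014-05-13 16:53:20", "2014-05-14 09:00:00"), tz = "UTC")
text <- c("first tweet", "second tweet")

# cbind() builds a character matrix, so the date-time class is lost
tbl <- as.data.frame(cbind(text, created), stringsAsFactors = FALSE)

# Convert the text back to a date-time (seconds since 1970-01-01 UTC)
tbl$created <- as.POSIXct(as.numeric(tbl$created), origin = "1970-01-01", tz = "UTC")
print(tbl)
```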

Visualizations:

Pic10.PNG

Pic11.PNG

The general public opinion in the word cloud seems positive. However, we will do a detailed sentiment analysis of the various brands in our source file and plot a heat map based on the 2013 survey findings in my next blog. This will help us see whether current public sentiment is in line with the survey findings.

To be continued in Sentiment Analysis.


      17 Comments
      Former Member

      Hello,

      I am trying to implement a similar model.

      Since OAuthFactory is no longer supported, I tried using

      setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)


      When I run the model, this particular code crashes the "Expert Analytics" application.


      Do you have any idea how to handle this?


      Thank you.

      Tamilnesan

      Antoine CHABERT

      Hi,

      What is the error you get in the PA logs?

      Logs are here: C:\Users\<Your Windows User>\AppData\Local\Temp\sappa\logs

      Thanks & regards

      Antoine

      Former Member

      Hi Antoine,

      Attaching the log snippet.

      TraceLog.png

      This happens only with SAP PA, the same script runs properly in R Studio.

      Looking forward to your suggestions.

      Thank you.

      Antoine CHABERT

      Thanks.

      Can you also provide us with the R script and the error message you see? You can share them via this post or send them to my email address at sap.com.

      Cheers,

      Antoine

      Antoine CHABERT

      Hi,

      Sorry for the delay. Is it a possibility that you can share the LUMS file as well? I tried to reproduce this morning with no luck - as I am missing the original data I guess.

      Thanks & regards

      Antoine

      M. van Foeken

      Any luck resolving this issue? I'm bumping into the same...

      With kind regards,

      Martijn

      Former Member

      For me the code runs perfectly in R Studio, but when I try it in SAP PA expert mode version 2.3, the tool closes automatically. I tried it several times, but the tool exits whenever I run the custom component.

      Antoine CHABERT

      Ranajay Sit is it a possibility that you can share the LUMS file with me? I tried to reproduce this morning with no luck - as I am missing the original data I guess.

      Former Member

      Hi Antoine,

      I am using the below dataset.

      Cars Baseline 2009 2010 2011 2012 2013 Sentiments
      Volkswagen 79 80 79 81 82 80 7
      Toyota 80 83 80 82 83 82 23
      Audi 83 84 83 11
      Nissan 82 82 8
      BMW 80 78 78 81 82 82 9
      All Others 77 76 81 81 81 81 0
      Ford 75 73 76 76 79 77 26
      Peugeot 73 73 77 77 77 78 13
      Vauxhall 75 75 75 77 77 78 129
      Renault 73 72 80 74 76 75 1

      Regards

      Ranajay

      Antoine CHABERT

      Bingo - I reproduced it. It took me a while to configure the missing R packages (twitteR and the likes) but I am now done. Thanks for the hint!

      Antoine

      Former Member

      So did you figure out the problem?

      Antoine CHABERT

      Not yet, I passed this to our engineering team for further investigation.

      Former Member

      Thanks, appreciate it 🙂

      Former Member

      Hi,

      I tried to run this in PA, but it seems Twitter has stopped registering new apps. I could not generate the twitter_authentication.rdata file required by the component in the main post (because I lack the Consumer Key and Consumer Secret). Does anyone know how to deal with this?

      Cheers

      Wei

      Ian Henry

      Hi Wei,

      You can still get the consumer key and consumer secret from Twitter.  I have recently done this for the HANA Data Provisioning agent which also requires the same parameters.  To find these you can look at the HANA Academy video, SAP HANA Academy - Smart Data Integration/Quality : Twitter Replication Pt 1 of 3 [SPS09] - YouTube at 2:40.

      Former Member

      Hi Ian,

      Twitter requires a mobile phone to be registered to the account before a Twitter App can be set up, but it fails to register my mobile phone. It seems it has been like this for a while. Anyway, the problem is solved. I got a script that has the Twitter auth info. Thank you for the information.

      Thanks,

      Wei

      Yuvan Ramez

      Hi, I have a question here.

      You did the text mining in R with the #WT20 data. But when it comes to SAP EA, why are the results "Car Satisfaction reviews"?