While doing some research on sentiment and text analysis for one of my projects, I came across a really nice presentation:

http://www.slideshare.net/jeffreybreen/r-by-example-mining-twitter-for

Inspired by the above, I decided to try some text mining and sentiment analysis in SAP PA on Twitter tweets, using the twitteR package for R.

I have created a multi-part blog series in which we look at the different things we can do using SAP PA, R and Twitter.

This first blog explains how to get the Twitter data into SAP PA and build a word-cloud from a text corpus.

Scenario:

I downloaded some public opinion data on car manufacturers from the NCSI-UK website.

http://ncsiuk.com/index.php?option=com_content&task=view&id=18&Itemid=33

The data covers 2009 to 2013. My intention was simply to see what the public sentiment towards these manufacturers is on the social networking site Twitter, and to build a probable score for 2014 based on a sample of tweets. I loaded the data into SAP PA. First, I build a word-cloud for some of the cars' hashtags and plot a graph of the number of retweets. In the next blog posts I will cover sentiment analysis and emotion classification of this data.

Before I start, let me make it clear that this is only sample data, analyzed purely for the purpose of learning. It is not meant to target or influence any brand. The outputs and analysis shown here are based on opinion only and should not be considered facts.


Step1: Setting up the Twitter account and API for handshake with R

Please refer to this step-by-step document to set up the Twitter API and the settings required to call the API and get tweet data into R.

Setting up Twitter API to work with R

Step2: Getting the tweet data in SAP PA and building a word-cloud.

Now we need to create a custom R component that gets the data into SAP PA, creates a text corpus and displays it as a word-cloud. I have used the tm_map function that comes with the tm package to prepare the text corpus for the word-cloud. The various commands are explained in the comments. I have used the wordcloud package to generate the word-cloud.

The code below lists the steps needed to get the desired output. The configuration settings are shown in the screenshots below.

mymain <- function(mydata, mytweet, mytweetnum)
{
  ## Load the necessary packages
  library(twitteR)
  library(RJSONIO)
  library(bitops)
  library(RCurl)
  library(wordcloud)
  library(tm)
  library(SnowballC)

  ## Enable Internet access (Windows-only call, needed on older R versions)
  setInternet2(TRUE)

  ## Load the environment containing the Twitter credential data (saved in Step 1)
  load('C:/Users/bimehta/Documents/twitter authentication.Rdata')

  ## Establish the handshake between R and Twitter
  ## (newer twitteR releases use setup_twitter_oauth() instead of registerTwitterOAuth())
  registerTwitterOAuth(credential)
  options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

  ## Get the tweet list from Twitter (based on the parameters entered by the user)
  tweetList <- searchTwitter(mytweet, n = mytweetnum)

  ## Create the text corpus
  r_stats_text <- sapply(tweetList, function(x) x$getText())
  r_stats_text_corpus <- Corpus(VectorSource(r_stats_text))

  ## Clean up the tweet text: lower-case it, remove punctuation and
  ## English stop words like "the", "an", then stem the remaining words.
  ## content_transformer() is needed around tolower with tm 0.6 and later.
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, content_transformer(tolower))
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, removePunctuation)
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, removeWords, stopwords("english"))
  r_stats_text_corpus <- tm_map(r_stats_text_corpus, stemDocument)

  ## Build and print the word-cloud
  out2 <- wordcloud(r_stats_text_corpus, scale = c(10, 1), random.order = FALSE, rot.per = 0.35, use.r.layout = FALSE, colors = "blue")
  print(out2)

  ## Convert the tweet list to a data frame and return the Twitter data in a table
  tweet_df <- twListToDF(tweetList)
  result <- as.data.frame(cbind(tweet_df$text, tweet_df$created, tweet_df$statusSource, tweet_df$retweetCount))
  return(list(out = result))
}
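The cleanup steps in the component above can be tried without a Twitter connection. The following is a minimal sketch in base R only, not the tm-based code used in the component. It assumes a few made-up tweets (sample_tweets) and a tiny hand-picked stop word list (stop_words); both names and all values are illustrative and do not come from the original data. It lower-cases the text, strips punctuation, drops stop words and counts the remaining words, which is roughly the input a word-cloud needs.

```r
# Hypothetical sample tweets; stand-ins for the text returned by searchTwitter().
sample_tweets <- c(
  "Love the new Audi, the ride is great!",
  "The Audi service was great.",
  "Not sure about the new Audi..."
)

# Tiny illustrative stop word list (tm's stopwords("english") is much longer).
stop_words <- c("the", "is", "was", "an", "a", "about")

# Hypothetical helper: returns the cleaned words of all texts as one vector.
clean_words <- function(texts, stops) {
  texts <- tolower(texts)                            # normalise case
  texts <- gsub("[[:punct:]]", " ", texts)           # replace punctuation with spaces
  words <- unlist(strsplit(texts, "[[:space:]]+"))   # split on whitespace
  words <- words[nzchar(words)]                      # drop empty strings
  words[!words %in% stops]                           # drop stop words
}

# Word frequencies, most frequent first.
word_freq <- sort(table(clean_words(sample_tweets, stop_words)), decreasing = TRUE)
print(word_freq)
```

Stemming is left out here on purpose; in the component it is handled by stemDocument from the tm package.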

Configuration Setting:

Pic6.PNG

Pic7.PNG

Running the Algorithm and getting the output:

Pic8.PNG

The output table (the created-on column is of type char):

Pic9.PNG
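The created-on column most likely comes back as char because cbind() on vectors of mixed type builds a single character matrix before as.data.frame() is applied. Here is a small sketch of that effect using made-up values (the variable names mirror the tweet fields, but the data is hypothetical), together with two possible ways to keep or restore proper types. This is an illustration under those assumptions, not a change that has been verified inside SAP PA.

```r
# Hypothetical tweet fields; not real Twitter data.
text         <- c("first tweet", "second tweet")
created      <- as.POSIXct(c("2014-03-01 10:15:00", "2014-03-02 18:30:00"), tz = "UTC")
retweetCount <- c(3, 12)

# cbind() coerces everything to one type, so every column ends up as text.
via_cbind <- as.data.frame(cbind(text, created, retweetCount), stringsAsFactors = FALSE)
print(sapply(via_cbind, class))

# Option 1: build the data frame directly so each column keeps its own type.
typed <- data.frame(text, created, retweetCount, stringsAsFactors = FALSE)
print(sapply(typed, function(col) class(col)[1]))

# Option 2: convert an already-character timestamp column back to date-time.
chr_created <- format(created, "%Y-%m-%d %H:%M:%S")
parsed <- as.POSIXct(chr_created, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
```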

Visualizations:

Pic10.PNG

Pic11.PNG
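For readers who want to reproduce a retweet chart directly in R rather than in the SAP PA visualization pane, here is a minimal sketch. It assumes a hypothetical data frame (tweets) with one row per tweet, a hashtag column and a retweetCount column; the values are invented for illustration and are not the results shown in the screenshots.

```r
# Hypothetical retweet counts per tweet, tagged by the hashtag searched for.
tweets <- data.frame(
  hashtag      = c("#Audi", "#Audi", "#BMW", "#Toyota", "#BMW", "#Toyota"),
  retweetCount = c(4, 10, 7, 2, 1, 5),
  stringsAsFactors = FALSE
)

# Sum retweets per hashtag and order from most to least retweeted.
totals <- tapply(tweets$retweetCount, tweets$hashtag, sum)
retweets_by_tag <- totals[order(totals, decreasing = TRUE)]

# Draw a simple bar chart of the totals.
barplot(retweets_by_tag, main = "Retweets per hashtag (sample data)",
        ylab = "Total retweets", col = "blue")
```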

The general public opinion in the word-cloud seems positive. However, we will do a detailed sentiment analysis of the various brands in our source file and plot a heat map based on the 2013 survey findings in my next blog. This will help us see whether current public sentiment is in line with the survey findings.

To be continued in Sentiment Analysis.

17 Comments

  1. Tamilnesan Ganesan

    Hello,

    I am trying to implement a similar model.

    Since OAuthFactory is no longer supported, I tried using

    setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)


    When I run the model, this particular code crashes the “Expert Analytics” application.


    Do you have any idea how to handle this?


    Thank you.

    Tamilnesan

        1. Antoine CHABERT

          Thanks.

          Can you also provide us with the R script and the error message you see? You can share them via this post or send them to my email address at sap.com.

          Cheers,

          Antoine

        2. Antoine CHABERT

          Hi,

          Sorry for the delay. Is it a possibility that you can share the LUMS file as well? I tried to reproduce this morning with no luck – as I am missing the original data I guess.

          Thanks & regards

          Antoine

      1. Ranajay Sit

        For me the code runs perfectly in RStudio, but when I try it in SAP PA expert mode, version 2.3, the tool closes automatically. I tried several times, but the tool exits whenever I run the custom component.

          1. Ranajay Sit

            Hi Antoine,

            I am using the below dataset.

            Cars Baseline 2009 2010 2011 2012 2013 Sentiments
            Volkswagen 79 80 79 81 82 80 7
            Toyota 80 83 80 82 83 82 23
            Audi 83 84 83 11
            Nissan 82 82 8
            BMW 80 78 78 81 82 82 9
            All Others 77 76 81 81 81 81 0
            Ford 75 73 76 76 79 77 26
            Peugeot 73 73 77 77 77 78 13
            Vauxhall 75 75 75 77 77 78 129
            Renault 73 72 80 74 76 75 1

            Regards

            Ranajay

            1. Antoine CHABERT

              Bingo – I reproduced it. It took me a while to configure the missing R packages (twitteR and the likes) but I am now done. Thanks for the hint!

              Antoine

  2. Wei Tai

    Hi,

    I tried to run this in PA, but it seems Twitter has stopped registering new apps. I could not generate the twitter_authentication.rdata file required by the component in the main post (because I lack a Consumer Key and Consumer Secret). Does anyone know how to deal with this?

    Cheers

    Wei

    1. Wei Tai

      Hi Ian,

      Twitter requires a mobile phone to be registered to the account before a Twitter app can be set up, but it fails to register my mobile phone. It seems to have been like this for a while. Anyway, the problem is solved: I got a script that has the Twitter auth info. Thank you for the information.

      Thanks,

      Wei

  3. Yuvan Ramez

    Hi, I have a question here.

    You did the text mining in R with the #WT20 data. But when it comes to SAP EA, why are the results "Car Satisfaction reviews"?
