Sentiment analysis of comments made by customers or users can reveal much more than what they like or dislike. The key is to approach it the way a good questionnaire or survey would: besides the obvious inferences from individual ratings or responses, it also looks for less obvious ones by analyzing the responses across multiple attributes. Gallup polls work this way, and can predict results for a large constituency quite accurately by sampling only a minuscule cross-section of people.


This post continues my earlier thoughts on feedback and sentiment analysis, shared in Voice of Customer, Sentiment analysis & Feedback service, where I used the text analysis features available on HANA to do sentiment analysis on customer feedback collected from external web sites.

That was done by giving users an HTML form on a web site to enter their ratings and free-text reviews, and then using a jQuery/AJAX call to a service on HANA Cloud Platform to collect these ratings and feedback.

But how about gathering feedback from social media and analyzing what customers are talking about there, for example on Twitter?

Twitter provides a REST API to search message feeds or tweets, but it requires applications to authenticate every request with OAuth. This means an access token must be present in each request. As a consequence, anyone who wants to use the Twitter search service has to create a developer account on http://dev.twitter.com. The consumer key and access token, together with their secrets, are then assigned to that account.

With these tokens, the REST API can search all messages or only the messages of a specific person or account.

Although there are several ways to invoke the Twitter API, client-side jQuery/AJAX calls do not work. Workarounds may exist, but they are not recommended, because the consumer and access token secrets would be exposed in the browser. A server-side call to the Twitter API is the recommended, secure method.

Searching for Twitter messages can be implemented in many ways. One option is a Java application, built in the Eclipse IDE, that uses the open source Java library available at http://twitter4j.org. However, I chose to use a PHP script on my local server. There are many open source libraries and widgets that work with OAuth; I used the twitteroauth PHP library. It can be downloaded from GitHub, and its source can be easily included in a PHP script.

For example, suppose you want to search for all tweets containing the word “architecture” and limit the search to return only 10 tweets.

A snippet of PHP code that does this search is shared below.


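<?php
// Load the twitteroauth library (this path is an assumption; adjust it to where you unpacked the library)
require_once 'twitteroauth/twitteroauth.php';

$consumer = "put your consumer key";
$consumersecret = "put your consumer secret";
$accesstoken = "put your access token";
$accesstokensecret = "put your access token secret";
// Authenticated client for the Twitter REST API
$twitter = new TwitterOAuth($consumer, $consumersecret, $accesstoken, $accesstokensecret);
// Search for up to 10 tweets containing the word "architecture"
$tweets = $twitter->get('https://api.twitter.com/1.1/search/tweets.json?q=architecture&result_type=mixed&count=10');
?>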

However, if you want to make the search more flexible and search for any word in Twitter messages, below is the PHP snippet that I used.


<html>
  <head>
    <meta charset="UTF-8" />
    <title>Twitter Search using PHP</title>
  </head>
  <body>
    <form action="" method="post">
      <label> Search : <input type="text" name="keyword" /> </label>
    </form>
<?php
  // $twitter is the TwitterOAuth object created in the previous snippet
  if (isset($_POST['keyword'])) {
    // URL-encode the keyword so spaces, '#' and '&' do not break the query string
    $tweets = $twitter->get('https://api.twitter.com/1.1/search/tweets.json?q='.urlencode($_POST['keyword']).'&lang=en&result_type=mixed&count=50');
    if (isset($tweets->statuses) && is_array($tweets->statuses)) {
      foreach ($tweets->statuses as $tweet) {
        // Escape the output so tweet content cannot inject HTML into the page
        echo htmlspecialchars($tweet->user->screen_name).'<br>';
        echo htmlspecialchars($tweet->created_at).'<br>';
        echo htmlspecialchars($tweet->text).'<br>';
        echo '*************************************************************************************'.'<br>';
      }
    }
  }
?>
  </body>
</html>

We can now search and print the Twitter feeds using a PHP script. However, an important question remains unanswered: how do we save these messages into HANA? Again, several approaches can be taken. In the past I have used cURL libraries to invoke REST web services, with the POST method for updates. This time around I thought of using ODBC.

I am using HANA on a cloud platform, so I need to open a secure DB tunnel to get access to the database from my client machine. To open the tunnel, I used another tool called Neo, which is available in the HANA Cloud Platform tools download area. When the tunnel is open, it provides a temporary host, port, user and password to access the HANA database on the cloud instance. These values need to be supplied in the PHP script that uses ODBC to connect to the HANA database. Similar values are needed to connect to an on-premise HANA database in the same fashion. A PHP code snippet to connect to the HANA database is provided below.


<?php
$driver = 'HDBODBC32'; // I am using the 32-bit driver
$host = "localhost:port"; // host and port reported by the DB tunnel
// Default name of your HANA instance
$db_name = "HDB";
$username = "Your user name";
$password = "Your password";
// Connect now
$conn = odbc_connect("Driver=$driver;ServerNode=$host;Database=$db_name;", $username, $password, SQL_CUR_USE_ODBC);
if (!$conn)
{
    // Connection failed
    echo "ODBC error code: " . odbc_error() . ". Message: " . odbc_errormsg();
}
else
{
    echo "Connection success";
}
?>

Once the ODBC connection is established, all SQL operations such as SELECT and INSERT statements can be run by building the SQL command in a string variable $sql and calling the function odbc_exec() to execute it. For example: $result = odbc_exec($conn, $sql);

When all operations on the HANA database are finished, the final step is to close the database connection with odbc_close(). For example: odbc_close($conn);

The HANA table that stores the tweet data is named “MyTweetsTable”, with columns for tweet information such as the tweet user, created date, message (TEXT) and the hashtag values in the message.
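Storing the search results in this table could look roughly like the sketch below. It is only an illustration: the column names TWEET_USER, CREATED_AT and HASHTAGS and the helper names tweetToRow() and saveTweets() are assumptions, not the actual table definition. It relies on the tweet objects carrying their hashtags under entities->hashtags and uses the standard odbc_prepare() and odbc_execute() functions so that tweet text is never concatenated into the SQL string.

```php
<?php
// Flatten one tweet object (as returned by the search API) into the four
// values stored per row. The column order here is an assumption.
function tweetToRow($tweet)
{
    // Twitter sends dates like "Wed Aug 27 13:08:45 +0000 2008".
    $created = DateTime::createFromFormat('D M d H:i:s O Y', $tweet->created_at);

    // Hashtags arrive as objects under entities->hashtags, each with a "text" field.
    $tags = array();
    if (isset($tweet->entities->hashtags)) {
        foreach ($tweet->entities->hashtags as $tag) {
            $tags[] = $tag->text;
        }
    }

    return array(
        $tweet->user->screen_name,
        $created ? $created->format('Y-m-d H:i:s') : null,
        $tweet->text,
        implode(',', $tags),
    );
}

// Insert every tweet through one prepared statement.
// Returns the number of rows that were written.
function saveTweets($conn, array $statuses)
{
    // Column names are hypothetical; adjust them to the real table definition.
    $sql = 'INSERT INTO "MyTweetsTable" ("TWEET_USER", "CREATED_AT", "TEXT", "HASHTAGS") VALUES (?, ?, ?, ?)';
    $stmt = odbc_prepare($conn, $sql);
    if (!$stmt) {
        return 0;
    }
    $saved = 0;
    foreach ($statuses as $tweet) {
        if (odbc_execute($stmt, tweetToRow($tweet))) {
            $saved++;
        }
    }
    return $saved;
}
```

With the connection and search result from the earlier snippets, a call such as saveTweets($conn, $tweets->statuses) would write one row per tweet.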

The HANA SQL statement CREATE FULLTEXT INDEX "TWEETS_myindex" ON "MyTweetsTable"("TEXT") TEXT ANALYSIS ON CONFIGURATION 'EXTRACTION_CORE'; creates another HANA table that holds the results of the text analysis performed on the tweet messages. HANA names this table after the index with a $TA_ prefix, here "$TA_TWEETS_myindex". Another configuration option for text analysis is LINGANALYSIS_FULL.

This text analysis parses each message into grammatical components such as nouns and adjectives. The resulting words, called tokens, are then grouped into different categories. For example, the word SAP is a noun in the category Organization, and CNN is categorized as Media.

Since the text analysis table also keeps track of how many times a token or word has occurred, it is now fairly easy, as I discussed in my previous post, to find out which word is trending most.

For example, I searched Twitter messages for the word “SAP” and found that “SAP Package Technologies” was being talked about most in the tweets. The text analysis index table can be queried in many different ways to get more insights into the data.
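One way to find the most-mentioned tokens is to count rows per token in the generated table. The sketch below assumes the index is named TWEETS_myindex as above and uses the standard TA_TOKEN and TA_TYPE columns of the $TA_ table; the helper names topTokensSql() and fetchTopTokens() are made up for illustration. Note the single-quoted PHP strings, which stop PHP from treating $TA_ as a variable.

```php
<?php
// Build a query that counts how often each token appears in the $TA_ table
// that HANA generates for a full-text index.
function topTokensSql($indexName, $limit = 10)
{
    return sprintf(
        'SELECT "TA_TOKEN", "TA_TYPE", COUNT(*) AS "MENTIONS" '
        . 'FROM "$TA_%s" '
        . 'GROUP BY "TA_TOKEN", "TA_TYPE" '
        . 'ORDER BY COUNT(*) DESC LIMIT %d',
        $indexName,
        (int) $limit
    );
}

// Run the query on an open ODBC connection and return the rows as arrays.
function fetchTopTokens($conn, $indexName, $limit = 10)
{
    $rows = array();
    $result = odbc_exec($conn, topTokensSql($indexName, $limit));
    while ($result && ($row = odbc_fetch_array($result))) {
        $rows[] = $row;
    }
    return $rows;
}
```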


twitter feed text analysis.gif

Similar to the text analysis index table, another index table can be built to do sentiment analysis on these tweet messages. This time, use the configuration option EXTRACTION_CORE_VOICEOFCUSTOMER instead of the EXTRACTION_CORE used in the previous example.

Note that HANA will not allow another text index to be created on a column that already has one. Since the TEXT column in MyTweetsTable already had a text analysis index, I needed to create a copy of the original table storing the tweet messages. This is easily done by reusing the SQL CREATE statement that created the original table, this time with a different table name. The data can then be copied with another SQL statement: INSERT INTO "name of the target table_copy" SELECT * FROM "name of the source table_original";
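The copy-and-index step could be scripted over the same ODBC connection, roughly as sketched below. The table name MyTweetsTable_VOC, the index name TWEETS_vocindex and the helper names vocSetupSql() and runStatements() are hypothetical, and the sketch assumes the copy table has already been created with the same columns as the original.

```php
<?php
// Return the SQL statements that copy the tweets into a second table and
// index that copy with the voice-of-customer configuration.
// The copy table must already exist with the same columns as the source.
function vocSetupSql($source, $copy, $indexName)
{
    return array(
        sprintf('INSERT INTO "%s" SELECT * FROM "%s"', $copy, $source),
        sprintf(
            'CREATE FULLTEXT INDEX "%s" ON "%s"("TEXT") '
            . "TEXT ANALYSIS ON CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER'",
            $indexName,
            $copy
        ),
    );
}

// Execute the statements in order and stop at the first failure.
function runStatements($conn, array $statements)
{
    foreach ($statements as $sql) {
        if (!odbc_exec($conn, $sql)) {
            return false;
        }
    }
    return true;
}
```

For example, runStatements($conn, vocSetupSql('MyTweetsTable', 'MyTweetsTable_VOC', 'TWEETS_vocindex')) would copy the data and then build the sentiment index on the copy.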

In the new voice of customer index table, the tokens are analyzed much further. The categories not only indicate whether tokens are organization names, product names or social media names, but also classify tokens as positive or negative sentiments of weak or strong intensity. The analysis can even identify messages that ask for information and categorize them as request types.

Questions such as what was mentioned most in these tweets, and what kind of sentiments they expressed, are easy to answer from the voice of customer index table. See the snapshot images below from queries on the table.

VOC - what was mentioned most.JPG

Notice that most sentiments were weak positive comments.

VOC - What Sentiments used  in tweets.JPG
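A sentiment breakdown like the one in the snapshot above could come from a query along these lines. This is a sketch under assumptions: the index name is passed in by the caller, the helper name sentimentBreakdownSql() is made up, and the query relies on the sentiment categories in TA_TYPE ending in "Sentiment" (for example WeakPositiveSentiment or StrongNegativeSentiment).

```php
<?php
// Build a query that counts tokens per sentiment category in the $TA_ table
// of a voice-of-customer index. "%%" becomes a literal "%" for the LIKE pattern.
function sentimentBreakdownSql($indexName)
{
    return sprintf(
        'SELECT "TA_TYPE", COUNT(*) AS "MENTIONS" FROM "$TA_%s" '
        . "WHERE \"TA_TYPE\" LIKE '%%Sentiment' "
        . 'GROUP BY "TA_TYPE" ORDER BY COUNT(*) DESC',
        $indexName
    );
}
```

The resulting string can be passed to odbc_exec($conn, ...) like any other SQL statement.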

If we want to explore which words were used in strong positive sentiment tweets, those can also be easily identified. The results are shared below. The same analysis can be done for negative sentiment comments as well.

VOC - What made Strong Postive Sentiments used  in tweets.JPG

Also, since younger people tend to use emoticons more, the data can be queried for the emoticons used in these tweets to get a general feeling of happy, smiley and sad faces. The image below shows the result.

VOC - What emoticons were used in tweets.JPG

One of the categories into which words are grouped is PRODUCT.

The image below shows which products were referred to in the tweets.

VOC - What PRODUCTS  were named in tweets.JPG

If one wants to know which organizations were talked about most in these Twitter messages, a query on the indexed table can provide the result.

VOC - What organizations were used in tweets.JPG
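The per-category results above (strong positive words, emoticons, products, organizations) all follow the same pattern: filter the index table by one TA_TYPE value and count the tokens. A minimal sketch, with the helper name tokensByTypeSql() invented for illustration; the exact type names should be taken from the TA_TYPE values that actually appear in your index table.

```php
<?php
// Build a query that lists the most frequent tokens of one category
// (for example 'PRODUCT') in the $TA_ table of a text index.
function tokensByTypeSql($indexName, $type, $limit = 10)
{
    // Double any single quote so the type name is a safe SQL string literal.
    $safeType = str_replace("'", "''", $type);
    return sprintf(
        'SELECT "TA_TOKEN", COUNT(*) AS "MENTIONS" FROM "$TA_%s" '
        . "WHERE \"TA_TYPE\" = '%s' "
        . 'GROUP BY "TA_TOKEN" ORDER BY COUNT(*) DESC LIMIT %d',
        $indexName,
        $safeType,
        (int) $limit
    );
}
```

Calling it with a different type value gives the equivalent query for any other category.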

Twitter messages carry a lot of other useful information, such as how many times a message has been retweeted or liked, user demographics and location. By capturing all these attributes from the Twitter API, a much more meaningful sentiment analysis can be done.

So the next time I watch TV and broadcasters report which team people in different parts of the world are cheering for most, it will no longer be a mystery to me!

Analysis of this Twitter (or any social media) data needs to be done carefully to get insights that are useful to the business. Needless to say, it is highly dependent on the quality of the data. Perhaps it makes more sense to monitor customer sentiment during certain time windows, such as the State of the Union address by the president, or during special business events, for example whenever a new product is launched or a commercial is aired.

Technology is here to support the business, and the goal is to make it work in real use cases, where positive sentiment increases over time (or negative sentiment declines).
