Real-time sentiment rating of movies on SAP HANA (part 6) – wrap-up and looking ahead
Hello and welcome back to the last blog post of the movie sentiment rating series. Now it’s time to recap what we have done and discuss some future work. First of all, I want to share the series and the project with you, as I promised in the first blog. Here you go.
After you import the XS project, the project structure should look similar to mine. For better understanding, I made some notes on the structure, so you can easily find these design-time objects in the corresponding blog posts.
OK. Now let’s take a look at what we’ve done. We’ve rebuilt the movie sentiment rating app using pure SAP HANA XS and text analysis. We now have two real-time apps using the same dataset, one for desktop based on SINA and the other for mobile using sap.m. During the implementation, we used some new XS features released in recent SPSs, such as CDS and outbound connectivity in SPS06 and job scheduling in SPS07. These features did not exist when I built the first version of this smart app in May 2013. With the rapid development of SAP HANA XS, we can keep improving our SAP HANA native apps. Let’s have a quick look at each of these two apps.
Movie sentiment rating based on SINA
– Search everything in real-time. With the full text index, you can search movie titles, release dates, studios, sentiments, tokens and so on. For example, you can search for tweets about a specific movie that were posted this evening from an iPhone.
– Analyze everything in real-time. You can analyze how many people like or dislike a certain movie, or the overall sentiment of movies from a certain studio. You can even track how the sentiment changes over time. There are lots of things you can analyze.
Movie sentiment rating using sap.m
– As a moviegoer, you can see the sentiment rating of newly released movies based on tweet analysis and pick your favorite movie in real-time. You can also infer a movie's popularity from its number of mentions, and you can select a release date range to see more movies.
– You can jump into the details to see what people are saying about this movie right now.
Compared with Rotten Tomatoes?
Out of curiosity, I compared the ratings of the top 10 popular new releases this week in my app with Rotten Tomatoes. The following results are as of 2014-11-23 04:00:00 UTC.
| Movie title | Movie sentiment rating | Rotten Tomatoes |
|---|---|---|
| The Hunger Games: Mockingjay – Part 1 | 7.8 | 8.2 |
| A Girl Walks Home Alone at Night | 9.4 | 8.2 |
| Pulp: a Film About Life, Death & Supermarkets | 7.3 | 8 |
I watched The Hunger Games: Mockingjay – Part 1 today and have just posted a tweet. As you can see, both apps show my tweet immediately: with the combination of XSJS outbound connectivity and job scheduling, we can crawl tweets in real-time!
Now let’s jump into the looking ahead part and discuss some future work on the movie sentiment rating app. There are still lots of things we can improve in this app. I’ve listed several of them below.
1. Genres, cast and directors
Do you remember that in the first blog, besides the basic movie metadata, we also created three additional tables, “Genres”, “AbridgedCast” and “AbridgedDirectors”? And in the second blog, we searched for this data and inserted it into these tables as well. So why not display genres, cast and directors in the UI? To show this info, we can create hierarchies in an attribute view.
2. Geospatial analysis
The “Tweets” table has two columns which we did not use in our app: longitude and latitude. Our smart app actually requests the location info of tweets and stores it whenever the user provides it. You can find the logic in the second blog. With the location info of tweets, we can do some interesting geospatial analysis. For example, we can analyze the sentiment of movies in each state, or, for a certain movie or genre, compare the sentiment between the eastern US and the western US. Moreover, if we had age and gender info, we could analyze even more interesting things.
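To make the east/west comparison a bit more concrete, here is a minimal sketch in Python, not the app's actual implementation. It assumes each tweet has already been reduced to a `(longitude, latitude, score)` tuple, where `score` is a hypothetical numeric sentiment (1 for positive, -1 for negative). The dividing longitude of -100 degrees is just an arbitrary choice for illustration.

```python
def sentiment_by_region(tweets, split_longitude=-100.0):
    """Average sentiment score for tweets west and east of a longitude.

    tweets: iterable of (longitude, latitude, score) tuples. Longitude and
    latitude are None when the user did not share a location.
    split_longitude: hypothetical east/west boundary (an assumption).
    """
    # region -> [sum of scores, number of tweets]
    sums = {"west": [0.0, 0], "east": [0.0, 0]}
    for lon, lat, score in tweets:
        if lon is None or lat is None:
            continue  # skip tweets without location info
        region = "west" if lon < split_longitude else "east"
        sums[region][0] += score
        sums[region][1] += 1
    # A region without any located tweets gets None instead of an average.
    return {r: (total / n if n else None) for r, (total, n) in sums.items()}


sample = [
    (-122.4, 37.8, 1),   # west, positive
    (-118.2, 34.1, -1),  # west, negative
    (-74.0, 40.7, 1),    # east, positive
    (None, None, 1),     # no location, ignored
]
print(sentiment_by_region(sample))
```

In the real app this kind of aggregation would more naturally live in the database, but the grouping logic stays the same.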
3. Search more tweets for popular movies
Because of Twitter's API rate limits, it’s impossible for us to search all related tweets. So, the current approach is to search for tweets about newly released movies every two minutes. All movies are treated equally, no matter how popular they are. This is where the problem arises: for popular movies, there may be thousands of tweets in just two minutes, which we cannot fetch in one go, so we will miss some. On the other hand, for some unpopular movies there may be no related tweet in a whole hour, so we do not need to search for them so frequently; searching every two minutes is a waste. I think it would be better to create a dynamic mechanism which searches tweets for popular movies more frequently and tweets for unpopular movies less frequently.
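One possible shape for such a dynamic mechanism is sketched below in Python. This is only an illustration under assumptions, not what the app does today: the function name, the `page_limit` of tweets returned per search, and the minimum and maximum intervals are all hypothetical. The idea is to halve a movie's polling interval when a search comes back full (we probably missed tweets) and double it when a search comes back empty.

```python
def next_interval(tweets_found, page_limit, current, min_s=30, max_s=3600):
    """Return the next polling interval in seconds for one movie.

    tweets_found: number of tweets returned by the last search
    page_limit:   assumed maximum number of tweets one search can return
    current:      interval (seconds) used for the last search
    min_s, max_s: hypothetical bounds that keep the interval sane
    """
    if tweets_found >= page_limit:
        # The result page was full, so tweets were probably missed: poll sooner.
        proposed = current // 2
    elif tweets_found == 0:
        # Nothing new: back off and save API calls for busier movies.
        proposed = current * 2
    else:
        proposed = current
    # Clamp the proposal to the allowed range.
    return max(min_s, min(max_s, proposed))


# Starting from the current two-minute schedule (120 seconds):
print(next_interval(100, 100, 120))  # full page
print(next_interval(0, 100, 120))    # no tweets
print(next_interval(5, 100, 120))    # a few tweets
```

The lower bound still has to respect the rate limits across all movies combined, so in practice `min_s` would depend on how many movies are being tracked.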
4. Average sentiment of tweets
Currently, if a tweet contains several sentiments, they are counted as several mentions. So, sometimes you will see several consecutive mentions with the same username and the same time. They actually belong to the same tweet. To avoid this, we can calculate an average sentiment for each tweet, something like 1 positive + 1 negative = 1 neutral…
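Here is a small Python sketch of that averaging idea. It is an assumption-laden illustration: the three labels and their numeric scores in `SCORES` are hypothetical simplifications, and the real text analysis output has more fine-grained sentiment types that would need to be mapped first.

```python
from collections import defaultdict

# Hypothetical numeric scores for simplified sentiment labels (an assumption).
SCORES = {"positive": 1, "neutral": 0, "negative": -1}


def tweet_sentiment(mentions):
    """Collapse several sentiment mentions into one label per tweet.

    mentions: iterable of (tweet_id, label) pairs, one per detected sentiment.
    Returns a dict mapping each tweet_id to a single averaged label.
    """
    scores_by_tweet = defaultdict(list)
    for tweet_id, label in mentions:
        scores_by_tweet[tweet_id].append(SCORES[label])

    result = {}
    for tweet_id, scores in scores_by_tweet.items():
        mean = sum(scores) / len(scores)
        if mean > 0:
            result[tweet_id] = "positive"
        elif mean < 0:
            result[tweet_id] = "negative"
        else:
            result[tweet_id] = "neutral"
    return result


mentions = [
    (1, "positive"), (1, "negative"),  # 1 positive + 1 negative = 1 neutral
    (2, "positive"), (2, "positive"),
    (3, "negative"),
]
print(tweet_sentiment(mentions))
```

With this, each tweet shows up as exactly one mention, so the number of mentions also becomes a fairer measure of popularity.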
5. Local time
Right now the movie app uses UTC time everywhere, which is not a good user experience, especially in the mobile app. Users would rather see times in their local time zone. So, this is also something we can improve.
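The conversion itself is simple. The Python sketch below assumes the timestamps are stored as UTC strings in the `YYYY-MM-DD HH:MM:SS` format used earlier in this post, and that the client can tell us its UTC offset in minutes (`offset_minutes`, a hypothetical parameter that is positive east of UTC). A fixed offset ignores daylight saving changes, so treat this as a rough illustration only.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(utc_text, offset_minutes):
    """Convert a 'YYYY-MM-DD HH:MM:SS' UTC string to the user's local time.

    offset_minutes: the user's offset from UTC in minutes, positive east of
    UTC (a hypothetical input, e.g. reported by the client device).
    """
    # Parse the stored value and mark it explicitly as UTC.
    utc_dt = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    local_tz = timezone(timedelta(minutes=offset_minutes))
    return utc_dt.astimezone(local_tz).strftime("%Y-%m-%d %H:%M:%S")


# The comparison timestamp from this post, shown for a user at UTC-8:
print(utc_to_local("2014-11-23 04:00:00", -480))  # 2014-11-22 20:00:00
```

In the UI5 apps the conversion would more likely happen on the client when the value is formatted for display, so the database can keep storing UTC.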
Further resources about the movie app
That’s it! I hope you enjoyed the movie sentiment rating app as much as we did. Why not use it to pick your next favorite movie? Have fun! 😆