We’ve already seen on SCN how to gauge sentiment on movies using Twitter data. An additional measure of sentiment can be gained from analysing song lyrics. This has already been done, to an extent, for rap music lyrics by the team over at the Rap Genius website. You may well ask what possible value there could be in analysing rap song lyrics, and granted this is more of a novelty than a serious application, but as you’ll see there is some interesting data there.
I’ll say up front that this is not something that is currently implemented using HANA (to the best of my knowledge), but I believe it is worth highlighting its existence to the HANA community because:
- when you look at what’s been done, it could very well be done in HANA, which would then allow for some very easy improvements (more on this below)
- Rap Genius offer an API, so it would be very easy to integrate their data into a HANA or SAPUI5 application (more on this below)
- it is a novel application, and may be of interest to those working with unstructured text analysis
How it Works
The above chart shows word frequency in songs since 1988 and suggests that Nike are currently regarded as being “cooler” than other brands, at least in the rap music world. A big change clearly happened around 2000 when Nike rose to prominence, although they’ve slipped a little over the last few years, losing ground to Adidas. As another example, the chart below covers Twitter, Facebook, Google, Instagram and MySpace.
The above shows the decline of MySpace and the rise of Instagram. We can also group words together, as in this chart:
In the rap music world, it seems Bentley is now the most desirable luxury car, and Lexus has lost its prime position.
A couple of limitations are apparent:
- No actual sentiment is present in the analysis. The graphs just measure occurrences as a proportion of the overall population of words; they don’t distinguish between someone singing about Nike being good and someone singing about Nike being bad.
- No drill-down to source data. There is no way to see which songs used the words entered, so you just have to take the figures on faith.
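To make the first limitation concrete, here is a minimal Python sketch of the kind of measure the graphs appear to show: each term’s occurrences as a share of all words in a given year. This is not Rap Genius’s actual code; the function name `term_share_by_year`, the `(year, lyrics)` input format and the sample lines are all invented for illustration.

```python
import re
from collections import Counter, defaultdict


def term_share_by_year(songs, terms):
    """Return {year: {term: share}}, where share is the term's
    occurrences divided by the total number of words for that year.

    `songs` is an iterable of (year, lyrics) pairs (a hypothetical format).
    """
    terms = [t.lower() for t in terms]
    totals = Counter()            # total word count per year
    hits = defaultdict(Counter)   # per-year count of each tracked term

    for year, lyrics in songs:
        # Crude tokeniser: lower-case runs of letters, digits and apostrophes.
        words = re.findall(r"[a-z0-9']+", lyrics.lower())
        totals[year] += len(words)
        counts = Counter(words)
        for term in terms:
            hits[year][term] += counts[term]

    return {
        year: {
            t: (hits[year][t] / totals[year]) if totals[year] else 0.0
            for t in terms
        }
        for year in totals
    }


# Made-up sample lines, not real lyrics.
songs = [
    (1999, "adidas shoes adidas laces"),
    (2005, "nike on my feet nike"),
]
shares = term_share_by_year(songs, ["nike", "adidas"])
```

Note that a line praising a brand and a line mocking it contribute exactly the same count here, which is why a measure like this says nothing about sentiment on its own.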
To be fair to the people behind Rap Genius, this stats tool is not their core product. Their site is all about annotating lyrics, and this stats tool is an offshoot of having collected the source lyric data.
A Hypothetical HANA Implementation
The “crown jewels” of the Rap Genius web site are, of course, the source song lyrics, most or all of which have copyright implications. On the Rap Genius site, the source lyric data is crowdsourced: people are encouraged to upload lyrics in order to earn points on the site. Assuming we could somehow get the source lyric data, we could implement this in HANA and get “voice of customer” sentiment data pretty much for free, just as was done in the movie recommendation engine. Once we had the lyric data in a table, we could make this call to derive sentiment data:
CREATE FULLTEXT INDEX … CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER'
Using the Rap Genius API
Rap Genius offer an API and it is very easy to use. To get the data underlying the first graph shown above (the one containing the data on trainers), you would call this URL: http://rap-stats-api.herokuapp.com/adidas-converse-nike-puma-reebok which returns some simple JSON data: