Text Analysis of IPL Match using Twitter Data (Part 1)
Hi All,
IPL is going on in India these days, and I thought of doing a text analysis of the match between Chennai Super Kings and Royal Challengers Bangalore.
1. Data which will be analyzed – Twitter
2. Source – Excel
Note – Twitter data can be downloaded directly, and a few blogs explain how to do it. For this blog I already have the Twitter data in an Excel file, which I will use for the analysis.
ℹ M S Dhoni – Captain of Chennai Super Kings & ℹ Virat Kohli – Captain of Royal Challengers Bangalore
First of all, we need to create a schema.
Please use the SQL highlighted below to create the schema. I am using “IPL” as the schema name for this blog.
Once the SQL in the screenshot above is executed, a schema named “IPL” is created.
Refer to the screenshot below, which shows that the schema has been created.
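As a minimal sketch of this step, the statement could look like the following. The second query is only one possible way to confirm the result; it assumes your user is allowed to read the SYS.SCHEMAS system view.

```sql
-- Create the schema used throughout this blog.
-- Unquoted identifiers are folded to upper case, so this creates "IPL".
CREATE SCHEMA IPL;

-- Optional check: the new schema should be listed in the system view.
SELECT SCHEMA_NAME, SCHEMA_OWNER
FROM SYS.SCHEMAS
WHERE SCHEMA_NAME = 'IPL';
```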
In this blog we upload the Twitter data from Excel.
The Excel file resides on my computer, and I will upload it for this blog.
The Excel file has the fields ‘tweetId’, ‘memberId’, ‘tweetDate’ and ‘tweetContent’.
Refer to the screenshot below.
The screenshot below shows that the table is created and holds the data imported from Excel.
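The import creates the table for us, but for reference, a hand-written equivalent might look roughly like the sketch below. The data types, the lengths and the choice of ‘tweetId’ as primary key are my assumptions and are not taken from the import; only the schema, table and column names come from this blog. Note that a table needs a primary key before text analysis can be enabled on it.

```sql
-- Hypothetical DDL for the imported table (all types and lengths are assumptions).
-- The table name contains a space, so it has to be quoted.
CREATE COLUMN TABLE "IPL"."IPL_Match_Twitter Data" (
    "tweetId"      BIGINT PRIMARY KEY,  -- assumed unique per tweet
    "memberId"     BIGINT,              -- assumed numeric member id
    "tweetDate"    TIMESTAMP,           -- assumed; may arrive as text from Excel
    "tweetContent" NVARCHAR(500)        -- the text that will be analysed
);

-- Quick check that the rows arrived.
SELECT COUNT(*) AS "TWEET_COUNT"
FROM "IPL"."IPL_Match_Twitter Data";
```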
We need to create an index on the table we just created.
Table Name – IPL_Match_Twitter Data
The SQL below needs to be executed; it will create an index named ‘ipl’ in schema ‘IPL’.
Configuration Used – ‘EXTRACTION_CORE_VOICEOFCUSTOMER’
Use the SQL as it is; just change the index name to whatever you prefer, and adjust the schema/table name and the name of the column on which the index needs to be created.
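A sketch of such a statement is shown here. It assumes the text to analyse sits in the ‘tweetContent’ column; the index name is written unquoted, so it is folded to upper case.

```sql
-- Full-text index with text analysis switched on.
-- The configuration selects the voice-of-customer (sentiment) extraction.
CREATE FULLTEXT INDEX ipl
    ON "IPL"."IPL_Match_Twitter Data" ("tweetContent")
    CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER'
    TEXT ANALYSIS ON;
```

The text analysis results are written to a generated table called $TA_ followed by the index name.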
$TA_IPL in the screenshot below shows that the index table has been created.
If we check the table contents, a screen like the one below appears.
If we want to generate the SQL for an existing table, right-click on the index table – Generate – Select Statement.
Benefit – We avoid writing the SELECT statement for the table by hand, and can simply remove the column names that are not needed for the analysis.
I have removed a few column names and kept only the ones below. If I execute the SQL, the table content is displayed only for the selected columns.
This is the screen we get if we execute the above SQL.
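For illustration, a trimmed-down statement could look like this. Which columns to keep is a matter of choice; the selection below is only my assumption of a useful subset, and ‘tweetId’ appears on the assumption that it is the primary key of the source table.

```sql
-- Generated SELECT, reduced to the columns of interest.
-- "$TA_IPL" must be quoted because the name starts with "$".
SELECT "tweetId",        -- key of the source table (assumed primary key)
       "TA_TOKEN",       -- the extracted word or phrase
       "TA_TYPE",        -- entity or sentiment type of the token
       "TA_NORMALIZED"   -- normalized form of the token
FROM "IPL"."$TA_IPL";
```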
🙂 Let's play with SQL now 🙂
The query below on the index table fetches the count of each TA_TOKEN in descending order.
Benefit – It shows which extracted words or phrases are discussed the most.
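A minimal version of such a query, assuming the index table is "IPL"."$TA_IPL" as created earlier:

```sql
-- Count how often each extracted token occurs, most frequent first.
SELECT "TA_TOKEN",
       COUNT(*) AS "TOKEN_COUNT"
FROM "IPL"."$TA_IPL"
GROUP BY "TA_TOKEN"
ORDER BY "TOKEN_COUNT" DESC;
```

Adding a LIMIT clause (for example LIMIT 20) keeps the result to the top tokens only.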
ℹ The TA_TYPE column holds the type assigned to each extracted token, including the sentiment types.
In the screenshot below we can see a few sentiment types, such as
4. StrongNegativeSentiment
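To see how the tweets are spread across the sentiment types, one option is a query like the sketch below. It assumes that every sentiment type name ends in ‘Sentiment’, as StrongNegativeSentiment does.

```sql
-- Number of extracted tokens per sentiment type.
SELECT "TA_TYPE",
       COUNT(*) AS "MENTIONS"
FROM "IPL"."$TA_IPL"
WHERE "TA_TYPE" LIKE '%Sentiment'   -- assumption: sentiment types share this suffix
GROUP BY "TA_TYPE"
ORDER BY "MENTIONS" DESC;
```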
If we run the query below with TA_TYPE equal to Person, we notice that Virat Kohli and MS Dhoni are referred to in several different ways, such as
2. Virat Kohli
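A sketch of a query for this check follows. The exact spelling of the type value is an assumption on my part (I use upper-case 'PERSON' here); use whatever value the TA_TYPE column shows in your system.

```sql
-- List every distinct way a person is mentioned, with its frequency.
SELECT "TA_TOKEN",
       COUNT(*) AS "MENTIONS"
FROM "IPL"."$TA_IPL"
WHERE "TA_TYPE" = 'PERSON'          -- assumption: adjust to the value in your data
GROUP BY "TA_TOKEN"
ORDER BY "MENTIONS" DESC;
```

Each spelling of a name comes back as its own row, which is why the same player appears more than once in the result.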
➕ Note – How to solve the above issue will be explained in the next blog, which covers ‘Custom Dictionary’ and ‘Custom Configuration’.
We will use the TA_NORMALIZED column for this and write more SQL ➕
In the next blogs I will cover
1. Custom Dictionary ➕
2. Custom Configuration ➕
3. Creating Analytical View
4. Analyzing data using SAP Lumira