Challenges in analyzing Big data for social networks
According to the Global Web Index, as of January 2016 there were about 3.4 billion internet users in the world, of whom about 2.3 billion were active social media users. The ubiquitous presence of the internet has made our social interactions easy and convenient, and social collaboration over the internet has increased as well. No wonder it makes us smarter. As we interact and conduct transactions online, we generate an immense amount of data. This is big data, which we will define shortly. Personalization is the new trend with the advent of Web 3.0: popular social networks like Facebook have begun personalization campaigns and deliver tailored post feeds based on users’ interests and past behavior. Smart indeed.
The situation is much the same in enterprise social networks, where data is also personalized and concerns about user security and privacy loom large.
Social media channels generate enormous amounts of data. Gleaning information from that data and finding meaning in it will be a key differentiator in competition. It could also lead to a rise in productivity and innovation, according to McKinsey, which published a popular report, “Big data: The next frontier for innovation, competition, and productivity”, some time ago.
The question is how to make better use of this big data. What are the challenges in analyzing it in the realm of social networks? These questions remain pressing even five years after the term ‘Big data’ came into wide use. The way Big data is analyzed for insights today is quite different from how traditional managers would look at analytics. Before we explore the challenges, we will look at what big data is and what it covers.
Big data in Social networks and media
Big data is the huge volume of largely unstructured data generated by social networks from a wide array of sources: social media ‘Likes’, ‘Tweets’, blog posts, comments, videos and forum messages. To give a sense of scale, Google processes about 24 petabytes of data on any given day. Most of this data is not arranged in rows and columns. Big data also takes into account real-time information from RFIDs and all kinds of sensors. The social intelligence that can be gleaned from data of this magnitude is tremendous. Professionals and technologists working on enterprise and consumer social networks have begun to realize the potential value of the data generated through social conversations. Big data is sometimes also described as ‘bound’ data and ‘unbound’ data.
Social networks have a geometric growth pattern. Big data technologies and applications should be able to scale to this large volume of unstructured data and analyze it in real time, as it is generated. Social media conversations add a lot of context to information. Such context is an invaluable resource for sharing knowledge and expertise, and it is key to a social network’s success. Analyzing millions of conversational messages every day is not an easy task. This is where traditional analytics can help mainstream Big data analysis; the two need to go hand in hand.
The challenges are many. They can be divided into two types.
First, the traditional thinking about data capture, processing and storage needs to change. This is an incremental, gradual process. Mining data at this scale requires technologies such as data mining grids and MapReduce infrastructure such as Hadoop. The technology might not be cost-effective, and the learning curve is steep. It also requires a non-linear, non-deterministic software architecture.
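To make the MapReduce idea a little more concrete, here is a minimal in-process sketch in Python that counts hashtags across a handful of messages. It is not Hadoop code and uses no Hadoop API; the function names (map_phase, shuffle, reduce_phase, count_hashtags) are hypothetical and chosen only for illustration. It only shows the shape of the map, shuffle and reduce steps that a real cluster would distribute across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(message: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (hashtag, 1) pair for every hashtag in one message."""
    for token in message.lower().split():
        token = token.strip(".,!?;:")  # drop trailing punctuation
        if token.startswith("#") and len(token) > 1:
            yield token, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group the emitted values by key.

    On a real cluster the framework does this between the two phases.
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def count_hashtags(messages: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on an in-memory collection."""
    pairs = (pair for message in messages for pair in map_phase(message))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling count_hashtags on a list of message strings returns a dictionary mapping each lower-cased hashtag to the number of times it appears. Because each message is mapped independently and each key is reduced independently, both steps can in principle be spread over many workers, which is the property that lets this style of processing scale.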
Second, the well-known adage ‘What we measure is what we manage’ applies here. As an organization, we need to know what we want to measure on a given day, and understand clearly what we are looking for. If we want to identify trend patterns for the day and predict where a social conversation might lead, then we need to know ‘when to ask the question’. This is quite difficult because events are dynamic.
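One simple way to think about ‘when to ask the question’ is to compare the latest time window of conversation against earlier windows and flag topics that have suddenly become more frequent. The Python sketch below does this under stated assumptions: messages have already been reduced to topic labels and bucketed into windows (for example, one list per hour), and the function name trending_topics and the thresholds ratio and min_count are hypothetical choices for illustration, not an established method.

```python
from collections import Counter
from typing import List, Sequence


def trending_topics(
    windows: Sequence[Sequence[str]],
    ratio: float = 2.0,
    min_count: int = 3,
) -> List[str]:
    """Flag topics that spike in the latest window.

    A topic is flagged when its count in the latest window is at least
    `min_count` and at least `ratio` times its baseline, where the baseline
    is the larger of its average count over the earlier windows and 1.
    """
    if len(windows) < 2:
        return []  # nothing to compare the latest window against

    *history, latest = windows
    latest_counts = Counter(latest)

    # Total mentions of each topic across all earlier windows.
    history_totals: Counter = Counter()
    for window in history:
        history_totals.update(window)

    trending = []
    for topic, count in latest_counts.items():
        baseline = max(history_totals[topic] / len(history), 1.0)
        if count >= min_count and count >= ratio * baseline:
            trending.append(topic)
    return sorted(trending)
```

The thresholds decide how early the question gets asked: a lower ratio flags trends sooner but produces more false alarms, while a higher one waits for a clearer signal. Since events are dynamic, the check would need to run again as each new window closes.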
With these challenges in Big data analysis, the years ahead hold plenty of productive work as well as room for new innovation. One view is that Big data applications will blur the line between consumer and organizational data and move towards a comprehensive social network footprint for the individual as well as the organization.