I recently moved from Tel-Aviv to Ra’anana, which is also where I work. Along with the drawbacks of leaving a vibrant city that never sleeps came the convenience of living about a five-minute drive from work.
This made an earlier start to my workday possible, before most people arrive at the office. In the first weeks, I arrived at my desk and did what millions of information workers do first thing in the morning: I checked my email, since all of my work comes to me in some form through emails and meetings. I was surprised to realize that although I had a full hour to myself, my mornings were no more efficient than on days when I arrived later. Thinking about this, it occurred to me that the reason was the emails I was going through. Sure, some are spot-on with regard to my daily tasks and goals, but most are far from relevant and just waste my time.
In the broader context, the content we all consume from many other channels is much the same – some of it is very relevant to what we need, but most is a waste of time.
What if we could get only the most relevant content tailored for our needs right now?
We at the SAP Portal group are asking ourselves exactly that in our enduring goal to provide users with the right content at the right time. But what does this actually mean?
Content can be anything.
The term content is very generic: basically any piece of meaningful data can be considered “content”, and consuming it serves a goal.
Content can be a document, a BI report, a person’s profile, an event, a business object, a webpage, a blog post, a video, a notification, an application, and this is clearly a partial list. The number of formats or types in which content is represented is very large and constantly increasing, both in the consumer and in the enterprise world. And this content is alive. It is far from being static pieces of data; people and processes are updating it, sharing it and collaborating over it all the time. Needless to say, the accumulated volume of all this content is huge and getting bigger by the second. The three V’s (big Volume, big Variety and big Velocity) are a common definition of Big Data, and they make getting the right content much, much harder.
What is the right content? What is the right time?
But what is the “right” content? This really depends on whom you ask and when. At the other end of the content consumption channel is a person. It’s not a role, it’s not a job title and not an anonymous silhouette that serves some kind of function, it’s a person. It’s an amazingly complex individual. This complexity drives the requirement for content: different people have different pasts, different interests, different social circles, different aspirations, different everything. This complexity is also far from static: meetings, tasks, presence, location and the time of year all affect the required content. Even the time of day affects content requirements – think of what you do in the morning compared to late in the evening.
Let’s take the familiar example of a sales exec: the content required ahead of a customer meeting, at a convention, before closing the quarter or right after it is very different in each case.
We can already deduce that when speaking about the right content at the right time, the right time means right now. Both the content itself and the user requirements are constantly changing on the time axis.
So on one end we have content with huge volume and variety that changes all the time, and on the other end we have an amazingly complex individual who needs only the right content: content that is most relevant to who this user is and what she or he needs right now; content that is most up-to-date; and, quite importantly, content that the user might not even be aware exists.
Today myApp tomorrow theWorld.
The concept of getting only the right content at the right time might sound futuristic, but there are applications where this requirement is already a reality. Any real-time monitoring application has this requirement: think command and control, fraud management, digital marketing, location-based retailing. In most of the current use cases, either the content, the context or both are fixed. For example, in command and control, a field officer’s context is always the event at hand, and it matters little who the specific person is. In fraud management, the content is always financial transactions.
As technology evolves, we believe that more and more business processes – both analytical and transactional – will demand real-time responsiveness, requiring both real-time content and real-time context. This is not just the Portal group’s vision, this is the core of SAP’s Innovation platform – SAP HANA as the Real-Time Data Platform.
So far we’ve discussed what we need. Now let’s talk about how we plan to do this. It is by no means a simple challenge.
The working title of the project aimed to tackle these requirements is Magnet. You will soon understand why.
At the core of the solution is the ability to contain all the content in one central structure, optimizing the retrieval of the right content objects. We aim to do this by maintaining a content graph that includes content object nodes with a set of properties, and arcs that represent relations between these nodes. Each content object references content that resides in external content sources. The content graph will have a few important characteristics:
· Dynamic – it is an entirely schema-less structure that enables modifications to objects, properties and relations.
· Semantic – relations between objects can either be modeled at design time for specific relationship types or be calculated by algorithms that analyze the objects’ semantics and modify relations based on this interpretation.
· Real-time – the content sources are constantly polled to detect and reflect changes in content as they occur.
· Self-maintained – the above capabilities make it possible for the content graph to stay up-to-date with minimal administrator intervention.
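To make the idea of a schema-less graph of content objects more concrete, here is a minimal sketch in Python. It is not how Magnet is implemented; the class names, the `source_ref` field and the `bi://` and `hr://` references are hypothetical and exist only for illustration. Nodes carry free-form properties and a reference to the external source, and arcs carry a relation type and a weight.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ContentNode:
    """A content object: free-form properties plus a pointer to the external source."""
    node_id: str
    source_ref: str  # hypothetical reference into an external content source
    properties: dict[str, Any] = field(default_factory=dict)


class ContentGraph:
    """Schema-less graph: any property can be added, arcs hold (relation, weight)."""

    def __init__(self) -> None:
        self.nodes: dict[str, ContentNode] = {}
        # arcs[a][b] = (relation_type, weight), stored in both directions
        self.arcs: dict[str, dict[str, tuple[str, float]]] = {}

    def upsert_node(self, node_id: str, source_ref: str, **properties: Any) -> ContentNode:
        """Create the node if needed, then merge in whatever properties were detected."""
        node = self.nodes.get(node_id)
        if node is None:
            node = ContentNode(node_id, source_ref)
            self.nodes[node_id] = node
        node.source_ref = source_ref
        node.properties.update(properties)
        return node

    def relate(self, a: str, b: str, relation: str, weight: float = 1.0) -> None:
        """Add or update an undirected relation between two existing nodes."""
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both nodes must exist before relating them")
        self.arcs.setdefault(a, {})[b] = (relation, weight)
        self.arcs.setdefault(b, {})[a] = (relation, weight)

    def neighbors(self, node_id: str) -> dict[str, tuple[str, float]]:
        return dict(self.arcs.get(node_id, {}))


graph = ContentGraph()
graph.upsert_node("report-1", "bi://reports/1", type="bi_report", topic="sales")
graph.upsert_node("profile-7", "hr://people/7", type="person")
graph.relate("report-1", "profile-7", "authored_by", weight=0.8)
# A later poll of the source can add a property without any schema change.
graph.upsert_node("report-1", "bi://reports/1", status="updated")
```

In this sketch a relation modeled at design time and one calculated by an algorithm would both end up as calls to `relate`; only the origin of the relation type and weight would differ.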
The semantic content graph
The content graph is a single, objective, smart index of all the content that could possibly be of interest. When a user logs on, we need to query this huge content graph for the most relevant content in the user’s current context.
We do this by maintaining a rich contextual profile for each user which, like the content itself, is updated in real time for the appropriate attributes. The contextual profile is kept up-to-date by connecting to context sources that are polled periodically, using a mechanism similar to the one feeding the content graph. Not all contextual attributes have the same change frequency. For example, today’s meetings and the current location are contextual attributes that need to be updated in real time, while other contextual attributes, like job title or quarterly quota, change less frequently.
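One way to picture attributes with different change frequencies is to give each attribute its own refresh interval. The sketch below is an assumption for illustration only: `ContextAttribute`, `ContextualProfile`, the `fetch` callables and the interval values are all hypothetical, and the time is passed in explicitly so the behavior is easy to follow.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ContextAttribute:
    name: str
    fetch: Callable[[], Any]   # hypothetical connector to a context source
    refresh_seconds: float     # how stale the value is allowed to become
    value: Any = None
    fetched_at: float = float("-inf")  # never fetched yet


class ContextualProfile:
    def __init__(self, attributes: list[ContextAttribute]) -> None:
        self.attributes = {a.name: a for a in attributes}

    def refresh(self, now: float) -> list[str]:
        """Re-fetch only the attributes whose value is older than their interval."""
        refreshed = []
        for attr in self.attributes.values():
            if now - attr.fetched_at >= attr.refresh_seconds:
                attr.value = attr.fetch()
                attr.fetched_at = now
                refreshed.append(attr.name)
        return refreshed

    def snapshot(self) -> dict[str, Any]:
        return {name: a.value for name, a in self.attributes.items()}


profile = ContextualProfile([
    ContextAttribute("location", lambda: "office", refresh_seconds=60),
    ContextAttribute("job_title", lambda: "sales exec", refresh_seconds=86_400),
])

first = profile.refresh(now=0)     # everything is fetched on the first pass
second = profile.refresh(now=120)  # only the fast-changing attribute is due
```

In a running system `now` would typically come from a clock such as `time.monotonic()`, and truly real-time attributes might be pushed by the source rather than polled.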
After obtaining an up-to-date contextual profile, the different attributes are translated into query parameters resulting in a quite complex contextual query that is finally run on the content graph. The rich contextual profile is actually the user’s Magnet; the different contextual attributes are “magnetic forces” that when activated upon the content graph, draw closer only the most relevant objects (“organic results”) and in turn also related objects with strong semantic relations to them.
After running the contextual query, the content graph is “charged” with the user’s magnetic forces and restructured to reflect the user’s own private, subjective content graph, showing content relevant to the current context.
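The magnet metaphor can be sketched as a two-step ranking. This is a deliberately simplified illustration, not the actual Magnet query or a HANA graph query: it assumes content objects are described by tags, that each active contextual attribute has already been translated into a weighted tag (a “force”), and that the `pull` factor and all sample data are made up.

```python
from __future__ import annotations


def magnet_query(
    objects: dict[str, set[str]],
    relations: dict[str, dict[str, float]],
    forces: dict[str, float],
    top_k: int = 3,
    pull: float = 0.5,
) -> list[tuple[str, float]]:
    """Rank objects attracted by the forces, plus objects strongly related to them.

    objects:   object id -> tags describing the object
    relations: object id -> {related id: relation strength in [0, 1]}
    forces:    tag -> weight of the active contextual attribute
    """
    # Direct attraction: sum of the weights of the forces the object matches.
    direct = {
        oid: sum(w for tag, w in forces.items() if tag in tags)
        for oid, tags in objects.items()
    }
    attracted = [oid for oid in direct if direct[oid] > 0]
    organic = sorted(attracted, key=lambda oid: (-direct[oid], oid))[:top_k]

    # Indirect attraction: organic results drag strongly related objects along.
    scores = {oid: direct[oid] for oid in organic}
    for oid in organic:
        for rel_id, strength in relations.get(oid, {}).items():
            pulled = pull * strength * direct[oid]
            if pulled > scores.get(rel_id, 0.0):
                scores[rel_id] = pulled
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


objects = {
    "acme-account": {"customer:acme", "sales"},
    "q3-forecast": {"sales", "quarter-end"},
    "acme-contact": {"person"},
    "cafeteria-menu": {"facilities"},
}
relations = {"acme-account": {"acme-contact": 0.75}}
forces = {"customer:acme": 1.0, "sales": 0.5}  # e.g. derived from today's meetings

ranked = magnet_query(objects, relations, forces)
```

Here the contact matches no force directly, yet it is pulled in through its strong relation to the account and ends up ahead of the weaker organic result, while the unrelated object never appears.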
The Magnet has the following characteristics:
- Real-time – as previously mentioned, all attributes are up-to-date.
- Transparent – all contextual attributes are transparent to the user; the user always knows what the system knows about her or him.
- Editable – the user can turn any contextual attribute on and off or specify refined settings; moreover some combinations of contextual attributes can be saved as “predefined” queries tailored for different aspects of a user (e.g.: my competitors, me as a manager, project X, etc.).
- Learning – previous usage is an important contextual attribute that is constantly evolving with every usage session; usage by members of the user’s social circle also affects this contextual attribute.
So how can this look?
The experience of presenting only relevant content from an almost infinite world of content that changes all the time is nearly as important as the algorithms that run under the hood.
Magnet user experience mockup
- Map – Since there is an effectively infinite world of content, we represent this by an infinite 2D map; this map has one single center which is the permanent location of the magnet.
- Content Tiles – all content objects are represented in the same way – by a content tile that shows an image, a title and description, elements representing previous usage on this content object and controls that enable interaction with this content object.
- Ranking representation – The relevance of a content object is represented by two attributes of the content tile – size and location. The larger and the closer the tile is to the magnet (in the center of the map), the more relevant it is to the user’s current context.
- Clusters and related content – The objects most closely related to the most relevant content objects are presented together with them. The related objects are shown as smaller tiles around the main content object tiles, forming clusters around certain topics.
- Magnet and magnetic forces – The Magnet is represented statically in the center of the map, but it also acts as a dynamic control; when clicking on it, it opens presenting the magnetic forces composing it; this serves both as a transparency mechanism and a means for the user to refine the current context.
The magnet – inside look
- Radar and real-time notifications – Since the world of content is alive and the user’s context may also change in real time, the content tiles on the map are alive too: they can move, appear, grow, shrink and disappear as content objects’ relevance rankings change. Critical changes in content objects that are less relevant and don’t make it into the frame around the magnet are represented in a separate control – the radar. When something interesting happens outside the currently visible frame, the radar blinks; it can then be enlarged to show where on the map this occurrence is taking place, and it also allows direct navigation to it.
The radar showing a trending content object outside of the current frame
- Navigation – just like any other standard map, the standard navigation patterns like center, zoom and pan apply here as well. Navigation is also supported from any content tile to the original content coming from an external source (i.e., a different application).
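The ranking representation and the radar described above can be illustrated with a small layout sketch. Everything here is an assumption made for illustration (the function names, the pixel values, the even angular spread and the square frame); the real mockup may place tiles quite differently. The only rules taken from the text are that more relevant tiles are larger and closer to the magnet at the center, and that the radar reports critical objects outside the visible frame.

```python
from __future__ import annotations

import math


def layout_tiles(
    ranked: list[tuple[str, float]],
    min_size: float = 40.0,
    max_size: float = 200.0,
    min_radius: float = 150.0,
    max_radius: float = 1000.0,
) -> dict[str, dict[str, float]]:
    """Map relevance scores to tile size and position around the magnet at (0, 0)."""
    if not ranked:
        return {}
    top = max(score for _, score in ranked)
    tiles = {}
    for i, (oid, score) in enumerate(ranked):
        rel = score / top if top > 0 else 0.0           # normalised relevance in [0, 1]
        size = min_size + rel * (max_size - min_size)   # more relevant -> larger
        radius = min_radius + (1.0 - rel) * (max_radius - min_radius)  # -> closer
        angle = 2 * math.pi * i / len(ranked)           # spread tiles around the centre
        tiles[oid] = {
            "size": size,
            "x": radius * math.cos(angle),
            "y": radius * math.sin(angle),
        }
    return tiles


def radar_blips(
    tiles: dict[str, dict[str, float]],
    critical_ids: list[str],
    frame_half_width: float,
) -> list[str]:
    """Return critical objects whose tile lies outside the visible square frame."""
    return [
        oid
        for oid in critical_ids
        if oid in tiles
        and max(abs(tiles[oid]["x"]), abs(tiles[oid]["y"])) > frame_half_width
    ]


tiles = layout_tiles([("a", 1.0), ("b", 0.5), ("c", 0.1)])
blips = radar_blips(tiles, critical_ids=["c", "b"], frame_half_width=600.0)
```

With these sample values, tile `a` is the largest and sits nearest the center, while `c` falls outside the frame and is the only object the radar would flag.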
I know, we all know, this is an ambitious project. It requires some very complex algorithms and very strong hardware. This is where SAP HANA comes in; with capabilities like a built-in graph engine and text analysis engine, very strong calculation engines, semantic capabilities and a fully integrated application layer and UI services, HANA is made for this challenge. We are pacing ourselves with the implementation, validating every component of the complete solution, and starting with well-defined use cases of limited scope, but we believe in the big picture.
Fast forward to sometime in the future – I’ll probably still arrive early at the office, but with Magnet, my productivity and that of many others will be boosted by getting only the right content at the right time.
For more information, please contact me: firstname.lastname@example.org