Federated Enterprise Search – The Time is Now
Have you ever wondered why enterprise search does not seem to work? Or why it takes so long to implement? Or even why some enterprise search is called federated, when it is blatantly searching one index?
Well, the answers to these questions are pretty simple.
Firstly: Enterprise Search does not seem to work, because it doesn’t. Well, not in the way that most users want it to. We have found that users actually want their search to find. Getting back hundreds of results that are similar to, or maybe sound like, the information you were searching for, and then having to sift through them to get what you really want, is all very well. But would it not be better to have the search actually find the information straight away? Most enterprise search works by matching keywords and then building a ranking based on those keywords, i.e. how many times they occur, in what positions, and so on. It does not rank on the content that you were actually looking for.
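To make the point concrete, here is a minimal sketch of ranking by keyword occurrences and positions. The scoring formula, the function name `keyword_score` and the sample documents are all made up for illustration; real products use more elaborate formulas, but the weakness is the same.

```python
import re

def keyword_score(query: str, text: str) -> float:
    """Score a document purely on keyword occurrences and positions."""
    words = re.findall(r"\w+", text.lower())
    score = 0.0
    for term in re.findall(r"\w+", query.lower()):
        positions = [i for i, w in enumerate(words) if w == term]
        if positions:
            # Every occurrence counts once; an early first hit earns a small bonus.
            score += len(positions) + 1.0 / (1 + positions[0])
    return score

# Hypothetical documents, invented for this example.
docs = {
    "a": "CNN News published the quarterly report",
    "b": "news news news about the canteen menu",
}
ranked = sorted(docs, key=lambda d: keyword_score("CNN News", docs[d]), reverse=True)
print(ranked)
```

Document "b" never mentions CNN, yet it ranks first simply because it repeats one of the keywords more often.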
Then there is the age-old question that generally comes up a few months after an enterprise search implementation has started: “When will the system be fully functional?” This is always a hard question to answer, because it requires in-depth knowledge of your document stores, your email systems and the content held within them, along with all of the other data sources that are being fed into the search index. Content has to be tagged, sorted into containers, and de-duplicated in the index before you can start to fully get the benefits of having hundreds of answers that may be to your liking.
So if it is a single index that is being searched, why is it called federated? Well, this is because federated is such a cool word. Sorry, not much of an answer, is it? It is because the single index is made up of multiple original indexes that the system re-indexes to give you “a better search experience,” or so they say.
OK, so let us move on: Federated Enterprise Search – The Time is Now. Yes, and I mean truly federated and truly enterprise. A number of advances over the past few years have made re-indexing your content unnecessary. To start with, the power of servers and the speed of networks have massively increased, meaning that the lag between each separate search returning is tiny compared to what it once was. Also, most enterprise systems now have open APIs that allow connections to them. An example of this is MS Exchange. There is no need to re-index its content, as it has a perfectly good index already in place and its APIs allow a very advanced search query (not just keyword) to be called.
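As a rough sketch of what truly federated means in practice, the query goes out to each system's own search at the same time and the answers are merged as they come back. The functions `search_mail` and `search_files` below are hypothetical stand-ins, not calls to Exchange or to any real API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for calls to each system's own search API.
def search_mail(query):
    return [{"source": "mail", "title": "CNN News digest"}]

def search_files(query):
    return [{"source": "files", "title": "CNN News clipping.docx"}]

def federated_search(query, sources):
    """Send the query to every source at once and merge the answers."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # pool.map keeps the results in the same order as `sources`.
        batches = pool.map(lambda search: search(query), sources)
        return [hit for batch in batches for hit in batch]

results = federated_search("CNN News", [search_mail, search_files])
print(results)
```

Nothing is copied into a central index here: each source answers from the index it already maintains, so the total wait is roughly that of the slowest source rather than the sum of all of them.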
So speed is important, but it is not the only reason. To get a fulfilling search result, you also need the correct technology. A decent enterprise search application has to fulfill a number of requirements, the most important of which is serving the user and their desktop. What is the point of an enterprise search that cannot search where the enterprise user stores a lot, if not most, of their content? If a search application is not aimed at the user, then it should not be discussed at any planning meeting. The user’s content is paramount to search. If it is ignored, then the results are going to be missing large pieces of content. Vendors are just starting to realize that users do exist and that they are important. We have mobile apps to thank for this. The rise of companies such as Dropbox and Evernote has shown the way when it comes to a simple and user-centric UI.
Once we have the user covered, we then need to look at how the application searches. I have already mentioned keyword search. This has to be one of the most unsuccessful methods of producing a satisfactory response that has ever been devised. Yet still most companies use it. If you are not sure what I am talking about and you use Outlook, type two or more words into your Outlook search bar (CNN News, for example) and see what happens. You will get results back with both words in. If you have emails with Yahoo News or BBC News they will be returned, and maybe a CNN show you emailed about. So now you have to go through these results manually to find what you were really looking for. Hey, Enterprise Search is called Search, not Find! Processing power has increased enough to allow us to create a truly encompassing advanced search query that can search the different indexes. So when you type in CNN News you only get back relevant items that have been found.
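The difference between searching and finding can be sketched with two matchers. The loose one assumes that a hit on any single query word is enough, which is one reading of the behaviour described above and not a statement of how Outlook actually works; the strict one only accepts the words side by side and in order. The subject lines are invented for the example.

```python
import re

def tokens(text):
    return re.findall(r"\w+", text.lower())

def any_keyword_match(query, text):
    """True if any query word appears anywhere in the text."""
    return bool(set(tokens(query)) & set(tokens(text)))

def phrase_match(query, text):
    """True only if the query words appear next to each other, in order."""
    q, t = tokens(query), tokens(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))

# Hypothetical email subjects.
subjects = [
    "CNN News alert",
    "BBC News roundup",
    "Yahoo News digest",
    "That CNN show last night",
]
loose = [s for s in subjects if any_keyword_match("CNN News", s)]
exact = [s for s in subjects if phrase_match("CNN News", s)]
print(len(loose), exact)
```

The loose matcher returns all four subjects and leaves the sifting to you; the phrase matcher returns only the one that is actually about CNN News.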
So now that the application has results from multiple sources (federated), it will find that some are duplicates. As a user I do not want to have to manually search my results and mentally take out the duplicates. So a good search engine should do this automatically before displaying the results. This was one of the reasons that the single index was put in place, so that duplicates could be managed. The downside is that the index is always 24 hours out of date and is a single point of failure. Oh, and it needs massive management! Processing power has again made this old way outdated and cumbersome.
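Removing duplicates from merged results at query time can be sketched like this. The rule used here, that two hits are duplicates when their text is identical after lower-casing and collapsing whitespace, is an assumption made for the example; a real product would likely need fuzzier matching. The `body` field and sample hits are likewise invented.

```python
import hashlib

def fingerprint(hit):
    """Hash of the case- and whitespace-normalised body (an assumed duplicate test)."""
    normalised = " ".join(hit["body"].lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def deduplicate(hits):
    """Keep the first hit for each fingerprint and drop later copies."""
    seen = set()
    unique = []
    for hit in hits:
        key = fingerprint(hit)
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique

# Hypothetical merged results from two sources.
hits = [
    {"source": "mail", "body": "Q3 budget  approved"},
    {"source": "files", "body": "q3 budget approved"},
    {"source": "files", "body": "Q4 budget draft"},
]
unique = deduplicate(hits)
print(unique)
```

Because this runs over the handful of results for one query rather than over a whole index, it needs no nightly rebuild and nothing central to manage.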
In addition to this, enterprise search should also be able to bring back data from your back-office databases. These are relevant too. Any enterprise search tool should be able to map databases together, providing a single real-time view of the content that has been searched for. This was once hard and unwieldy to do, but today, with the efficiencies gained in database technologies and the new ways of mapping different data sources together, this should be a given.
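A very small illustration of mapping two databases into one live view uses SQLite's ATTACH DATABASE, with two in-memory databases standing in for separate back-office systems. The table names, columns and data are invented, and real back-office systems would need proper connectors instead of SQLite.

```python
import sqlite3

# Two in-memory databases stand in for separate back-office systems.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS billing")
conn.execute("CREATE TABLE main.customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, total REAL)")
conn.execute("INSERT INTO main.customers VALUES (1, 'Acme Ltd')")
conn.execute("INSERT INTO billing.invoices VALUES (1, 250.0)")

def find_customer(term):
    """Search by name and return one joined row per matching invoice."""
    return conn.execute(
        "SELECT c.name, i.total FROM main.customers AS c "
        "JOIN billing.invoices AS i ON i.customer_id = c.id "
        "WHERE c.name LIKE ?",
        (f"%{term}%",),
    ).fetchall()

rows = find_customer("Acme")
print(rows)
```

The join runs against the live tables when the search is made, so the result reflects the data as it stands at that moment and not a copy taken earlier.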
The final piece of the jigsaw is a simple one. People are starting to realize that Enterprise Search (let’s start to call it Enterprise Federated Find) is not web search. Searching for content on the web and finding content inside your place of work are completely different. For starters, you do not need to worry about your boss trying to sell you something. I suppose searching for “Productivity Quotes” could result in an advert from HR saying “Looking for overtime? Call this number now!” No, Enterprise Find and Web Search are completely different. What works in one does not work for the other.