Search. You’re probably saying “now that’s an exciting topic” and wondering what could possibly be so complex about it. You just slap a search engine on your website or portal, let it crawl your data sources, and surely users will be able to find what they’re looking for. Well, it’s not quite as simple as that, which is why people from all kinds of companies and organizations recently got together at the Enterprise Search Summit and Taxonomy Bootcamp in San Jose, CA.
A few years ago that approach may have been enough, but user expectations of search have risen a great deal, especially with the popularity of Google’s internet search. As the amount of information on websites and portals grows, more people are turning to search, and they want it to be fast, simple, and efficient. The keynote speaker, Sue Feldman of the market research firm IDC, pointed out that information workers spend about 9-10 hours a week searching for information and that one-third to one-half of that time is wasted because they don’t find what they’re looking for.
So, what can be done to help make this content more findable? The short answer is: a lot! Since I can’t possibly cover everything here, I’ll focus on one topic that was discussed constantly at the conference and is of most interest to me: “tagging”. It has definitely become a buzzword recently, especially since sites like Flickr and Delicious introduced social tagging, where users tag the content themselves. The idea of tagging, however, is nothing new. Librarians have been doing it for years when they assign, for example, Library of Congress subject headings to a book. With any kind of tagging you are essentially assigning metadata to a file. Metadata is information that describes a piece of content, such as its title, author, or subject, and the search engine uses it to retrieve results when a user runs a search.
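To make the role of metadata a little more concrete, here is a minimal sketch in Python of how a subject field could drive retrieval. Everything in it is an assumption for illustration: the `Document` class, the `search_by_subject` function, and the sample titles are hypothetical and do not describe any particular search engine, which would normally use an index rather than a simple scan.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A piece of content plus the metadata that describes it (hypothetical model)."""
    title: str
    author: str
    subjects: set = field(default_factory=set)


def search_by_subject(docs, subject):
    """Return the titles of documents tagged with the given subject (case-insensitive)."""
    wanted = subject.strip().lower()
    return [
        doc.title
        for doc in docs
        if wanted in {s.lower() for s in doc.subjects}
    ]


# Made-up sample content, for illustration only.
SAMPLE_DOCS = [
    Document("Search Tuning Notes", "A. Author", {"Enterprise Search", "Metadata"}),
    Document("Portal Style Guide", "B. Writer", {"Design"}),
    Document("Tagging Guidelines", "C. Editor", {"Metadata", "Taxonomy"}),
]

print(search_by_subject(SAMPLE_DOCS, "metadata"))
```

The point of the sketch is simply that a document with no subject metadata can never match a subject query, no matter how relevant its text is.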
In an enterprise, searching for information is a bit different from searching the internet. Information is usually stored in many different places. There’s less structured information, such as wikis, blogs, and discussion forums, and then there’s more structured information, such as the files in content management or document management systems. And each of these “repositories” usually has its own structure for classifying and organizing that information.
This means it makes the most sense to organize different content types with different methods. For example, you might want to use social tagging for wikis and blogs, whereas content such as best practices or solution briefs that have been reviewed and approved might benefit from more structured tagging (e.g., a controlled vocabulary or taxonomy). The goal should be to allow content to be tagged throughout its lifecycle, each time it’s handled, whether by the author, a librarian, an end user, or an auto-indexing tool. In the end, though, no matter how the tagging is done, metadata is a very important component of search.
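The difference between social tagging and a controlled vocabulary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary entries, the `normalize_tag` and `tag_content` functions, and the `controlled` flag are all hypothetical names invented for this example, not part of any real tagging tool.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
CONTROLLED_VOCABULARY = {
    "enterprise search": {"search", "enterprise search", "site search"},
    "taxonomy": {"taxonomy", "taxonomies", "classification"},
}


def normalize_tag(tag):
    """Map a raw tag to its preferred term, or None if it is not in the vocabulary."""
    cleaned = tag.strip().lower()
    for preferred, variants in CONTROLLED_VOCABULARY.items():
        if cleaned in variants:
            return preferred
    return None


def tag_content(tags, controlled):
    """Apply tags to a piece of content.

    controlled=False models social tagging: any tag is kept after light cleanup.
    controlled=True models structured tagging: only vocabulary terms survive.
    """
    if not controlled:
        return sorted({t.strip().lower() for t in tags})
    normalized = (normalize_tag(t) for t in tags)
    return sorted({term for term in normalized if term is not None})


# A wiki page keeps whatever its users typed.
print(tag_content(["Search ", "misc"], controlled=False))
# An approved solution brief only keeps preferred vocabulary terms.
print(tag_content([" Search", "Taxonomies", "misc"], controlled=True))
```

In this sketch the free-form path keeps every tag, which suits wikis and blogs, while the controlled path collapses variants into one preferred term and drops anything unknown, which is the trade-off you accept for reviewed and approved content.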
On the SDN/BPX site we use metadata as well, and we have in fact just undertaken a big project to improve the findability of content. A couple of months ago we migrated content from SAP Service Marketplace (SMP) to SDN’s Library, making SDN the single point of access for the SAP NetWeaver knowledge base. This also meant that we needed to integrate two sets of metatags, since the SMP content came with its own set of metadata. The project was challenging and we learned a lot. In my next two blogs I will go into detail on exactly what this content migration meant for our existing content management system and how we plan to use all the metadata we collect to improve findability on the SDN/BPX Community site.