How to write a Search Application using the KM Ind...

Former Member · ‎01-17-2007

Our search here at SDN is powered by a custom search application which communicates with our search engine of choice, TREX. SAP's Knowledge Management API provides an interface for communicating with TREX and executing searches. This weblog will show you how to use this API to write your own search application for TREX.

Prerequisites

This weblog assumes that you are familiar with writing Java web applications ("iViews") in the SAP Enterprise Portal/KMC environment. You need to have TREX set up with at least one search index available.

Basics

So where can you find this mysterious search API? It's part of the Knowledge Management API in the package: com.sapportals.wcm.service.indexmanagement.retrieval.search. This package provides functions for text mining and search.

Your starting point for searching is the IFederatedSearch interface. It contains several methods for different types of search calls. Let's have a closer look at some of them.

search: A basic search method. It returns an ISearchResultList containing the search results for the provided query.
searchWithSession: This does not return an ISearchResultList directly but rather returns an ISearchSession which in turn provides a method for retrieving an ISearchResultList. The advantage of the search session is that you can request a subset of search results, e.g. only the amount of results that your search application displays per result page.
Working with only a small subset of a possibly large number of search results is much more efficient than always working on the complete list, since a much smaller set of resources needs to be requested from the data source. In addition, the search session has an internal caching mechanism which delivers the results from cache if the same subset is requested again. This also improves performance.
searchSimilarDocuments: Searches for documents similar to the provided document. Similarity is determined by TREX's text mining engine.

For performance reasons I recommend using the searchWithSession method rather than the simple search method.
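The paging idea behind the search session comes down to a bit of index arithmetic. Below is a minimal sketch of it in plain Java. ResultPage is a hypothetical helper, not part of the KM API, and it assumes that result positions are 1-based and that the requested range is inclusive; please verify both against the API documentation before relying on them.

```java
/** Hypothetical helper (not part of the KM API): maps a result page to a 1-based, inclusive range. */
final class ResultPage {
    final int from; // position of the first result on the page (1-based, assumed)
    final int to;   // position of the last result on the page (inclusive, assumed)

    private ResultPage(int from, int to) {
        this.from = from;
        this.to = to;
    }

    /**
     * @param pageNumber   1-based page number
     * @param pageSize     number of results shown per page
     * @param totalResults total number of hits reported for the query
     * @return the range to request, or null if the page lies beyond the last result
     */
    static ResultPage of(int pageNumber, int pageSize, int totalResults) {
        if (pageNumber < 1 || pageSize < 1 || totalResults < 0) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1, totalResults >= 0");
        }
        long first = (long) (pageNumber - 1) * pageSize + 1; // long avoids int overflow
        if (first > totalResults) {
            return null; // nothing to show on this page
        }
        long last = Math.min(first + pageSize - 1, totalResults);
        return new ResultPage((int) first, (int) last);
    }

    /** Number of pages needed to show all results. */
    static int pageCount(int pageSize, int totalResults) {
        if (pageSize < 1 || totalResults < 0) {
            throw new IllegalArgumentException("pageSize must be >= 1, totalResults >= 0");
        }
        return (int) (((long) totalResults + pageSize - 1) / pageSize);
    }
}
```

With 37 hits and 10 results per page, page 4 maps to the range 31 to 37, so only seven resources would have to be fetched for that page instead of the whole list.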

So let's have a look at the different signatures for this method. Generally there are two types of signatures. The difference is the way in which you specify the search scope. One type of method uses a KM folder (ICollection) as the starting point of the search. You need to have a search index including the chosen folder defined for this to work. The other type needs a list of search indexes as input.

Both types of methods additionally have a few different signatures. The basic signature is this:

searchWithSession(IQueryEntryList queryEntryList, searchScope, IResourceContext context)

searchScope is either an ICollection or a list of indices, as described above. Furthermore you need to specify an IQueryEntryList, a list of IQueryEntry objects. The last parameter the method requires is an IResourceContext, which stores the current context the application is running in (user, request, session, etc.).

An IQueryEntry is the smallest entity of a query sent to TREX. It can be something very basic like a logical operator (AND, OR, NOT are supported) or a bracket or something more complicated, like an attribute query (e.g. find all documents where the given property has value X).

Building an IQueryEntryList even for a basic query is rather complicated, since you need to string several attribute queries together with Boolean operators in order to search the attribute values you need. Fortunately, the Knowledge Management API offers a very useful class for building queries: SearchQueryListBuilder. It was written to support the creation of SearchComponents for the standard Enterprise Portal search, but it can just as well be used for your custom search application.
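To give a feeling for why assembling such a list by hand gets tedious, here is a toy model of a flat query made of brackets, operators and attribute queries. ToyQuery and its methods are hypothetical stand-ins written for this illustration; they are not IQueryEntry or IQueryEntryList, and the property names used with them are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a flat list of query entries; NOT the KM API's IQueryEntryList. */
final class ToyQuery {
    private final List<String> entries = new ArrayList<>();

    ToyQuery open()  { entries.add("(");   return this; }
    ToyQuery close() { entries.add(")");   return this; }
    ToyQuery and()   { entries.add("AND"); return this; }
    ToyQuery or()    { entries.add("OR");  return this; }
    ToyQuery not()   { entries.add("NOT"); return this; }

    /** Attribute query: "find documents where the property has this value". */
    ToyQuery attribute(String property, String value) {
        entries.add(property + "=\"" + value + "\"");
        return this;
    }

    /** Searching one term in several properties takes one attribute entry per property, joined by OR. */
    static ToyQuery termInProperties(String term, List<String> properties) {
        ToyQuery query = new ToyQuery().open();
        for (int i = 0; i < properties.size(); i++) {
            if (i > 0) {
                query.or();
            }
            query.attribute(properties.get(i), term);
        }
        return query.close();
    }

    /** Number of individual entries in the query. */
    int size() {
        return entries.size();
    }

    @Override
    public String toString() {
        return String.join(" ", entries);
    }
}
```

Even a single search term spread over three properties already needs seven entries in this model, which is the kind of bookkeeping a query builder takes off your hands.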

The most basic type of query you can create with the SearchQueryListBuilder is just using the method setSearchTerm(String searchTerm) to set a search term. The SearchQueryListBuilder will build an IQueryEntryList which searches in the following properties of a resource:

the content of the document (if it can be text mined)
the display name
the description

More complex queries can be built, and additional properties searched, with other methods. More on this later. Let us put together the bits and pieces we have so far to write a first basic search application:



// get federated search instance
private IIndexService indexService = (IIndexService)ResourceFactory.getInstance().getServiceFactory().getService("IndexmanagementService");
private IFederatedSearch federatedSearch = (IFederatedSearch)indexService.getObjectInstance("federatedSearchInstance");

// get list of all active indexes; alternatively use the indexService to get indexes by index ID and add them to a list
List indexList = indexService.getActiveIndexes();

// build IQueryEntryList
SearchQueryListBuilder queryBuilder = new SearchQueryListBuilder();
queryBuilder.setSearchTerm("some search term");
IQueryEntryList qel = queryBuilder.buildSearchQueryList();

// get ResourceContext
IUser user = (IUser)request.getUser().getUser();
ResourceContext resContext = new ResourceContext(user);

// search
ISearchSession searchSession = federatedSearch.searchWithSession(qel, indexList, resContext);

// retrieve the first 10 search results from the search session
ISearchResultList results = searchSession.getSearchResults(1, 10);

// get the total number of results
int totalResults = searchSession.getTotalNumberResultKeys();

Now all you need is a way to display those search results. The SDN search uses JSPs to iterate over the result list and render the results. The ISearchResultList contains ISearchResult objects which allow you to get the indexed properties of the underlying IResource object. This is done via the getLocalProperties() method which returns a PropertyMap containing all properties that the index knows for the resource. From this PropertyMap you should be able to retrieve all the information you might want to display, e.g. display name, last modified date, content link, etc.

For performance reasons, I recommend retrieving all the properties you want to display from this PropertyMap. The PropertyMap comes directly from TREX and can therefore be accessed efficiently, without security checks and similar overhead. If you absolutely have to, you can access the IResource via the getResource() method and retrieve properties directly from the resource. Note, however, that in this case KM is accessed directly, which is more time-consuming.

Advanced Search Options

The SearchQueryListBuilder can do a lot more than build an IQueryEntryList for a search in the standard properties mentioned above. Below is a list of some of its more useful features:

setSearchAddProps(String value): This method allows you to specify properties to be searched in addition to the standard properties (content, display name, description). Multiple properties need to be specified as a comma-separated list of unique property IDs, e.g. createdby,modifiedby,embedded-keywords
setSelectedCustomProps(String value): This very convenient method lets you specify conditions that a search result must meet in order to be considered a valid result. In effect, it filters the search results and displays only those that meet the conditions. The conditions are specified as property name and value pairs: a document is only returned in the search result list if it has the specified property with the specified value.
It is possible to specify multiple property name/value pairs in a comma-separated list. The following rules apply:
    Different properties are connected by AND.
    Multiple occurrences of the same property with different values are connected by OR.
The following syntax is used for these query conditions:
    propertyID(modifiers)
Possible modifiers are value and comparator. Comparators are used for number-based properties such as dates. Modifiers are written as modifiername=value; several modifiers are separated by slashes.
An example:
    modified_on(value=2006-12-01/comparator=EQ),modified_on(value=2006-11-01/comparator=EQ),author_name(value=Esther Schmitz)
This condition causes the search to return only documents that were modified either on Dec. 1st, 2006 or on Nov. 1st, 2006, and whose author is Esther Schmitz. In other words: (Dec. 1st OR Nov. 1st) AND author.
buildDidYouMeanQueryEntries(IQueryEntryList oldQuery, String oldTerm, String didYouMeanTerm): This method provides an adapted query entry list for performing a "Did you Mean" search. Use getDidYouMeanTerm(IQueryEntryList queryEntryList, String oldSearchTerm) to determine the did you mean term.
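Since the condition string for setSelectedCustomProps is easy to get wrong when typed by hand, a small helper can assemble it. The following is only a sketch: CustomPropsConditions is a hypothetical class, not part of the KM API. The post does not describe any escaping rules, so the sketch simply rejects property IDs and values that contain one of the separator characters. The second method merely spells out the AND/OR rules from the list above in readable form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of the KM API) for setSelectedCustomProps condition strings. */
final class CustomPropsConditions {

    /** One condition: property ID, value and an optional comparator (null if none). */
    private static final class Condition {
        final String propertyId;
        final String value;
        final String comparator;

        Condition(String propertyId, String value, String comparator) {
            this.propertyId = propertyId;
            this.value = value;
            this.comparator = comparator;
        }
    }

    private final List<Condition> conditions = new ArrayList<>();

    CustomPropsConditions add(String propertyId, String value) {
        return add(propertyId, value, null);
    }

    CustomPropsConditions add(String propertyId, String value, String comparator) {
        requireNoSeparators(propertyId);
        requireNoSeparators(value);
        conditions.add(new Condition(propertyId, value, comparator));
        return this;
    }

    /** Assumption: no escaping exists, so the separator characters are rejected outright. */
    private static void requireNoSeparators(String text) {
        for (char separator : ",/()".toCharArray()) {
            if (text.indexOf(separator) >= 0) {
                throw new IllegalArgumentException("Unsupported character '" + separator + "' in: " + text);
            }
        }
    }

    /** Renders propertyID(value=.../comparator=...) entries as a comma-separated list. */
    String toConditionString() {
        StringBuilder sb = new StringBuilder();
        for (Condition c : conditions) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(c.propertyId).append("(value=").append(c.value);
            if (c.comparator != null) {
                sb.append("/comparator=").append(c.comparator);
            }
            sb.append(')');
        }
        return sb.toString();
    }

    /** Spells out the rules: OR within the same property, AND across different properties. */
    String toLogicalExpression() {
        Map<String, List<String>> byProperty = new LinkedHashMap<>();
        for (Condition c : conditions) {
            String operator = (c.comparator != null) ? c.comparator : "=";
            byProperty.computeIfAbsent(c.propertyId, key -> new ArrayList<>())
                      .add(c.propertyId + " " + operator + " " + c.value);
        }
        List<String> groups = new ArrayList<>();
        for (List<String> alternatives : byProperty.values()) {
            groups.add("(" + String.join(" OR ", alternatives) + ")");
        }
        return String.join(" AND ", groups);
    }
}
```

Adding modified_on twice (with comparator EQ) and author_name once reproduces the example string shown above, which could then be handed to queryBuilder.setSelectedCustomProps(...).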

The End

I hope this sneak peek into the inner workings of SDN was interesting to some of you. If you find any errors or have other input, don't hesitate to comment here. Thanks!