OpenSearch on SCN
I’m part of an SAP team that specializes in search applications. Among other things, the team works on the engine that powers the SCN search and on its interfaces. One of these interfaces is the well-known search UI, http://search.sap.com, which you see when you execute a search on SCN. Another, less publicized interface is OpenSearch: http://search.sap.com/opensearch. This search API is available alongside the search UI and lets client applications access and integrate the SCN search functionality in their own environments or browsers. I thought it would be a good idea to spend some words on the subject and hopefully open up new consumption scenarios for the SCN search.
Some words about OpenSearch
OpenSearch can be seen as a simple API standard for search results. By implementing it, any content provider, like SCN, can expose the functionality of its internal search engine to external parties. The argument behind it is that traditional search engines often cannot do a good job of crawling and indexing content at specialized web sites, while local search engines may understand local content far better, so why not call them directly?
With OpenSearch you can execute searches, determine the size of the result set, page through the results and, depending on the extensions implemented, even filter and sort them. It’s like taking search results and cutting out the UI. This is interesting for developers who want to access and aggregate results from different search engines, but also for end users who want to integrate a search provider in their browser, as we’ll see.
In OpenSearch, search requests are made via HTTP requests with information sent as URL parameters, and results are received as HTTP responses in XML (RSS and ATOM) or HTML format.
A Simple Search
Let’s take a simple example and search for “hana”. Just call the OpenSearch base URL and append the “query” parameter at the end. Don’t forget to URL encode the search term.
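As a minimal sketch of such a call, the following Python snippet builds the request URL from the base URL and the “query” parameter mentioned above. The function name is made up for illustration; only the URL and the parameter name come from this post.

```python
from urllib.parse import urlencode

# Base URL as given in this post.
OPENSEARCH_BASE_URL = "http://search.sap.com/opensearch"


def build_search_url(term: str) -> str:
    """Return the OpenSearch URL for a plain-text search term."""
    # urlencode percent-encodes the term, so spaces and special
    # characters cannot break the query string.
    return f"{OPENSEARCH_BASE_URL}?{urlencode({'query': term})}"


print(build_search_url("hana"))
print(build_search_url("hana cloud"))
```

Letting the library do the encoding avoids hand-escaping mistakes once the search term contains spaces or special characters.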
Results in HTML, ATOM and RSS
Results appear by default in a rudimentary HTML format that can be integrated in your browser’s search box. However, to access the full functionality, you’ll need to use one of the available feed formats: RSS or ATOM. You select one by adding the “format” parameter. You can then also view the results in your favorite feed reader.
The standard feed formats have been extended by the OpenSearch standard and SAP with some extra elements in order to accommodate data related to the search results, as you will see in the next sections.
Here’s an example, with the RSS format:
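A minimal Python sketch of such a request could look like this. The parameter name “format” comes from the text above; the value “rss” (and “atom” for the other feed) is an assumed spelling, and the function names are illustrative only.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

OPENSEARCH_BASE_URL = "http://search.sap.com/opensearch"


def build_feed_url(term: str, feed_format: str = "rss") -> str:
    """Return the search URL with an explicit feed format."""
    # Assumption: the service spells the format values "rss" and "atom".
    params = {"query": term, "format": feed_format}
    return f"{OPENSEARCH_BASE_URL}?{urlencode(params)}"


def fetch_feed(term: str, feed_format: str = "rss") -> bytes:
    """Download the raw feed XML (needs network access; not called here)."""
    with urlopen(build_feed_url(term, feed_format), timeout=10) as response:
        return response.read()


print(build_feed_url("hana"))
```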
Paging through results
Paging can be achieved via the parameters “itemsperpage” and “startindex”. The first one indicates the number of results per page and the second the index of the first result desired.
For example, you can define pages of 10 results and retrieve the first and second page via the following URLs:
-> first page (first 10 results)
-> second page (next 10 results)
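The two requests can be sketched in Python as follows. The parameter names “itemsperpage” and “startindex” come from the text; the helper name is hypothetical, and the sketch assumes that “startindex” is 1-based, so the second page of 10 starts at result 11.

```python
from urllib.parse import urlencode

OPENSEARCH_BASE_URL = "http://search.sap.com/opensearch"


def build_page_url(term: str, page: int, items_per_page: int = 10) -> str:
    """Return the URL for a given page number (pages counted from 1)."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    # Assumption: startindex is 1-based, so page 2 of 10 starts at 11.
    start_index = (page - 1) * items_per_page + 1
    params = {
        "query": term,
        "itemsperpage": items_per_page,
        "startindex": start_index,
    }
    return f"{OPENSEARCH_BASE_URL}?{urlencode(params)}"


print(build_page_url("hana", page=1))  # first 10 results
print(build_page_url("hana", page=2))  # next 10 results
```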
Metadata associated with each result can be retrieved and inspected by using the parameter “extended”. Three values are possible: “id”, “label” or “both”, depending on whether you are interested in getting only the IDs, labels or both of the attributes and corresponding values associated with each result.
Here’s an example, with the “both” value:
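As a rough Python sketch, the “extended” parameter can be added like this. The three allowed values are the ones listed above; the function name is illustrative. Since the metadata is part of the feed output, in practice you would probably combine this with the “format” parameter.

```python
from urllib.parse import urlencode

OPENSEARCH_BASE_URL = "http://search.sap.com/opensearch"
# The three values named in the text.
EXTENDED_VALUES = ("id", "label", "both")


def build_extended_url(term: str, extended: str = "both") -> str:
    """Return a search URL that asks for result metadata."""
    if extended not in EXTENDED_VALUES:
        raise ValueError(f"extended must be one of {EXTENDED_VALUES}")
    params = {"query": term, "extended": extended}
    return f"{OPENSEARCH_BASE_URL}?{urlencode(params)}"


print(build_extended_url("hana", "both"))
```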
After having inspected the metadata associated with each result, you might want to sort your results by one of the available attributes. This can be achieved via the parameter “sort” and the following syntax:
sort=attributeName:sortDirection
- attributeName = name of the attribute (ID)
- sortDirection = ascending or descending
Don’t forget to URL encode “attributeName:sortDirection”.
As an example, to sort by descending order of modification date you would URL encode the value “scm_a_moddate:descending” and add it to the URL as indicated below:
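The encoding step can be sketched in Python as follows. The attribute ID “scm_a_moddate” and the “sort” parameter come from the text; the function name is made up for illustration.

```python
from urllib.parse import urlencode

OPENSEARCH_BASE_URL = "http://search.sap.com/opensearch"


def build_sorted_url(term: str, attribute: str, direction: str = "descending") -> str:
    """Return a search URL sorted by the given attribute ID."""
    if direction not in ("ascending", "descending"):
        raise ValueError("direction must be 'ascending' or 'descending'")
    # urlencode turns the ":" separator into %3A.
    params = {"query": term, "sort": f"{attribute}:{direction}"}
    return f"{OPENSEARCH_BASE_URL}?{urlencode(params)}"


print(build_sorted_url("hana", "scm_a_moddate"))
```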
In a similar way, after having inspected the metadata associated with each result you might want to do some filtering based on that. For example, get only the results of Asset Type “Wiki Page” and Solution Suite “SAP NetWeaver”.
This can be accomplished by using the “filter” parameter and the following syntax:
filter=attributeName(value=attributeValue/operator=attributeOperator)
- attributeName = name of the attribute (ID)
- attributeValue = value of the attribute
- attributeOperator = operator linking attribute and value (EQ, LT or GT)
Don’t forget to URL encode “attributeName(value=attributeValue/operator=attributeOperator)” and note the following:
- The following operators are supported for properties of data type INTEGER, DATE and FLOAT: EQ (equals), GT (greater than) and LT (less than)
- Only the following operator is supported for properties of data type STRING and BOOLEAN: EQ (equals)
- Date values need to be provided in the following format: “yyyy-mm-dd”
- Several “filter” URL parameters are linked together with AND if they refer to different properties, or with OR if they refer to the same property (i.e. multi-value properties)
So, let’s try that out and filter only results of type “Wiki Page” from site “SCN” and author “Rafael Rhoden”. The corresponding filters are:
- filter=scm_a_site(value=scm_v_Site11/operator=EQ)
- filter=scm_a_author(value=Rafael Rhoden/operator=EQ)
which after encoding look like:
Resulting in the following URL for the search term “srm”:
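As a sketch of how the encoding and the final URL can be produced, the Python snippet below builds the two filter expressions listed above and appends them to a search for “srm”. The helper names are hypothetical, and the choice of encoding spaces as %20 rather than “+” is an assumption about what the service accepts.

```python
from urllib.parse import quote, urlencode

OPENSEARCH_BASE_URL = "http://search.sap.com/opensearch"


def filter_expression(attribute: str, value: str, operator: str = "EQ") -> str:
    """Build attributeName(value=attributeValue/operator=attributeOperator)."""
    if operator not in ("EQ", "LT", "GT"):
        raise ValueError("operator must be EQ, LT or GT")
    return f"{attribute}(value={value}/operator={operator})"


def build_filtered_url(term: str, filters: list) -> str:
    """Return a search URL with one "filter" parameter per expression."""
    # A list of pairs lets the "filter" key appear several times.
    params = [("query", term)]
    params += [("filter", filter_expression(*f)) for f in filters]
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{OPENSEARCH_BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_filtered_url(
    "srm",
    [
        ("scm_a_site", "scm_v_Site11", "EQ"),
        ("scm_a_author", "Rafael Rhoden", "EQ"),
    ],
)
print(url)
```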
You may find it interesting that when you use the search UI and filter your results by clicking on the available filters on the left side of the screen, the URL changes to reflect the attributes and values you select. These URL filters use a different syntax, but they contain the same attributes and attribute values that you use when filtering with OpenSearch. In this sense, they can serve as a helping hand for finding out which attributes and attribute values to use in your OpenSearch filter. For example, the corresponding UI filter for the above OpenSearch one would be:
A closer look at Results
As you probably noticed, the search results in XML format don’t exactly match the standard RSS and ATOM feeds. They contain new tags that do not exist in the definition of these standards. These are extensions: some come from the OpenSearch standard (tags qualified with namespace “opensearch”) and others from SAP (tags qualified with namespace “sap_it”).
Let’s have a look at some of them:
- itemsPerPage = number of results displayed per page
  - Example: <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
- totalResults = total number of results found
  - Example: <opensearch:totalResults>190</opensearch:totalResults>
- startIndex = index of the first result displayed
  - Example: <opensearch:startIndex>1</opensearch:startIndex>
- status = status of the search: “success” if there were no errors and “error” otherwise
  - Example: <sap_it:status>error</sap_it:status>
- message = corresponding error message, if “status” = “error”
  - Example: <sap_it:message>Please enter a correct Filter Expression: attributeName(value=attributeValue/operator=operatorValue)</sap_it:message>
- snippet = content snippet highlighting the search term
  - Example: <sap_it:snippet><![CDATA[[<b>HANA</b> Overview…]]></sap_it:snippet>
- metadataAttribute = metadata associated with each result. Depending on the URL parameter “extended”, it includes the technical ID and/or the label of each attribute and its values.
  - Example (extended=both): <sap_it:valueLabel language="en">United States</sap_it:valueLabel>
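To illustrate how a client might read these extension elements, here is a minimal Python sketch that parses a hand-written sample feed. The sample is not real service output: the “opensearch” namespace URI is the one defined by the OpenSearch 1.1 specification, but the “sap_it” namespace URI is a placeholder, and the position of the elements inside the channel is an assumption.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    # Namespace URI defined by the OpenSearch 1.1 specification.
    "opensearch": "http://a9.com/-/spec/opensearch/1.1/",
    # Placeholder URI (assumption): use the one declared in the real feed.
    "sap_it": "urn:example:sap_it",
}

# Hand-written sample, not real service output.
SAMPLE_FEED = """\
<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
     xmlns:sap_it="urn:example:sap_it">
  <channel>
    <title>Sample results</title>
    <opensearch:totalResults>190</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <sap_it:status>success</sap_it:status>
    <item>
      <title>HANA Overview</title>
      <sap_it:snippet><![CDATA[<b>HANA</b> Overview]]></sap_it:snippet>
    </item>
  </channel>
</rss>
"""


def summarize_feed(feed_xml: str) -> dict:
    """Extract paging data, status and snippets from an RSS result feed."""
    channel = ET.fromstring(feed_xml).find("channel")

    def text(tag: str) -> str:
        return channel.findtext(tag, default="", namespaces=NAMESPACES).strip()

    return {
        "total_results": int(text("opensearch:totalResults")),
        "start_index": int(text("opensearch:startIndex")),
        "items_per_page": int(text("opensearch:itemsPerPage")),
        "status": text("sap_it:status"),
        "snippets": [
            item.findtext("sap_it:snippet", default="", namespaces=NAMESPACES)
            for item in channel.findall("item")
        ],
    }


summary = summarize_feed(SAMPLE_FEED)
print(summary)
```

With “total_results”, “start_index” and “items_per_page” in hand, a client can work out whether further pages exist before issuing the next request.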
The OpenSearch description document, an XML document specified by the OpenSearch standard, is the place where we publish the web interface of our search service. It’s like a formal description of the OpenSearch signature provided, so that clients know which URL templates are available and how to call them.
You can read more about the different elements in the standard’s documentation of the OpenSearch description document, and you can access ours under http://search.sap.com/opensearch/description
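As a sketch of how a client could use such a document, the Python snippet below reads the URL templates from a description document and fills in a search term. The sample document is hypothetical and only mimics the general structure defined by the OpenSearch specification (“Url” elements with “type” and “template” attributes); it is not a copy of the document published at the address above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace of OpenSearch 1.1 description documents.
OSD_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}

# Hypothetical sample, not the published SAP document.
SAMPLE_DESCRIPTION = """\
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Sample Search</ShortName>
  <Description>Hypothetical sample description</Description>
  <Url type="text/html"
       template="http://search.sap.com/opensearch?query={searchTerms}"/>
  <Url type="application/rss+xml"
       template="http://search.sap.com/opensearch?query={searchTerms}&amp;format=rss"/>
</OpenSearchDescription>
"""


def url_templates(description_xml: str) -> dict:
    """Map each response MIME type to its URL template."""
    root = ET.fromstring(description_xml)
    return {u.get("type"): u.get("template") for u in root.findall("os:Url", OSD_NS)}


def fill_template(template: str, term: str) -> str:
    """Replace the {searchTerms} placeholder with the encoded term."""
    return template.replace("{searchTerms}", quote(term, safe=""))


templates = url_templates(SAMPLE_DESCRIPTION)
print(fill_template(templates["text/html"], "hana"))
```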
If you want more information on available features and usage, you can check out the delivered online documentation by clicking on “SAP IT Opensearch Results…” (title above search results) or going directly to http://search.sap.com/opensearch/description?type=extensions.
Browser Integration and Auto-Discovery
One of the nice features about OpenSearch is its browser integration capabilities. You can add SCN as a search provider to your browser’s search box and launch the SCN search from within the browser without having to go to SCN.
Most modern browsers support this functionality by allowing their search box to be customized. They support auto-discovery of OpenSearch description documents and the integration of search providers in the search box.
Here’s how you can do it in Firefox:
First, you need to open a page containing a so-called auto-discovery link to our search provider in its HTML header. You can enable one of your own pages by including the link below in the header, or you can just open our documentation page http://search.sap.com/opensearch/description?type=extensions, which already includes one. This link basically tells the browser that the page offers a search provider that can be added to the browser.
title="SAP IT OpenSearch"
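For reference, OpenSearch auto-discovery links generally have the shape produced by the small Python sketch below. The rel value “search” and the type “application/opensearchdescription+xml” come from the OpenSearch specification. The href is an assumption (the description document URL mentioned earlier in this post), and the helper name is made up for illustration.

```python
from html import escape


def autodiscovery_link(href: str, title: str) -> str:
    """Render an OpenSearch auto-discovery <link> element for an HTML head."""
    return (
        '<link rel="search" '
        'type="application/opensearchdescription+xml" '
        f'href="{escape(href, quote=True)}" '
        f'title="{escape(title, quote=True)}">'
    )


# Assumption: the href points at the description document URL from this post.
link_tag = autodiscovery_link(
    "http://search.sap.com/opensearch/description", "SAP IT OpenSearch"
)
print(link_tag)
```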
You can see this, as the browser immediately displays a different color on the dropdown arrow of its search box. If you click on it, you see the already installed search providers and the possibility to add a new one: “SAP IT Opensearch”. Add it and you’re good to go.
Now, you can select “SAP IT Opensearch” from the dropdown anytime and launch an SCN search from within the browser.
Note: To be 100% correct, the steps above integrate in your browser not just the SCN search but the more general SAP External search, which includes SCN.
Accessing Private Content
Our OpenSearch interface is intended to be used primarily by anonymous clients to access public content and therefore does not require authentication.
However, if you need to access restricted content and have the corresponding credentials, you can do it. A Basic Authentication mechanism is provided, which allows you to preemptively send the user credentials in the HTTP header of the request. The server checks for this header: if credentials are present, it tries to authenticate the user, and the results returned reflect that user’s authorizations. Otherwise, only public content is returned.
The credentials should be sent using the HTTP Authorization header, as a Base64-encoded “userName:password” pair in the form indicated by the example below.
We recommend using this scenario only via HTTPS, so that credentials are sent via a secure connection.
If you want to authenticate using the following user:
username = user123
password = open123
You need to first base64 encode the string: “user123:open123”.
The result is: “dXNlcjEyMzpvcGVuMTIz”.
Finally, you need to include the header indicated below in the HTTP Request:
“Authorization: Basic dXNlcjEyMzpvcGVuMTIz”
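The same steps can be sketched in Python with the standard library. The credentials are the example values from above; the function names are illustrative, and the HTTPS form of the search URL is an assumption based on the recommendation to use a secure connection.

```python
import base64
from urllib.request import Request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for Basic authentication."""
    pair = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")


def build_authenticated_request(url: str, username: str, password: str) -> Request:
    """Create a request that sends the credentials preemptively."""
    request = Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Assumption: the service is reachable under the same path via HTTPS.
request = build_authenticated_request(
    "https://search.sap.com/opensearch?query=hana", "user123", "open123"
)
print(request.get_header("Authorization"))
```

The request object is only built here, not sent; passing it to urllib.request.urlopen would perform the actual call.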