Few people know that there is a search engine built right into the core of the SAP HANA platform. Whenever you deal with large amounts of unstructured textual data such as patent documents, incident messages or consumer reviews, SAP HANA provides everything you need for a “Google-like” search experience:
- Plain vanilla keyword search
- Linguistic search (searching for ‘computer’ will also find documents about ‘computers’)
- Error-tolerant search to deal with typos (‘tadabase’ will find ‘database’)
- Semantic search (‘car’ will find ‘automobile’)
- Phrase search (‘"white house"’ will find ‘white’ and ‘house’ in sequence)
- Pattern search (‘poly*’ will find ‘polymer’ and ‘polycarbonate’)
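As a rough sketch, several of these search types map to options of the SQL CONTAINS predicate. The model and column names (car_complaints_model, description) are the ones used in the SQL example later in this post. The fuzzy threshold of 0.8 is an arbitrary illustrative value, and the linguistic variant assumes a full-text index with linguistic analysis enabled on the column. Semantic search (‘car’ finding ‘automobile’) needs additional configuration and is not shown here.

```sql
-- Plain keyword search
SELECT * FROM car_complaints_model
 WHERE CONTAINS(description, 'battery');

-- Linguistic search: also matches other word forms such as 'computers'
-- (assumes linguistic analysis is enabled in the full-text index)
SELECT * FROM car_complaints_model
 WHERE CONTAINS(description, 'computer', LINGUISTIC);

-- Error-tolerant search: the misspelled 'tadabase' can still match 'database'
-- (0.8 is an illustrative similarity threshold, not a recommendation)
SELECT * FROM car_complaints_model
 WHERE CONTAINS(description, 'tadabase', FUZZY(0.8));

-- Phrase search: both words must appear in sequence
SELECT * FROM car_complaints_model
 WHERE CONTAINS(description, '"white house"');

-- Pattern search: matches 'polymer', 'polycarbonate' and so on
SELECT * FROM car_complaints_model
 WHERE CONTAINS(description, 'poly*');
```

Which options are available depends on how the full-text index on the column was created, so treat the statements above as a starting point rather than a complete reference.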
A standalone “engine” is not enough, however. That’s why SAP HANA also includes the Info Access “InA” toolkit for HTML5. The InA toolkit is a set of HTML5 templates and UI controls which you can use to configure a modern, highly interactive UI running in a browser. No code – just configuration.
The example below shows a search UI built with the InA toolkit on car complaints data. The large result area in the UI shows the individual complaints for a search with “battery”. The smaller areas on the left are so-called “search facets” that let you drill down into the results, e.g. filtering the results by MODEL = Explorer.
What really sets SAP HANA apart from other open source or commercial search engines is its modeling capabilities. You can change the way your data is exposed flexibly and instantly. As an example, say you start with a document model similar to the “car complaints” model used in the UI shown above: MAKE, MODEL, COMPONENT, DESCRIPTION and so on. Now you would like to include additional data about the car manufacturer, let’s say the manufacturer’s REVENUE or LOCATION. You simply add the new data to the search model using the graphical modeling tools of SAP HANA Studio and that’s it. No re-indexing of your data is required, and you can adapt the InA UI right away to expose the additional data.
SAP HANA’s search capabilities are fully exposed in SQL. So if you are an application developer accessing SAP HANA through ODBC/JDBC, there is no need to learn a proprietary syntax. Just write
SELECT * FROM car_complaints_model WHERE CONTAINS(description, 'battery');
to retrieve “battery” related complaints.
Last but not least, let me highlight that you can combine the speed of SAP HANA for analytics with full-text search, thereby bridging the gap between structured and unstructured data. The car complaints data used in the above UI also includes numeric data, MILEAGE for example. With SAP HANA you can calculate the average mileage per car make for “battery”-related complaints in plain SQL.
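A minimal sketch of such a query, assuming the model exposes MAKE and MILEAGE as columns alongside DESCRIPTION (the alias avg_mileage is just an illustrative name):

```sql
-- Average mileage per car make, restricted to complaints mentioning 'battery'
SELECT make,
       AVG(mileage) AS avg_mileage
  FROM car_complaints_model
 WHERE CONTAINS(description, 'battery')
 GROUP BY make
 ORDER BY avg_mileage DESC;
```

The full-text predicate sits in the WHERE clause like any other filter, so it combines freely with GROUP BY and aggregate functions.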