Product Information
SAP HANA Cloud’s Vector Engine Announcement
I am excited to share that SAP HANA Cloud is planned to include a vector engine as part of the Q1 2024 release.
Click here for the official SAP TechEd announcement.
Intelligent data applications are now the standard for business applications. Users no longer desire applications that require repetition of tasks within a business process. Instead, users need applications that replace the mundane with data-driven expertise for decisions made in the moment.
The addition of a vector engine will create many new intuitive possibilities for customers, partners, and even internal engineering teams. Some of the most popular use cases include …
- Similarity Search: find similar items, documents, or records by comparing their embeddings
- Content Based Filtering: recommend items to users based on their past interactions or preferences
- Information Retrieval: improve the relevance of search results by using text embeddings
- Generative AI: support Retrieval Augmented Generation (RAG) to obtain better results from large language models
SAP HANA Cloud is the best choice when selecting a cloud database to power these new applications. The upcoming vector engine will enable builders to deliver at every layer of the application’s architecture, including the …
- Storage Layer: unify all types of data while eliminating data silos and duplicate copies of data
- Logic Layer: combine similarity search results and business logic within the same database
- User Interface Layer: power natural human-like interaction for a more intuitive experience
Regarding the technical details, vectors will be stored in a table column of type REAL_VECTOR. The first release will also include two vector search functions: cosine similarity and Euclidean distance. Cosine similarity finds the closest vector(s) by comparing the angle between vectors, regardless of their length. Euclidean distance (or l2distance) finds the closest vector(s) by calculating the straight-line distance between them. See below for SQL syntax examples.
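To make the two metrics concrete, here is a minimal Python sketch of what each one computes. It is not SAP HANA SQL; the function names and sample vectors are assumptions chosen purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b (0.0 means identical)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical two-dimensional embeddings.
query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # same angle as the query, but longer
nearby = [1.0, 1.0]          # closer in space, but at a 45 degree angle

for name, vec in [("same_direction", same_direction), ("nearby", nearby)]:
    print(name, cosine_similarity(query, vec), euclidean_distance(query, vec))
```

In this sample the two metrics disagree: cosine similarity ranks same_direction first because the angle is zero, while Euclidean distance ranks nearby first because it is physically closer. Which one fits depends on whether vector length carries meaning in your embeddings.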
There are significant technical advantages when all types of data are utilized, whether physically or virtually, inside a single database. One notable benefit is the increased efficiency of a unifying SELECT statement. See below for a SQL example that uses SAP HANA Cloud’s multi-model engines.
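The idea of combining a business rule with similarity ranking in one step can be sketched outside the database as well. The Python below is only an illustration under stated assumptions: the Product class, the in_stock flag, and the recommend function are hypothetical names, and nothing here reflects how SAP HANA Cloud executes such a query.

```python
import math
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    in_stock: bool    # hypothetical business attribute
    embedding: list   # hypothetical embedding vector


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def recommend(query_embedding, products, limit):
    """Apply the business rule first, then rank the survivors by similarity."""
    candidates = [p for p in products if p.in_stock]
    ranked = sorted(
        candidates,
        key=lambda p: cosine_similarity(query_embedding, p.embedding),
        reverse=True,
    )
    return [p.name for p in ranked[:limit]]


products = [
    Product("kettle", True, [0.9, 0.1]),
    Product("teapot", False, [1.0, 0.0]),  # most similar, but out of stock
    Product("bike", True, [0.0, 1.0]),
]

print(recommend([1.0, 0.0], products, limit=1))
```

The teapot is the closest match by similarity alone, yet the business rule excludes it, so the kettle is returned. Keeping both steps in one place avoids moving candidate rows between systems before filtering.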
When I think about a heterogeneous SELECT statement that combines transactional, graph, spatial, and vector data, I am reminded of a mentor’s wisdom: “make powerful statements and expect great results.” Now is the time to start architecting AI-driven applications that elevate the user’s performance as well as the business process. Begin planning today for SAP HANA Cloud’s new vector engine, and consider implementing these types of use cases in your next project. In 2024, expect more from the database and solve those recurring business application challenges.
Recommended Links
SAP HANA Cloud Vector Announcement
Thank you for the blog post.
Thanks Daniel Dukes for sharing this insightful blog! It introduces us to the new SAP HANA Cloud engine named Vector. As far as I understand it, this engine efficiently handles unstructured data, enabling the creation of recommendations based on historical requests through machine learning support. It also uses an NLP framework, facilitating seamless human interaction.
Thanks for the post! Release of vector capabilities is big news.
Question related to this:
"The first release will also include the two vector search functions: cosine similarity and Euclidean distance"
Will it also initially be possible to use approximate nearest neighbor functions, such as HNSW, to calculate good-enough matches?
While cosine similarity and Euclidean distance are distance metrics, HNSW is an approximate indexing technique that speeds up similarity searches. From that perspective, HNSW is not another metric function but rather an index that could be created on a vector column.
Straight answer to your question: The very first release of the HANA Vector Engine (planned with QRC01/2024 of HANA Cloud) will most likely not include indexes. However, indexing is generally in scope and planned for a subsequent release in 2024.
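To illustrate what a search without an index involves: every stored vector has to be compared against the query, which gives exact results at the cost of a full scan. The Python sketch below shows that brute-force approach. The function name exact_top_k and the sample data are hypothetical, and this is not a description of how the HANA Vector Engine works internally.

```python
import heapq
import math


def exact_top_k(query, vectors, k):
    """Scan every vector and return the k closest as (distance, key) pairs."""
    scored = ((math.dist(query, vec), key) for key, vec in vectors.items())
    return heapq.nsmallest(k, scored)


# Hypothetical stored embeddings keyed by document id.
vectors = {
    "doc_a": [0.0, 0.0],
    "doc_b": [1.0, 1.0],
    "doc_c": [5.0, 5.0],
}

for distance, key in exact_top_k([0.9, 1.2], vectors, k=2):
    print(key, round(distance, 4))
```

The cost grows linearly with the number of stored vectors. An approximate index such as HNSW trades a small loss in accuracy for avoiding most of those comparisons.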
Thanks Mathias for explaining the difference between distance metrics and indexing. Good to hear that indexing will be there later in 2024 if not yet in the first release.
How about vector search in pure ABAP? =)
oisee/zvdb: ABAP Vector Search Library (github.com)
(N.B.: on-premise, right now =)
A blog post on LinkedIn with an explanation of how and why it works.