Improve Conversational Commerce Search with Knowledge Graphs
Conversational commerce apps must answer a wide variety of query types, from very detailed to generic:
Have you lemon juice?
I search for Nestea Peach.
Could you suggest me a pale ale and vanilla ice cream for my party?
Query elements might contain collations (collocations): groups of words whose meaning differs from that of the single words taken alone:
Please, give me a red bull.
With this idea I aim to:
- Return search results that are pertinent to what the user is looking for
- Inform the user when the returned items do not exactly match the search query
- Warn the user when the request is for something that is off-scope
A data structure that might fulfill those requirements could be expressed in this form:
entity clusters:
  - term: vanilla ice cream
    catalog: false
    subterms:
      - term: ice cream
        catalog: true
  - term: party
    catalog: false
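As a minimal sketch, the structure above could be modelled in Python as follows. The class name `EntityCluster` and the use of dataclasses are assumptions made for illustration; the article does not prescribe an implementation.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class EntityCluster:
    """One term detected in the query (hypothetical name, not from the article)."""
    term: str
    catalog: bool  # True when the term is found in the sale catalog
    subterms: List["EntityCluster"] = field(default_factory=list)


# The example query "could you suggest me vanilla ice cream for my party"
clusters = [
    EntityCluster(
        term="vanilla ice cream",
        catalog=False,
        subterms=[EntityCluster(term="ice cream", catalog=True)],
    ),
    EntityCluster(term="party", catalog=False),
]

# asdict() turns the nested dataclasses into plain dicts, which mirrors
# the term / catalog / subterms layout shown in the text.
as_plain_dicts = [asdict(c) for c in clusters]
```

Keeping `subterms` as a nested list of the same type lets a fallback such as ice cream sit directly under the more specific term it was derived from.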
Where might the collations be obtained? Structured representations of human knowledge are becoming available at a faster pace than ever. Open databases such as DBPedia, Wiktionary and WordNet could be broadly described as generic sources of human knowledge. They are aggregated in ConceptNet.io, a semantic network. Collations are linked through a network of relationships such as "is part of", "is capable of", "is a type of" and "is related to", and their meaning could be extracted from the relations filtered from such databases for a specific domain, in our example food.
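The domain filtering described above can be sketched as a walk over relation triples. The triples below are hand-written for illustration and are not real ConceptNet output; `IsA` and `PartOf` follow ConceptNet's relation naming, while the function names and the choice of a breadth-first walk are assumptions.

```python
from collections import deque

# Illustrative (start, relation, end) triples; NOT fetched from ConceptNet.
SAMPLE_EDGES = [
    ("vanilla ice cream", "IsA", "ice cream"),
    ("ice cream", "IsA", "dessert"),
    ("dessert", "IsA", "food"),
    ("pale ale", "IsA", "beer"),
    ("beer", "IsA", "beverage"),
    ("red bull", "IsA", "energy drink"),
    ("energy drink", "IsA", "beverage"),
    ("beverage", "IsA", "food"),
    ("insurance", "IsA", "contract"),
]


def domain_terms(edges, root, relations=("IsA", "PartOf")):
    """Collect every term that reaches `root` through the given relations."""
    children = {}
    for start, rel, end in edges:
        if rel in relations:
            children.setdefault(end, []).append(start)
    seen = set()
    queue = deque([root])
    while queue:  # breadth-first walk from the domain root downwards
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def multiword(terms):
    """Keep only the multi-word terms, i.e. the collation candidates."""
    return {t for t in terms if " " in t}


food_collations = multiword(domain_terms(SAMPLE_EDGES, "food"))
```

With this sample data, insurance never reaches the food root, so it stays outside the domain vocabulary.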
Query search data extraction
Once the relevant collations are acquired, the next step is to describe how queries could be processed in order to extract the key terms we are interested in. I identified some desirable features the solution should provide:
- Query entities selection. When the query contains more than one entity cluster, the conversational agent will be able to detect it and ask the user to choose which entity to search first. For example: give me a red bull and a coke.
- Partial term matching. The user is informed when the exact criteria do not match and a lower-ranked result is provided instead. For example, in give me vanilla ice cream, the specific vanilla ice cream is not available but a generic ice cream is.
- Terms off scope. Warn the user when the requested item is not for sale. For example: I’m looking for an insurance.
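The three behaviours listed above could be sketched as below, working on entity clusters represented as plain dicts. The function names, the `exact` / `partial` / `off_scope` labels and the message wording are all hypothetical.

```python
def classify(cluster):
    """Label one entity cluster as 'exact', 'partial' or 'off_scope'."""
    if cluster["catalog"]:
        return ("exact", cluster["term"])
    for sub in cluster.get("subterms", []):
        if sub["catalog"]:
            # A shorter, more generic term is available instead.
            return ("partial", sub["term"])
    return ("off_scope", cluster["term"])


def plan_reply(clusters):
    """Build the messages the agent would send for a list of clusters."""
    results = [classify(c) for c in clusters]
    in_scope = [term for kind, term in results if kind != "off_scope"]
    messages = []
    if len(in_scope) > 1:
        # Query entities selection: ask which entity to search first.
        options = " or ".join(in_scope)
        messages.append(f"Which one should I search first: {options}?")
    for (kind, term), cluster in zip(results, clusters):
        if kind == "partial":
            messages.append(
                f"No exact match for '{cluster['term']}', showing '{term}' instead."
            )
        elif kind == "off_scope":
            messages.append(f"Sorry, '{term}' is not something we sell.")
    return messages


two_drinks = [
    {"term": "red bull", "catalog": True},
    {"term": "coke", "catalog": True},
]
print(plan_reply(two_drinks))
```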
Term extraction from the product catalog and from the user's text query share the same procedures, described below: lemmatization and n-gram factorization.
Lemmatization
Lemmatization returns the root form of an inflected word. For example, runs and running both point to the same root, run.
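To make the idea concrete, here is a toy dictionary-based lemmatizer. The lookup table is a hand-written assumption covering only a few words; a real system would more likely rely on a library lemmatizer (for example NLTK's `WordNetLemmatizer`), which the article does not specify.

```python
# Tiny hand-written lemma table; purely illustrative, not a real lexicon.
LEMMA_TABLE = {
    "runs": "run",
    "running": "run",
    "ran": "run",
    "lemons": "lemon",
    "juices": "juice",
    "creams": "cream",
}


def lemmatize(token):
    """Return the root form of a token, or the lowercased token if unknown."""
    token = token.lower()
    return LEMMA_TABLE.get(token, token)


def lemmatize_text(text):
    """Lemmatize every whitespace-separated token of a text."""
    return [lemmatize(t) for t in text.split()]
```

Both the catalog text and the user query would pass through the same function, so that inflected forms on either side end up comparable.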
Indexing Sale Catalog
The product's name and description are parsed, tokenized and finally stored in an in-memory set.
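A minimal sketch of that indexing step might look like this. The product field names (`name`, `description`), the regular-expression tokenizer and the decision to also store the full normalized product name are assumptions; lemmatization is left out to keep the example short.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_catalog(products):
    """Build an in-memory set of terms from product names and descriptions."""
    index = set()
    for product in products:
        name_tokens = tokenize(product.get("name", ""))
        index.update(name_tokens)
        # Assumption: also keep the whole name, so multi-word lookups
        # such as "ice cream" can succeed.
        if name_tokens:
            index.add(" ".join(name_tokens))
        index.update(tokenize(product.get("description", "")))
    return index


catalog_index = index_catalog([
    {"name": "Ice Cream", "description": "Creamy frozen dessert"},
    {"name": "Lemon Juice", "description": "Freshly squeezed"},
])
```

A set gives constant-time membership checks on average, which suits the repeated lookups performed for every candidate term of a query.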
N-Grams extraction from search query
An n-gram is a contiguous sequence of n words[2]. In the example above, could you suggest me vanilla ice cream for my party, the collation vanilla ice cream will be exploded as: vanilla ice cream, ice cream, vanilla ice, ice, cream.
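The explosion of a phrase into n-grams can be sketched as follows. The function names and the `max_n` cap are assumptions. Note that a plain enumeration also yields the single word vanilla, which the list above leaves out.

```python
def ngrams(tokens, n):
    """Return all contiguous n-word sequences of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def explode(phrase, max_n=3):
    """List every n-gram of a phrase, from the longest down to single words."""
    tokens = phrase.lower().split()
    result = []
    for n in range(min(max_n, len(tokens)), 0, -1):
        result.extend(ngrams(tokens, n))
    return result


print(explode("vanilla ice cream"))
# ['vanilla ice cream', 'vanilla ice', 'ice cream', 'vanilla', 'ice', 'cream']
```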
The system weights the filtered items by length, longest first. The more consecutive collation terms are detected, the better the search output will be.
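One way to express this weighting, assuming the weight is simply the number of words in a matched term (the article does not give an exact formula), is sketched here. `rank_by_length` and its parameters are hypothetical names.

```python
def rank_by_length(candidates, known_terms):
    """Keep candidates found in known_terms, longest (in words) first."""
    hits = [c for c in candidates if c in known_terms]
    # Weight = number of words in the term; longer collations rank higher.
    return sorted(hits, key=lambda term: len(term.split()), reverse=True)


known = {"vanilla ice cream", "ice cream", "ice", "cream"}
candidates = ["cream", "ice", "ice cream", "vanilla ice cream", "party"]
ranked = rank_by_length(candidates, known)
```

Here the three-word term comes first, then ice cream, then the single words, while party is dropped because it is not among the known terms.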
Thanks for sharing your knowledge Giancarlo Frison!