Knowledge Graphs with Inductive Logic Programming on CML21
If this is the first time you have come across the term Inductive Logic Programming (ILP), don’t worry, you are in good company. I came across this paradigm recently, when I started to investigate viable ways to inject common sense into eCommerce systems. Despite its potential, ILP is one of the most under-appreciated branches of AI research. I believe it is useful in many areas, and I’m confident that you will recognize the advantages it can bring to your solutions.
I will talk about ILP in “Just a bit of Logic: a Socratic way to Win Customer” at CML21 on Tue 11 May, 15:15 (CET).
I will introduce the topic with an example. John is a content editor at ACME GmbH, and he has a brilliant idea that might boost sales. He wants to adapt the eCommerce content to the consumer’s profile (the so-called personalization that you might have heard about from us). John needs to show, on the homepage, the latest item bought by influencers living in the consumer’s neighborhood. He wants this change now, so he can present the results at the next sales meeting at the end of the week. Moreover, John has little familiarity with the exhausting query grid in Backoffice, and he is not very willing to engage with the data science team. Impatience and laziness are not the most desirable working styles, but a new tool has just been released.
John knows what he means by “influencers”, but is he able to explain that to the system? Well, apparently this is not necessary. With a few clicks on the panel, John has already instructed the system on what to do. He specifies a couple of samples and sees the simulation in real time. At the first attempt the results do not look bad. He removes some unwanted results and adds another case he remembers from yesterday, a fancy customer in Munich. The tool creates on the fly the outcome John was looking for, accurate enough to be deployed immediately. John is happy: he did it online and without support.
John was right. His idea significantly increased sales, so the feature will be added to the supported ones and will follow the company’s standards for development and maintenance. The lead developer inspects the generated program (the code is human-readable and understandable) and edits it manually, because there was some harmless nonsense to remove (the graphical no-code editor is on the way, promise!).
This story illustrates some of the advantages ILP brings, such as small datasets, fast online training, and explainability. They are not the only ones; here is a fuller list:
- Explainability. Logic models are inspectable, editable, interpretable, justifiable. Unlike neural-network (non-logic) models, they need no separate interpretation tools.
- Scalability. ILP training is easily parallelizable (map/reduce) with no specific hardware requirements.
- Transferability. Logic programs are highly composable.
- Safety. Logic programs lend themselves to formal software verification; the paradigm grew out of automated theorem proving.
- Generalization. ILP generalizes very well, and the degree of generalization is tunable on request.
- Training Data. Very few positive and negative samples are needed. The F1 score (precision/recall) varies with the quantity and quality of the samples.
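As a reminder of how the F1 score in the last point combines precision and recall, here is a minimal Python sketch. The counts passed in at the end are made up for illustration and do not come from any real experiment.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of predictions that are right
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of true cases that are found
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 8 correct hits, 2 spurious hits, 2 misses.
print(round(f1_score(tp=8, fp=2, fn=2), 3))
```

With these counts both precision and recall are 0.8, so the F1 score is about 0.8 as well.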
Roses come with thorns, and ILP is not flawless. Since ILP is not based on statistical processes, it does not tolerate noise or wrong data well. For such cases, it would be interesting to combine the advantages of neural networks with those of logic programming listed above. I hope to keep you up to date on that soon, but this is just an introduction. Let’s keep things simple and see what ILP is.
The LP part of the acronym stands for Logic Programming, a declarative paradigm for defining composable rules and generating new, desirable information out of a knowledge base. The first letter of ILP stands for Inductive. Induction is the process of generating hypotheses from observable evidence: starting from specific cases, induction generalizes them into rules that can justify those cases. ILP creates models that generalize the training samples. Yes, it is exactly what you think: it is machine learning.
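To make the deductive half concrete, here is a small Python sketch of a single rule deriving new information from a knowledge base. The facts and the names in them are hypothetical, and a real logic programming engine does far more than this set comprehension.

```python
# Hypothetical facts, invented for this illustration: parent(X, Y).
parent = {("ann", "bob"), ("bob", "cid"), ("bob", "dee")}

def grandparents(parent_facts):
    """Apply the rule grandparent(X, Y) :- parent(X, Z), parent(Z, Y)."""
    return {(x, y)
            for (x, z1) in parent_facts
            for (z2, y) in parent_facts
            if z1 == z2}              # the shared variable Z joins the two facts

derived = grandparents(parent)
print(sorted(derived))  # [('ann', 'cid'), ('ann', 'dee')]
```

Deduction goes from the rule to new facts. Induction goes the other way: given facts like the derived ones as samples, it looks for the rule.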
Unlike what you may know about machine learning, here there are no statistics, no gradient descent, no loss function and, indeed, no backpropagation. How can it work, then? There are dozens of algorithms, and their common denominator is a combinatorial search over candidate solutions. If you are worried about exponential complexity and combinatorial explosion, I assure you this is not exactly the case. The algorithm I set up in my research is a guided search, and it is particularly efficient. The system does not explore the entire solution space; it starts from the provided samples and tries the simplest solutions first. Have a look at Answer Set Programming, which is essential for unlocking the capabilities of ILP. But let’s see a concrete example. This is the knowledge base:
I want the system to create two new relations: grandfather and grandmother. I provide examples of correct output and examples of wrong output:
|positive samples||negative samples|
What you get out of ILP is:
grandfather(X, Y) :- parent(X, Z), parent(Z, Y), isA(X, male).
grandmother(X, Y) :- parent(X, Z), parent(Z, Y), isA(X, female).
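The idea of searching for the simplest rule that fits the samples can be illustrated with a deliberately naive generate-and-test learner in Python. This is only a sketch: it is brute force, not the guided search mentioned earlier and not a real ILP system. The family facts, the samples, and the helper names (`holds`, `covers`, `learn`) are all invented for the illustration.

```python
from itertools import combinations, product

# Hypothetical knowledge base (facts invented for this sketch).
PARENT = {("abe", "homer"), ("mona", "homer"), ("homer", "bart"), ("homer", "lisa")}
IS_A = {("abe", "male"), ("mona", "female"), ("homer", "male"),
        ("bart", "male"), ("lisa", "female")}
PEOPLE = sorted({p for pair in PARENT for p in pair})

# Candidate body literals over the variables X, Y, Z of a rule target(X, Y).
LITERALS = (
    [("parent", a, b) for a, b in product("XYZ", repeat=2) if a != b]
    + [("isA", v, g) for v in "XYZ" for g in ("male", "female")]
)

def holds(literal, binding):
    """Check one literal against the facts under a variable binding."""
    name, a, b = literal
    if name == "parent":
        return (binding[a], binding[b]) in PARENT
    return (binding[a], b) in IS_A

def covers(body, x, y):
    """True if the body is satisfied for target(x, y) with some value of Z."""
    return any(all(holds(lit, {"X": x, "Y": y, "Z": z}) for lit in body)
               for z in PEOPLE)

def learn(positives, negatives, max_literals=3):
    """Return the first, shortest body covering every positive and no negative."""
    for size in range(1, max_literals + 1):          # simplest hypotheses first
        for body in combinations(LITERALS, size):
            if (all(covers(body, *p) for p in positives)
                    and not any(covers(body, *n) for n in negatives)):
                return body
    return None

positives = [("abe", "bart"), ("abe", "lisa")]
negatives = [("mona", "bart"), ("homer", "bart"), ("abe", "homer"), ("bart", "abe")]
rule = learn(positives, negatives)
print("grandfather(X, Y) :-", ", ".join(f"{n}({a}, {b})" for n, a, b in rule))
```

On these made-up facts, no body of one or two literals separates the positive from the negative samples, and the first body of three literals that does is the grandfather rule shown above. Adding or removing negative samples changes which hypothesis is found, which is why the quality of the samples matters.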
ILP is not just for querying or classifying data; it can also generate real programs that implement algorithms you would normally code by hand. Consider generating a sorting algorithm. Suppose we have the `sort` relation, with the unsorted input as the first argument and the expected sorted output as the second:
sort([3,1], [1,3]).
sort([5,2,7], [2,5,7]).
Let’s assume the system has already learned some basic predicates for list operations, such as empty, head, tail, partition and append:
empty(X)                         % matches if X is an empty list
head([3,4,5], X)                 % head([3,4,5], 3)
tail([3,4,5], X)                 % tail([3,4,5], [4,5])
partition(3, [5,6,7,8], X, Y)    % partition(3, [5,6,7,8], [5,6], [7,8])
append([1,2,3], [4,5], X)        % append([1,2,3], [4,5], [1,2,3,4,5])
ILP can now generate the quicksort algorithm:
sort(A,B) :- empty(A), empty(B).
sort(A,B) :- head(A,Pivot), partition(Pivot,A,L1,R1), sort(L1,L2), sort(R1,R2), append(L2,R2,B).
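For readers more at home with imperative code, the same composition of predicates can be sketched in Python. Two things here are assumptions of the sketch, not part of the learned program: `partition` is taken to split a list by value around the pivot, and the pivot is removed before partitioning and placed back between the two sorted halves so that the recursion terminates.

```python
def empty(xs):
    return len(xs) == 0

def head(xs):
    return xs[0]

def tail(xs):
    return xs[1:]

def partition(pivot, xs):
    """Assumed semantics: values below the pivot go left, the rest go right."""
    left = [x for x in xs if x < pivot]
    right = [x for x in xs if x >= pivot]
    return left, right

def append(xs, ys):
    return xs + ys

def sort(a):
    # First clause: an empty list sorts to an empty list.
    if empty(a):
        return []
    # Second clause: pick the head as pivot, split, sort both parts, join.
    pivot = head(a)
    # Adjustment for this sketch: partition the tail, then put the pivot back.
    l1, r1 = partition(pivot, tail(a))
    return append(sort(l1), append([pivot], sort(r1)))

print(sort([3, 1]))      # [1, 3]
print(sort([5, 2, 7]))   # [2, 5, 7]
```

The two calls reproduce the `sort` samples given earlier. The point is that the learned program is an ordinary composition of known building blocks, which is what makes it readable and editable.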
If this method is applied to a Knowledge Graph (KG), something very interesting might happen. I like the metaphor of the climber (ILP) who climbs the mountain by exploiting every asperity in the rock (the KG) to reach the peak, the program. On a smooth wall the climber can’t go anywhere, so information is vital, but it is not enough on its own to reach the peak. At the time of writing this post, I see a lot of hype around KGs as a panacea for any kind of problem. In the end, though, KGs are just databases of incomplete, sometimes wrong, and often contradictory information, and they require proper methods for extracting sense out of them. One of those methods is ILP. I will talk about it, and about how to use Knowledge Graphs to solve business cases in eCommerce with Logic Programming.
Giancarlo Frison is Technology Strategist at SAP Customer Experience Labs