I’m pleased to announce that Bayesian Network Builder (BnB) is now open source on GitHub! It is a utility I wrote while implementing Zefiro, the autonomous driver of purchase journeys, and now that it has been separated from its parent project it might be useful for other applications too.
What can you do with that?
BnB belongs to a software paradigm called probabilistic programming. There are several paradigms in software development, such as functional programming and object-oriented programming, and each is intended to solve a particular family of problems. Probabilistic programming is the paradigm that addresses statistical problems. Tools such as BnB give software developers the ability to infer posterior probabilities in their applications without being professional statisticians. The term “posterior probability” may sound daunting, but I hope to make it clear by the end of the article, starting with an example.
You are the software lead of a successful retail chain that sells home improvement and gardening supplies, but also tissues and hygiene products. Even though the majority of customers are men, the low prices on baby care products attract a non-negligible number of women. Let’s assume men outnumber women roughly ten to one, so that about 10% of the store’s customers are women.
You are asked to profile the store’s customers and you want to infer as much information as possible. In particular, you want to estimate the probability that a customer is male or female when he or she buys a package of diapers. You might start by searching marketing analysis platforms, where you find that, worldwide, 80% of baby diaper buyers are women. Given this data, it may sound reasonable to conclude that the customer leaving the store with a pack under one arm is a woman. But trust in intuition is sometimes overrated, and in this case the answer is wrong, because it ignores how few of this store’s customers are women.
Bayes Rule as Reverse Gear
I like to compare Bayes’ rule to driving a car. While we are confident driving forward, in the direction in front of us, we are a little reluctant to drive backward unless it is just for parking or small maneuvers. The same applies to estimating probabilities. The forward probabilities are given by the marketing analysis (80% of diaper buyers and 10% of store customers are women), but we need to take those forward probabilities and move backward. We have evidence, the purchase of diapers, and we want to know the gender of the buyer. While our judgment is fairly reliable in one direction, mapping from cause to effect, it is much less reliable in the opposite direction, from effect to cause. This is where Bayes comes into play.
The Reverend Thomas Bayes is still well known today for his famous rule, the reverse gear we need to get from the prior to the posterior by using the evidence we collect from the environment. With BnB you can easily estimate this value by defining the structure of the Bayesian network and calculating the probabilities given the evidence we have.
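To make the reverse gear concrete, the arithmetic for the diaper example can be worked through in a few lines of plain Python. This is only a sketch of Bayes’ rule itself, not BnB’s API. It assumes that the worldwide 80/20 split among diaper buyers comes from a population with equal numbers of men and women, so that a woman is four times as likely as a man to buy diapers. The names `posterior_woman`, `WEIGHT_WOMAN` and `WEIGHT_MAN` are made up for the illustration.

```python
# Prior from the article: about 10% of this store's customers are women.
PRIOR_WOMAN = 0.10

# Relative likelihood of buying diapers. Assumption: the worldwide 80/20 split
# comes from a population with equal numbers of men and women, so a woman is
# four times as likely as a man to buy diapers. Only the ratio matters here.
WEIGHT_WOMAN = 0.80
WEIGHT_MAN = 0.20


def posterior_woman(prior_woman, weight_woman, weight_man):
    """Bayes' rule: P(woman | diapers) from the prior and the likelihood weights."""
    joint_woman = prior_woman * weight_woman      # woman AND buys diapers
    joint_man = (1.0 - prior_woman) * weight_man  # man AND buys diapers
    return joint_woman / (joint_woman + joint_man)


p = posterior_woman(PRIOR_WOMAN, WEIGHT_WOMAN, WEIGHT_MAN)
print(f"P(woman | diapers) = {p:.3f}")
```

Under these assumptions the posterior is about 31%, so the customer walking out with the diapers is still more likely to be a man: the strong prior (90% male customers) outweighs the evidence.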
One aspect of good software development is composability, and Bayesian networks are composable too: they are graphs of random variables, each with its own probability, connected by conditional probabilities that express the relations among those variables. Posteriors are calculated along the nodes and edges of those graphs by applying the chain rule, so the famous rule is applied recursively throughout the network. A complete Bayesian network setup can be found in the Burglary example.
You are at work when your neighbor John calls to say that your house alarm is ringing. You are worried that there is a burglar at home, and you want to estimate the actual probability that this is the case rather than just a small earthquake.
You should consider that:
- Your neighbor Mary didn’t call you.
- John always calls when he hears the alarm, but sometimes confuses the telephone ringing with the alarm.
- Mary likes rather loud music and sometimes misses the alarm.
- The probability of a burglary at home is small (0.001%), and the probability of an earthquake is just double that.
- The alarm is good but not perfect: there is a chance of both false positives and false negatives.
You can do all the calculations and apply the chain rule yourself, which is always good training in Bayesian inference, or you can use BnB.
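As a rough sketch of the manual route, the posterior can be obtained by enumerating every combination of the hidden variables and multiplying the factors given by the chain rule. The code below is plain Python and does not use BnB’s API. The priors are the ones listed above. The network structure (burglary and earthquake influence the alarm, the alarm influences each neighbor’s call) and all the conditional probabilities are assumptions, taken from the usual textbook version of this puzzle rather than from the article.

```python
from itertools import product

# Conditional probability tables. These are the usual textbook values for
# this puzzle and are assumptions, not figures given in the article.
P_ALARM = {  # P(alarm | burglary, earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_JOHN_CALLS = {True: 0.90, False: 0.05}  # P(John calls | alarm)
P_MARY_CALLS = {True: 0.70, False: 0.01}  # P(Mary calls | alarm)


def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m, p_burglary, p_earthquake):
    """Chain rule over the network: P(b) P(e) P(a | b, e) P(j | a) P(m | a)."""
    return (bernoulli(p_burglary, b)
            * bernoulli(p_earthquake, e)
            * bernoulli(P_ALARM[(b, e)], a)
            * bernoulli(P_JOHN_CALLS[a], j)
            * bernoulli(P_MARY_CALLS[a], m))


def posterior_burglary(p_burglary, p_earthquake, john_calls=True, mary_calls=False):
    """P(burglary | evidence), summing out earthquake and alarm."""
    totals = {True: 0.0, False: 0.0}
    for b, e, a in product([True, False], repeat=3):
        totals[b] += joint(b, e, a, john_calls, mary_calls, p_burglary, p_earthquake)
    return totals[True] / (totals[True] + totals[False])


# Priors as stated in the article: 0.001% for a burglary, double for an earthquake.
p = posterior_burglary(p_burglary=0.00001, p_earthquake=0.00002)
print(f"P(burglary | John calls, Mary does not) = {p:.6f}")
```

With these assumed tables the probability of a burglary stays far below 1%, even after John’s call: his occasional false alarms and Mary’s silence explain the evidence better than a rare break-in does. Enumeration like this grows exponentially with the number of variables, which is one reason to hand larger networks to a dedicated tool.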
If you are interested in learning Bayes’ rule and solving fun puzzles, here you can find a nice resource.
Giancarlo Frison is Technology Strategist at SAP Customer Experience