How to Train a Bayesian Network in Python to Predict Uncertain Situations
Table of Contents
Introduction
Background / Interest
Problem Statement
The Python code to train a Bayesian Network according to the above problem
Conclusion
Introduction
A Bayesian network (also known as a Bayes network, belief network, or decision network) is a probabilistic graphical model: a graph data structure in which each node represents a random variable and the edges of a directed acyclic graph (DAG) represent the conditional dependencies between those variables.
Bayesian networks help us represent Bayesian thinking. They are useful in data science when the data available for modeling is moderate in size, incomplete, and/or uncertain. We can also use expert judgment to build or refine the network, and the finished network lets us ‘simulate’ different scenarios.
Inference is the act of passing from one proposition, statement, or judgment considered as true to another whose truth is believed to follow from that of the former.
To answer queries with a Bayesian network, we need to compute joint probabilities over its variables. Multiple libraries exist in Python to ease the process of probabilistic inference. We will use the library pomegranate to see how the data can be represented in code.
We will design a graph data structure in which each node is a random variable, provide a probability distribution for each node, and in this way train the Bayesian network in Python.
Background / Interest
This report is part of the “CSE 418: Artificial Intelligence Lab” course at City University, Dhaka, Bangladesh, conducted by Nuruzzaman Faruqui.
This is the best AI course in Bangladesh.
In this course, we learned AI from scratch, starting from basic Python and ending with natural language processing. We learned the theoretical concepts and essential mathematics in the “CSE 417: Artificial Intelligence” course, then implemented that knowledge in the lab. We completed many lab sessions to master the course, and gradually we learned each necessary concept of artificial intelligence.
Now we can build our own machine learning models, and we can also build a neural network to solve a complex problem.
Problem Statement
Let’s consider an example of a Bayesian network that involves variables that affect whether we get to our appointment on time.
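The network has four nodes, connected by the following directed edges (the same edges that are added in the code later in this post):
Rain → Maintenance
Rain → Train
Maintenance → Train
Train → Appointment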
Let’s describe this Bayesian network from the top down:
Rain is the root node in this network. This means that its probability distribution is not reliant on any prior event. In our example, Rain is a random variable that can take the values {none, light, heavy} with the following probability distribution:
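none    light    heavy
0.7     0.2      0.1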
Maintenance, in our example, encodes whether there is train track maintenance, taking the values {yes, no}. Rain is a parent node of Maintenance, which means that the probability distribution of Maintenance is affected by Rain.
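Its conditional probability table, given Rain, is:
Rain     yes    no
none     0.4    0.6
light    0.2    0.8
heavy    0.1    0.9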
The train is the variable that encodes whether the train is on time or delayed, taking the values {on time, delayed}. Note that Train has arrows pointing to it from both Maintenance and Rain. This means that both are parents of Train, and their values affect the probability distribution of Train.
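Its conditional probability table, given Rain and Maintenance, is:
Rain     Maintenance    on time    delayed
none     yes            0.8        0.2
none     no             0.9        0.1
light    yes            0.6        0.4
light    no             0.7        0.3
heavy    yes            0.4        0.6
heavy    no             0.5        0.5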
The appointment is a random variable that represents whether we attend our appointment, taking the values {attend, miss}. Note that its only parent is Train. This point about the Bayesian network is noteworthy: parents include only direct relations. It is true that maintenance affects whether the train is on time and whether the train is on time affects whether we attend the appointment. However, in the end, what directly affects our chances of attending the appointment is whether the train came on time, and this is what is represented in the Bayesian network. For example, if the train came on time, it could be heavy rain and track maintenance, but that has no effect on whether we made it to our appointment.
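Its conditional probability table, given Train, is:
Train      attend    miss
on time    0.9       0.1
delayed    0.6       0.4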
For example, suppose we want the joint probability that there is light rain, no track maintenance, the train is delayed, and we miss the appointment, written P(light, no, delayed, miss). We compute: P(light) P(no | light) P(delayed | light, no) P(miss | delayed).
The value of each of the individual probabilities can be found in the probability distributions above, and then these values are multiplied to produce P(light, no, delayed, miss).
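Plugging in the numbers from the tables above: P(light, no, delayed, miss) = 0.2 × 0.8 × 0.3 × 0.4 = 0.0192.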
The Python code to train a Bayesian Network according to the above problem
'''
pomegranate is a Python package that implements fast, efficient, and extremely flexible probabilistic models,
ranging from probability distributions to Bayesian networks to mixtures of hidden Markov models.
Note: this example uses the classic pomegranate API (versions before 1.0); pomegranate 1.0+ changed the API.
'''
from pomegranate import *
from pomegranate.base import Node

# Rain has no parent node, so it uses an unconditional DiscreteDistribution
rain = Node(DiscreteDistribution(
    {
        'none': 0.7,
        'light': 0.2,
        'heavy': 0.1
    }
), name='rain')

# Maintenance node is conditional on rain
maintenance = Node(ConditionalProbabilityTable(
    [
        ['none', 'yes', 0.4],
        ['none', 'no', 0.6],
        ['light', 'yes', 0.2],
        ['light', 'no', 0.8],
        ['heavy', 'yes', 0.1],
        ['heavy', 'no', 0.9],
    ], [rain.distribution]), name='maintenance')

# Train node is conditional on rain and maintenance
train = Node(ConditionalProbabilityTable(
    [
        ['none', 'yes', 'ontime', 0.8],
        ['none', 'yes', 'delayed', 0.2],
        ['none', 'no', 'ontime', 0.9],
        ['none', 'no', 'delayed', 0.1],
        ['light', 'yes', 'ontime', 0.6],
        ['light', 'yes', 'delayed', 0.4],
        ['light', 'no', 'ontime', 0.7],
        ['light', 'no', 'delayed', 0.3],
        ['heavy', 'yes', 'ontime', 0.4],
        ['heavy', 'yes', 'delayed', 0.6],
        ['heavy', 'no', 'ontime', 0.5],
        ['heavy', 'no', 'delayed', 0.5],
    ], [rain.distribution, maintenance.distribution]),
    name='train')

# Appointment node is conditional on train
appointment = Node(ConditionalProbabilityTable(
    [
        ['ontime', 'attend', 0.9],
        ['ontime', 'miss', 0.1],
        ['delayed', 'attend', 0.6],
        ['delayed', 'miss', 0.4],
    ], [train.distribution]), name='appointment')

# Create a Bayesian network and add the states (nodes)
model = BayesianNetwork()
model.add_states(rain, maintenance, train, appointment)

# Add edges connecting the nodes
model.add_edge(rain, maintenance)
model.add_edge(rain, train)
model.add_edge(maintenance, train)
model.add_edge(train, appointment)

# Finalize the model
model.bake()
Our Bayesian network model has been trained. Now we can use the model to compute the probability of any situation described in the Problem Statement section.
Let’s find the probability of Rain = light, Maintenance = no, Train = delayed, Appointment = miss.
Save the Bayesian network code above as model.py (you are free to rename it however you want), then write the following code and run it.
from model import model

# Calculate the joint probability for the given observation
# (values are listed in the order the states were added: rain, maintenance, train, appointment)
probability = model.probability(
    [
        ['light', 'no', 'delayed', 'miss']
    ]
)
print(probability)
The result will be:
0.0192
This is exactly the value we computed by hand in the Problem Statement section.
We can predict more interesting things with our network. Suppose we observe that the train is delayed. Given this evidence, we can compute and print the probability distributions of the variables Rain, Maintenance, and Appointment.
Let’s Code:
from model import model

# Calculate predictions based on the evidence that the train was delayed
predictions = model.predict_proba(
    {
        "train": "delayed"
    }
)

# Print predictions for each node
for node, prediction in zip(model.states, predictions):
    if isinstance(prediction, str):
        # Observed (evidence) variables come back as plain strings
        print(f"{node.name}: {prediction}")
    else:
        # Unobserved variables come back as discrete distributions
        print(f"{node.name}")
        for value, probability in prediction.parameters[0].items():
            print(f"    {value}: {probability:.4f}")
The result is the probability distribution of each unobserved variable, given the evidence that the train is delayed.
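It is worth checking the output by hand with inference by enumeration: sum the joint probability over every combination of Rain and Maintenance that is consistent with Train = delayed, then normalize. Doing so gives the exact posteriors below (pomegranate's predict_proba performs its own inference internally, so its printed numbers may differ slightly from these hand-computed values):
train: delayed
rain
    none: 0.4601
    light: 0.3005
    heavy: 0.2394
maintenance
    yes: 0.3662
    no: 0.6338
appointment
    attend: 0.6000
    miss: 0.4000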
Conclusion
The computation above is essentially inference by enumeration. However, this way of computing probability is inefficient, especially when there are many variables in the model. A different way to go about this is to abandon exact inference in favor of approximate inference. Doing this, we lose some precision in the generated probabilities, but often this imprecision is negligible, and in exchange we gain a scalable method of calculating probabilities, for example by sampling from the network.
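As an illustration of approximate inference (this sketch is not part of the course code), below is a minimal rejection-sampling example written in plain Python, reusing the same probability tables as model.py. The helper names sample_from and generate_sample are just illustrative choices; the script estimates P(Appointment | Train = delayed) by drawing random samples and keeping only those in which the train is delayed.
import random
from collections import Counter

def sample_from(dist):
    # Draw one value from a {value: probability} dictionary
    values = list(dist.keys())
    weights = list(dist.values())
    return random.choices(values, weights=weights)[0]

def generate_sample():
    # Sample every variable in topological order (parents before children),
    # using the same probability tables as model.py
    rain = sample_from({'none': 0.7, 'light': 0.2, 'heavy': 0.1})
    maintenance = sample_from({
        'none': {'yes': 0.4, 'no': 0.6},
        'light': {'yes': 0.2, 'no': 0.8},
        'heavy': {'yes': 0.1, 'no': 0.9},
    }[rain])
    train = sample_from({
        ('none', 'yes'): {'ontime': 0.8, 'delayed': 0.2},
        ('none', 'no'): {'ontime': 0.9, 'delayed': 0.1},
        ('light', 'yes'): {'ontime': 0.6, 'delayed': 0.4},
        ('light', 'no'): {'ontime': 0.7, 'delayed': 0.3},
        ('heavy', 'yes'): {'ontime': 0.4, 'delayed': 0.6},
        ('heavy', 'no'): {'ontime': 0.5, 'delayed': 0.5},
    }[(rain, maintenance)])
    appointment = sample_from({
        'ontime': {'attend': 0.9, 'miss': 0.1},
        'delayed': {'attend': 0.6, 'miss': 0.4},
    }[train])
    return {'rain': rain, 'maintenance': maintenance,
            'train': train, 'appointment': appointment}

# Rejection sampling: keep only samples consistent with the evidence
# (train = delayed) and count the outcomes of appointment
counts = Counter()
for _ in range(10000):
    s = generate_sample()
    if s['train'] == 'delayed':
        counts[s['appointment']] += 1

total = sum(counts.values())
for value, count in counts.items():
    print(f"{value}: {count / total:.4f}")  # roughly attend 0.60, miss 0.40
With 10,000 samples the estimate lands close to the exact values attend ≈ 0.60 and miss ≈ 0.40, and the same idea scales to networks where exact enumeration would be far too slow.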
You can also develop your own model to predict any uncertain situation (e.g., future sales, business growth, crime detection, etc.) by adapting the code given above. This is a simple implementation from the “CSE 418: Artificial Intelligence Lab” course at City University, Dhaka, Bangladesh, conducted by Nuruzzaman Faruqui Sir, which is the best AI course in Bangladesh.
You are free to copy the code from here and adapt it for your own Bayesian network model.