# Inference in Discrete Bayesian Network¶

In this notebook, we show a simple example for doing Exact inference in Bayesian Networks using pgmpy. We will be using the Asia network (http://www.bnlearn.com/bnrepository/#asia) for this example.

## Step 1: Define the model.¶

```[1]:
```
```# Fetch the asia model from the bnlearn repository

from pgmpy.utils import get_example_model

asia_model = get_example_model("asia")
```
```[2]:
```
```print("Nodes: ", asia_model.nodes())
print("Edges: ", asia_model.edges())
asia_model.get_cpds()
```
```Nodes:  ['asia', 'tub', 'smoke', 'lung', 'bronc', 'either', 'xray', 'dysp']
Edges:  [('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'), ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp')]
```
```[2]:
```
```[<TabularCPD representing P(asia:2) at 0x7f08a40e6a90>,
<TabularCPD representing P(bronc:2 | smoke:2) at 0x7f08a40e6dc0>,
<TabularCPD representing P(dysp:2 | bronc:2, either:2) at 0x7f08a40fa730>,
<TabularCPD representing P(either:2 | lung:2, tub:2) at 0x7f08a40fa100>,
<TabularCPD representing P(lung:2 | smoke:2) at 0x7f08a40fa790>,
<TabularCPD representing P(smoke:2) at 0x7f08a40fa5e0>,
<TabularCPD representing P(tub:2 | asia:2) at 0x7f08a40fac40>,
<TabularCPD representing P(xray:2 | either:2) at 0x7f08a40fab80>]
```

If you would like to create a model from scratch, please refer to the Creating Bayesian Networks notebook: https://github.com/pgmpy/pgmpy/blob/dev/examples/Creating%20a%20Bayesian%20Network.ipynb

## Step 2: Initialize the inference class¶

Currently, pgmpy support two algorithms for inference: 1. Variable Elimination and, 2. Belief Propagation. Both of these are exact inferece algorithms. The following example uses `VariableElimination` but `BeliefPropagation` has an identifcal API, so all the methods show below would also work for `BeliefPropagation`.

```[3]:
```
```# Initializing the VariableElimination class

from pgmpy.inference import VariableElimination

asia_infer = VariableElimination(asia_model)
```

## Step 3: Doing Inference using hard evidence¶

```[4]:
```
```# Computing the probability of bronc given smoke=no.
q = asia_infer.query(variables=["bronc"], evidence={"smoke": "no"})
print(q)

# Computing the joint probability of bronc and asia given smoke=yes
q = asia_infer.query(variables=["bronc", "asia"], evidence={"smoke": "yes"})
print(q)

# Computing the probabilities (not joint) of bronc and asia given smoke=no
q = asia_infer.query(variables=["bronc", "asia"], evidence={"smoke": "no"}, joint=False)
for factor in q.values():
print(factor)
```
```Finding Elimination Order: : : 0it [00:00, ?it/s]
0it [00:00, ?it/s]

0it [00:00, ?it/s]
Finding Elimination Order: : : 0it [00:00, ?it/s]

0it [00:00, ?it/s]

0it [00:00, ?it/s]

Finding Elimination Order: : : 0it [00:00, ?it/s]

0it [00:00, ?it/s]
```
```+------------+--------------+
| bronc      |   phi(bronc) |
+============+==============+
| bronc(yes) |       0.3000 |
+------------+--------------+
| bronc(no)  |       0.7000 |
+------------+--------------+
+------------+-----------+-------------------+
| bronc      | asia      |   phi(bronc,asia) |
+============+===========+===================+
| bronc(yes) | asia(yes) |            0.0060 |
+------------+-----------+-------------------+
| bronc(yes) | asia(no)  |            0.5940 |
+------------+-----------+-------------------+
| bronc(no)  | asia(yes) |            0.0040 |
+------------+-----------+-------------------+
| bronc(no)  | asia(no)  |            0.3960 |
+------------+-----------+-------------------+
+------------+--------------+
| bronc      |   phi(bronc) |
+============+==============+
| bronc(yes) |       0.3000 |
+------------+--------------+
| bronc(no)  |       0.7000 |
+------------+--------------+
+-----------+-------------+
| asia      |   phi(asia) |
+===========+=============+
| asia(yes) |      0.0100 |
+-----------+-------------+
| asia(no)  |      0.9900 |
+-----------+-------------+
```
```
```
```[5]:
```
```# Computing the MAP of bronc given smoke=no.
q = asia_infer.map_query(variables=["bronc"], evidence={"smoke": "no"})
print(q)

# Computing the MAP of bronc and asia given smoke=yes
q = asia_infer.map_query(variables=["bronc", "asia"], evidence={"smoke": "yes"})
print(q)
```
```

0it [00:00, ?it/s]

Finding Elimination Order: : : 0it [00:00, ?it/s]

0it [00:00, ?it/s]

0it [00:00, ?it/s]

Finding Elimination Order: : : 0it [00:00, ?it/s]

0it [00:00, ?it/s]
```
```{'bronc': 'no'}
{'bronc': 'yes', 'asia': 'no'}
```
```
```

## Step 5: Inference using virtual evidence¶

```[6]:
```
```lung_virt_evidence = TabularCPD(variable="lung", variable_card=2, values=[[0.4], [0.6]])

# Query with hard evidence smoke = no and virtual evidence lung = [0.4, 0.6]
q = asia_infer.query(
variables=["bronc"], evidence={"smoke": "no"}, virtual_evidence=[lung_virt_evidence]
)
print(q)

# Query with hard evidence smoke = no and virtual evidences lung = [0.4, 0.6] and bronc = [0.3, 0.7]
lung_virt_evidence = TabularCPD(variable="lung", variable_card=2, values=[[0.4], [0.7]])
```
```Finding Elimination Order: : : 0it [00:04, ?it/s]
Finding Elimination Order: : : 0it [00:04, ?it/s]
Finding Elimination Order: : : 0it [00:02, ?it/s]
Finding Elimination Order: : : 0it [00:02, ?it/s]

0it [00:00, ?it/s]
Finding Elimination Order: : : 0it [00:00, ?it/s]

0it [00:00, ?it/s]
```
```+------------+--------------+
| bronc      |   phi(bronc) |
+============+==============+
| bronc(yes) |       0.3000 |
+------------+--------------+
| bronc(no)  |       0.7000 |
+------------+--------------+
```
```
```
```[8]:
```
```print(asia_model.get_cpds("lung"))
```
```+-----------+------------+-----------+
| smoke     | smoke(yes) | smoke(no) |
+-----------+------------+-----------+
| lung(yes) | 0.1        | 0.01      |
+-----------+------------+-----------+
| lung(no)  | 0.9        | 0.99      |
+-----------+------------+-----------+
```

## Step 4: Troubleshooting for slow inference¶

In the case of large models, or models in which variables have a lot of states, inference can be quite slow. Some of the ways to deal with it are:

1. Reduce the number of states of variables by combining states together.

2. Try a different elimination order by specifying `elimination_order` argument. Possible options are: MinFill, MinNeighbors, MinWeight, WeightedMinFill.

3. Try a custom elimination order: The implemented heuristics for computing the elimination order might not be efficient in every case. If you can think of a more efficient order, you can also pass it as a list to the `elimination_order` argument.

4. If it is still too slow, try using approximate inference using sampling algorithms.