Variable Elimination

class pgmpy.inference.ExactInference.VariableElimination(model)[source]
induced_graph(elimination_order)[source]

Returns the induced graph formed by running Variable Elimination on the network.

Parameters:

elimination_order (list, array like) – List of variables in the order in which they are to be eliminated.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.inference import VariableElimination
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(values)
>>> inference = VariableElimination(model)
>>> inference.induced_graph(['C', 'D', 'A', 'B', 'E'])
induced_width(elimination_order)[source]

Returns the width (integer) of the induced graph formed by running Variable Elimination on the network. The width is the defined as the number of nodes in the largest clique in the graph minus 1.

Parameters:

elimination_order (list, array like) – List of variables in the order in which they are to be eliminated.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.inference import VariableElimination
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(values)
>>> inference = VariableElimination(model)
>>> inference.induced_width(['C', 'D', 'A', 'B', 'E'])
3
map_query(variables=None, evidence=None, virtual_evidence=None, elimination_order='MinFill', show_progress=True)[source]

Computes the MAP Query over the variables given the evidence. Returns the highest probable state in the joint distribution of variables.

Parameters:
  • variables (list) – list of variables over which we want to compute the max-marginal.

  • evidence (dict) – a dict key, value pair as {var: state_of_var_observed} None if no evidence

  • virtual_evidence (list (default:None)) – A list of pgmpy.factors.discrete.TabularCPD representing the virtual evidences.

  • elimination_order (list) – order of variable eliminations (if nothing is provided) order is computed automatically

  • show_progress (boolean) – If True, shows a progress bar.

Examples

>>> from pgmpy.inference import VariableElimination
>>> from pgmpy.models import BayesianNetwork
>>> import numpy as np
>>> import pandas as pd
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(values)
>>> inference = VariableElimination(model)
>>> phi_query = inference.map_query(['A', 'B'])
max_marginal(variables=None, evidence=None, elimination_order='MinFill', show_progress=True)[source]

Computes the max-marginal over the variables given the evidence.

Parameters:
  • variables (list) – list of variables over which we want to compute the max-marginal.

  • evidence (dict) – a dict key, value pair as {var: state_of_var_observed} None if no evidence

  • elimination_order (list) – order of variable eliminations (if nothing is provided) order is computed automatically

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.inference import VariableElimination
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(values)
>>> inference = VariableElimination(model)
>>> phi_query = inference.max_marginal(['A', 'B'])
query(variables, evidence=None, virtual_evidence=None, elimination_order='greedy', joint=True, show_progress=True)[source]
Parameters:
  • variables (list) – list of variables for which you want to compute the probability

  • evidence (dict) – a dict key, value pair as {var: state_of_var_observed} None if no evidence

  • virtual_evidence (list (default:None)) – A list of pgmpy.factors.discrete.TabularCPD representing the virtual evidences.

  • elimination_order (str or list (default='greedy')) – Order in which to eliminate the variables in the algorithm. If list is provided, should contain all variables in the model except the ones in variables. str options are: greedy, WeightedMinFill, MinNeighbors, MinWeight, MinFill. Please refer https://pgmpy.org/exact_infer/ve.html#module-pgmpy.inference.EliminationOrder for details.

  • joint (boolean (default: True)) – If True, returns a Joint Distribution over variables. If False, returns a dict of distributions over each of the variables.

  • show_progress (boolean) – If True, shows a progress bar.

Examples

>>> from pgmpy.inference import VariableElimination
>>> from pgmpy.models import BayesianNetwork
>>> import numpy as np
>>> import pandas as pd
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(values)
>>> inference = VariableElimination(model)
>>> phi_query = inference.query(['A', 'B'])

Elimination Ordering

class pgmpy.inference.EliminationOrder.BaseEliminationOrder(model)[source]

Init method for the base class of Elimination Orders.

Parameters:

model (BayesianNetwork instance) – The model on which we want to compute the elimination orders.

abstract cost(node)[source]

The cost function to compute the cost of elimination of each node. This method is just a dummy and returns 0 for all the nodes. Actual cost functions are implemented in the classes inheriting BaseEliminationOrder.

Parameters:

node (string, any hashable python object.) – The node whose cost is to be computed.

fill_in_edges(node)[source]

Return edges needed to be added to the graph if a node is removed.

Parameters:

node (string (any hashable python object)) – Node to be removed from the graph.

get_elimination_order(nodes=None, show_progress=True)[source]

Returns the optimal elimination order based on the cost function. The node having the least cost is removed first.

Parameters:

nodes (list, tuple, set (array-like)) – The variables which are to be eliminated.

Examples

>>> import numpy as np
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> from pgmpy.inference.EliminationOrder import WeightedMinFill
>>> model = BayesianNetwork([('c', 'd'), ('d', 'g'), ('i', 'g'),
...                          ('i', 's'), ('s', 'j'), ('g', 'l'),
...                        ('l', 'j'), ('j', 'h'), ('g', 'h')])
>>> cpd_c = TabularCPD('c', 2, np.random.rand(2, 1))
>>> cpd_d = TabularCPD('d', 2, np.random.rand(2, 2),
...                   ['c'], [2])
>>> cpd_g = TabularCPD('g', 3, np.random.rand(3, 4),
...                   ['d', 'i'], [2, 2])
>>> cpd_i = TabularCPD('i', 2, np.random.rand(2, 1))
>>> cpd_s = TabularCPD('s', 2, np.random.rand(2, 2),
...                   ['i'], [2])
>>> cpd_j = TabularCPD('j', 2, np.random.rand(2, 4),
...                   ['l', 's'], [2, 2])
>>> cpd_l = TabularCPD('l', 2, np.random.rand(2, 3),
...                   ['g'], [3])
>>> cpd_h = TabularCPD('h', 2, np.random.rand(2, 6),
...                   ['g', 'j'], [3, 2])
>>> model.add_cpds(cpd_c, cpd_d, cpd_g, cpd_i, cpd_s, cpd_j,
...                cpd_l, cpd_h)
>>> WeightedMinFill(model).get_elimination_order(['c', 'd', 'g', 'l', 's'])
['c', 's', 'l', 'd', 'g']
>>> WeightedMinFill(model).get_elimination_order(['c', 'd', 'g', 'l', 's'])
['c', 's', 'l', 'd', 'g']
>>> WeightedMinFill(model).get_elimination_order(['c', 'd', 'g', 'l', 's'])
['c', 's', 'l', 'd', 'g']
class pgmpy.inference.EliminationOrder.MinFill(model)[source]
cost(node)[source]

The cost of eliminating a node is the number of edges that need to be added (fill in edges) to the graph due to its elimination

class pgmpy.inference.EliminationOrder.MinNeighbors(model)[source]
cost(node)[source]

The cost of eliminating a node is the number of neighbors it has in the current graph.

class pgmpy.inference.EliminationOrder.MinWeight(model)[source]
cost(node)[source]

The cost of eliminating a node is the product of weights, domain cardinality, of its neighbors.

class pgmpy.inference.EliminationOrder.WeightedMinFill(model)[source]
cost(node)[source]

Cost function for WeightedMinFill. The cost of eliminating a node is the sum of weights of the edges that need to be added to the graph due to its elimination, where a weight of an edge is the product of the weights, domain cardinality, of its constituent vertices.