Discrete

TabularCPD

Contains the different formats of CPDs used in PGM

class pgmpy.factors.discrete.CPD.TabularCPD(variable, variable_card, values, evidence=None, evidence_card=None, state_names={})[source]

Defines the conditional probability distribution table (CPD table)

Parameters:
  • variable (int, string (any hashable python object)) – The variable whose CPD is defined.

  • variable_card (integer) – Cardinality/no. of states of variable

  • values (2D array, 2D list or 2D tuple) – Values for the CPD table. Refer to the example below for the exact format needed.

  • evidence (array-like) – List of evidence (parent) variables, if any, on which the CPD is conditioned.

  • evidence_card (array-like) – Cardinality/no. of states of each variable in `evidence` (if any), given in the same order.

  • state_names (dict (default: dict())) – A dictionary of the form {variable: list of states} specifying the names of possible states for each variable (variable + evidence) in the TabularCPD. The order in which the states are specified should match the order in the values array. If state_names is not specified, auto-assigns state names starting from 0.

Examples

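For a distribution of P(grade|diff, intel)

+--------+------------------------+------------------------+
| diff   | easy                   | hard                   |
+--------+------+--------+--------+------+--------+--------+
| intel  | low  | medium | high   | low  | medium | high   |
+--------+------+--------+--------+------+--------+--------+
| gradeA | 0.1  | 0.1    | 0.1    | 0.1  | 0.1    | 0.1    |
+--------+------+--------+--------+------+--------+--------+
| gradeB | 0.1  | 0.1    | 0.1    | 0.1  | 0.1    | 0.1    |
+--------+------+--------+--------+------+--------+--------+
| gradeC | 0.8  | 0.8    | 0.8    | 0.8  | 0.8    | 0.8    |
+--------+------+--------+--------+------+--------+--------+

the values array should be

[[0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
 [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
 [0.8, 0.8, 0.8, 0.8, 0.8, 0.8]]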

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD(variable='grade',
...                  variable_card=3,
...                  values=[[0.1,0.1,0.1,0.1,0.1,0.1],
...                          [0.1,0.1,0.1,0.1,0.1,0.1],
...                          [0.8,0.8,0.8,0.8,0.8,0.8]],
...                  evidence=['diff', 'intel'],
...                  evidence_card=[2, 3],
...                  state_names={'diff': ['easy', 'hard'],
...                               'intel': ['low', 'mid', 'high'],
...                               'grade': ['A', 'B', 'C']})
>>> print(cpd)
+---------+----------+----------+-----------+----------+----------+-----------+
| diff    |diff(easy)|diff(easy)|diff(easy) |diff(hard)|diff(hard)|diff(hard) |
+---------+----------+----------+-----------+----------+----------+-----------+
| intel   |intel(low)|intel(mid)|intel(high)|intel(low)|intel(mid)|intel(high)|
+---------+----------+----------+-----------+----------+----------+-----------+
| grade(A)| 0.1      | 0.1      | 0.1       | 0.1      | 0.1      | 0.1       |
+---------+----------+----------+-----------+----------+----------+-----------+
| grade(B)| 0.1      | 0.1      | 0.1       | 0.1      | 0.1      | 0.1       |
+---------+----------+----------+-----------+----------+----------+-----------+
| grade(C)| 0.8      | 0.8      | 0.8       | 0.8      | 0.8      | 0.8       |
+---------+----------+----------+-----------+----------+----------+-----------+
>>> cpd.values
array([[[ 0.1,  0.1,  0.1],
        [ 0.1,  0.1,  0.1]],
       [[ 0.1,  0.1,  0.1],
        [ 0.1,  0.1,  0.1]],
       [[ 0.8,  0.8,  0.8],
        [ 0.8,  0.8,  0.8]]])
>>> cpd.variables
['grade', 'diff', 'intel']
>>> cpd.cardinality
array([3, 2, 3])
>>> cpd.variable
'grade'
>>> cpd.variable_card
3
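
The printed table above suggests how the columns of values are ordered: the evidence variables are taken in the order given, and the last one (intel here) cycles through its states fastest. The following plain-Python sketch computes the column for one combination of evidence states. The helper name column_index is hypothetical and is not part of pgmpy.

```python
def column_index(evidence_states, evidence_card):
    """Return the column of a 2D CPD table for one combination of evidence states.

    Hypothetical helper: treats the evidence states as digits of a
    mixed-radix number in which the last evidence variable varies fastest.
    """
    index = 0
    for state, card in zip(evidence_states, evidence_card):
        if not 0 <= state < card:
            raise ValueError(f"state {state} out of range for cardinality {card}")
        index = index * card + state
    return index


# evidence=['diff', 'intel'], evidence_card=[2, 3]
# (diff=hard -> 1, intel=low -> 0) is the fourth column of the table.
print(column_index([1, 0], [2, 3]))  # 3
```
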
copy()[source]

Returns a copy of the TabularCPD object.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 2,
...                  [[0.7, 0.6, 0.6, 0.2],[0.3, 0.4, 0.4, 0.8]],
...                  ['intel', 'diff'], [2, 2])
>>> copy = cpd.copy()
>>> copy.variable
'grade'
>>> copy.variable_card
2
>>> copy.evidence
['intel', 'diff']
>>> copy.values
array([[[ 0.7,  0.6],
        [ 0.6,  0.2]],
       [[ 0.3,  0.4],
        [ 0.4,  0.8]]])
get_evidence()[source]

Returns the evidence variables of the CPD.

static get_random(variable, evidence=None, cardinality=None, state_names={}, seed=None)[source]

Generates a TabularCPD instance on variable with random values, using evidence as its parents/evidence and the number of states given in cardinality.

Parameters:
  • variable (str, int or any hashable python object.) – The variable on which to define the TabularCPD.

  • evidence (list, array-like) – A list of variable names which are the parents/evidence of variable.

  • cardinality (dict (default: None)) – A dict of the form {var_name: card} specifying the number of states/ cardinality of each of the variables. If None, assigns each variable 2 states.

  • state_names (dict (default: {})) – A dict of the form {var_name: list of states} to specify the state names for the variables in the CPD. If state_names=None, integral state names starting from 0 are assigned.

Returns:

Random CPD – A TabularCPD object on variable with evidence as evidence with random values.

Return type:

pgmpy.factors.discrete.TabularCPD

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> TabularCPD.get_random(variable='A', evidence=['C', 'B'],
...                       cardinality={'A': 3, 'B': 2, 'C': 4})
<TabularCPD representing P(A:3 | C:4, B:2) at 0x7f95e22b8040>
>>> TabularCPD.get_random(variable='A', evidence=['C', 'B'],
...                       cardinality={'A': 2, 'B': 2, 'C': 2},
...                       state_names={'A': ['a1', 'a2'],
...                                    'B': ['b1', 'b2'],
...                                    'C': ['c1', 'c2']})
get_values()[source]

Returns the values of the CPD as a 2-D array. The order of the parents is the same as provided in evidence.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 3, [[0.1, 0.1],
...                               [0.1, 0.1],
...                               [0.8, 0.8]],
...                  evidence=['evi1'], evidence_card=[2])
>>> cpd.get_values()
array([[ 0.1,  0.1],
       [ 0.1,  0.1],
       [ 0.8,  0.8]])
marginalize(variables, inplace=True)[source]

Modifies the CPD table with marginalized values. Marginalization refers to summing out variables, hence that variable would no longer appear in the CPD.

Parameters:
  • variables (list, array-like) – List of variables to be marginalized.

  • inplace (boolean) – If inplace=True it will modify the CPD itself, else would return a new CPD

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd_table = TabularCPD('grade', 2,
...                        [[0.7, 0.6, 0.6, 0.2],[0.3, 0.4, 0.4, 0.8]],
...                        ['intel', 'diff'], [2, 2])
>>> cpd_table.marginalize(['diff'])
>>> cpd_table.get_values()
array([[ 0.65,  0.4 ],
       [ 0.35,  0.6 ]])
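
The output above is consistent with summing the columns that differ only in the marginalized variable and then rescaling each remaining column to sum to 1. The following is a plain-Python sketch of that reading, not pgmpy's implementation. The helper name marginalize_evidence is hypothetical.

```python
from itertools import product


def marginalize_evidence(values, evidence, evidence_card, to_remove):
    """Sum out one evidence variable from a 2D CPD table, then renormalize columns.

    Hypothetical helper. Columns are ordered with the last evidence
    variable varying fastest.
    """
    keep = [i for i, name in enumerate(evidence) if name != to_remove]
    kept_card = [evidence_card[i] for i in keep]
    kept_columns = list(product(*(range(c) for c in kept_card)))
    column_of = {states: j for j, states in enumerate(kept_columns)}

    summed = [[0.0] * len(kept_columns) for _ in values]
    all_columns = product(*(range(c) for c in evidence_card))
    for old_col, states in enumerate(all_columns):
        new_col = column_of[tuple(states[i] for i in keep)]
        for row, row_values in enumerate(values):
            summed[row][new_col] += row_values[old_col]

    # Renormalize so each column is again a distribution over the variable.
    totals = [sum(row[j] for row in summed) for j in range(len(kept_columns))]
    return [[row[j] / totals[j] for j in range(len(kept_columns))] for row in summed]


table = [[0.7, 0.6, 0.6, 0.2], [0.3, 0.4, 0.4, 0.8]]
result = marginalize_evidence(table, ["intel", "diff"], [2, 2], "diff")
print([[round(v, 2) for v in row] for row in result])  # [[0.65, 0.4], [0.35, 0.6]]
```
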
normalize(inplace=True)[source]

Normalizes the cpd table. The method modifies each column of values such that it sums to 1 without changing the proportion between states.

Parameters:

inplace (boolean) – If inplace=True it will modify the CPD itself, else would return a new CPD

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd_table = TabularCPD('grade', 2,
...                        [[0.7, 0.2, 0.6, 0.2],[0.4, 0.4, 0.4, 0.8]],
...                        ['intel', 'diff'], [2, 2])
>>> cpd_table.normalize()
>>> cpd_table.get_values()
array([[ 0.63636364,  0.33333333,  0.6       ,  0.2       ],
       [ 0.36363636,  0.66666667,  0.4       ,  0.8       ]])
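
Column-wise normalization as described above can be sketched in plain Python as follows. The helper name normalize_columns is hypothetical, and the code only illustrates the arithmetic, not pgmpy's implementation.

```python
def normalize_columns(values):
    """Scale each column of a 2D table so that it sums to 1 (hypothetical helper)."""
    n_cols = len(values[0])
    totals = [sum(row[j] for row in values) for j in range(n_cols)]
    if any(total == 0 for total in totals):
        raise ValueError("cannot normalize a column that sums to 0")
    return [[row[j] / totals[j] for j in range(n_cols)] for row in values]


table = [[0.7, 0.2, 0.6, 0.2], [0.4, 0.4, 0.4, 0.8]]
for row in normalize_columns(table):
    print([round(v, 4) for v in row])
# [0.6364, 0.3333, 0.6, 0.2]
# [0.3636, 0.6667, 0.4, 0.8]
```
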
reduce(values, inplace=True, show_warnings=True)[source]

Reduces the CPD table to the context of the given variable values by fixing each given variable to the specified state. The reduced variables will no longer appear in the CPD.

Parameters:
  • values (list, array-like) – A list of tuples of the form (variable_name, variable_state).

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd_table = TabularCPD('grade', 2,
...                        [[0.7, 0.6, 0.6, 0.2],[0.3, 0.4, 0.4, 0.8]],
...                        ['intel', 'diff'], [2, 2])
>>> cpd_table.reduce([('diff', 0)])
>>> cpd_table.get_values()
array([[ 0.7,  0.6],
       [ 0.3,  0.4]])
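
For an evidence variable, reducing amounts to keeping only the columns whose evidence states agree with the fixed values. A minimal plain-Python sketch, assuming columns are ordered with the last evidence variable varying fastest. The helper name reduce_evidence is hypothetical.

```python
from itertools import product


def reduce_evidence(values, evidence, evidence_card, fixed):
    """Keep only the columns of a 2D CPD table that agree with `fixed`.

    Hypothetical helper. `fixed` maps evidence names to state indices.
    """
    keep_cols = [
        col
        for col, states in enumerate(product(*(range(c) for c in evidence_card)))
        if all(states[evidence.index(name)] == s for name, s in fixed.items())
    ]
    return [[row[c] for c in keep_cols] for row in values]


table = [[0.7, 0.6, 0.6, 0.2], [0.3, 0.4, 0.4, 0.8]]
print(reduce_evidence(table, ["intel", "diff"], [2, 2], {"diff": 0}))
# [[0.7, 0.6], [0.3, 0.4]]
```
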
reorder_parents(new_order, inplace=True)[source]

Returns a new cpd table according to provided parent/evidence order.

Parameters:
  • new_order (list) – The new order of the parent/evidence variables.

  • inplace (boolean) – If inplace=True it will modify the CPD itself, otherwise the reordered values will be returned without affecting the original CPD.

Examples

Consider a CPD P(grade| diff, intel)

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade',3,[[0.1,0.1,0.0,0.4,0.2,0.1],
...                             [0.3,0.2,0.1,0.4,0.3,0.2],
...                             [0.6,0.7,0.9,0.2,0.5,0.7]],
...                  evidence=['diff', 'intel'], evidence_card=[2,3])
>>> print(cpd)
+----------+----------+----------+----------+----------+----------+----------+
| diff     | diff(0)  | diff(0)  | diff(0)  | diff(1)  | diff(1)  | diff(1)  |
+----------+----------+----------+----------+----------+----------+----------+
| intel    | intel(0) | intel(1) | intel(2) | intel(0) | intel(1) | intel(2) |
+----------+----------+----------+----------+----------+----------+----------+
| grade(0) | 0.1      | 0.1      | 0.0      | 0.4      | 0.2      | 0.1      |
+----------+----------+----------+----------+----------+----------+----------+
| grade(1) | 0.3      | 0.2      | 0.1      | 0.4      | 0.3      | 0.2      |
+----------+----------+----------+----------+----------+----------+----------+
| grade(2) | 0.6      | 0.7      | 0.9      | 0.2      | 0.5      | 0.7      |
+----------+----------+----------+----------+----------+----------+----------+
>>> cpd.values
array([[[ 0.1,  0.1,  0. ],
        [ 0.4,  0.2,  0.1]],
       [[ 0.3,  0.2,  0.1],
        [ 0.4,  0.3,  0.2]],
       [[ 0.6,  0.7,  0.9],
        [ 0.2,  0.5,  0.7]]])
>>> cpd.variables
['grade', 'diff', 'intel']
>>> cpd.cardinality
array([3, 2, 3])
>>> cpd.variable
'grade'
>>> cpd.variable_card
3
>>> cpd.reorder_parents(['intel', 'diff'])
array([[0.1, 0.4, 0.1, 0.2, 0. , 0.1],
       [0.3, 0.4, 0.2, 0.3, 0.1, 0.2],
       [0.6, 0.2, 0.7, 0.5, 0.9, 0.7]])
>>> print(cpd)
+----------+----------+----------+----------+----------+----------+----------+
| intel    | intel(0) | intel(0) | intel(1) | intel(1) | intel(2) | intel(2) |
+----------+----------+----------+----------+----------+----------+----------+
| diff     | diff(0)  | diff(1)  | diff(0)  | diff(1)  | diff(0)  | diff(1)  |
+----------+----------+----------+----------+----------+----------+----------+
| grade(0) | 0.1      | 0.4      | 0.1      | 0.2      | 0.0      | 0.1      |
+----------+----------+----------+----------+----------+----------+----------+
| grade(1) | 0.3      | 0.4      | 0.2      | 0.3      | 0.1      | 0.2      |
+----------+----------+----------+----------+----------+----------+----------+
| grade(2) | 0.6      | 0.2      | 0.7      | 0.5      | 0.9      | 0.7      |
+----------+----------+----------+----------+----------+----------+----------+
>>> cpd.values
array([[[0.1, 0.4],
        [0.1, 0.2],
        [0. , 0.1]],
       [[0.3, 0.4],
        [0.2, 0.3],
        [0.1, 0.2]],
       [[0.6, 0.2],
        [0.7, 0.5],
        [0.9, 0.7]]])
>>> cpd.variables
['grade', 'intel', 'diff']
>>> cpd.cardinality
array([3, 3, 2])
>>> cpd.variable
'grade'
>>> cpd.variable_card
3
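
Reordering the parents only permutes the columns of the 2D table; the probabilities themselves do not change. The following plain-Python sketch reproduces the column permutation of the example above. The helper name reorder_columns is hypothetical, and this is not pgmpy's implementation.

```python
from itertools import product


def reorder_columns(values, evidence, evidence_card, new_order):
    """Permute the columns of a 2D CPD table to match a new evidence order.

    Hypothetical helper. Columns are ordered with the last evidence
    variable varying fastest, both before and after reordering.
    """
    card = dict(zip(evidence, evidence_card))
    old_columns = list(product(*(range(c) for c in evidence_card)))
    old_index = {states: j for j, states in enumerate(old_columns)}

    permutation = []
    for new_states in product(*(range(card[name]) for name in new_order)):
        assignment = dict(zip(new_order, new_states))
        permutation.append(old_index[tuple(assignment[name] for name in evidence)])
    return [[row[j] for j in permutation] for row in values]


table = [[0.1, 0.1, 0.0, 0.4, 0.2, 0.1],
         [0.3, 0.2, 0.1, 0.4, 0.3, 0.2],
         [0.6, 0.7, 0.9, 0.2, 0.5, 0.7]]
for row in reorder_columns(table, ["diff", "intel"], [2, 3], ["intel", "diff"]):
    print(row)
# [0.1, 0.4, 0.1, 0.2, 0.0, 0.1]
# [0.3, 0.4, 0.2, 0.3, 0.1, 0.2]
# [0.6, 0.2, 0.7, 0.5, 0.9, 0.7]
```
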
to_csv(filename)[source]

Exports the CPD to a CSV file.

Examples

>>> from pgmpy.utils import get_example_model
>>> model = get_example_model("alarm")
>>> cpd = model.get_cpds("SAO2")
>>> cpd.to_csv(filename="sao2.csv")
to_factor()[source]

Returns an equivalent factor with the same variables, cardinality, values as that of the CPD. Since factor doesn’t distinguish between conditional and non-conditional distributions, evidence information will be lost.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 3, [[0.1, 0.1],
...                               [0.1, 0.1],
...                               [0.8, 0.8]],
...                  evidence=['evi1'], evidence_card=[2])
>>> factor = cpd.to_factor()
>>> factor
<DiscreteFactor representing phi(grade:3, evi1:2) at 0x7f847a4f2d68>

Discrete Factor

class pgmpy.factors.discrete.DiscreteFactor.DiscreteFactor(variables, cardinality, values, state_names={})[source]

Initialize a DiscreteFactor class.

Defined above, we have the following mapping from variable assignments to the index of the row vector in the value field:

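+------+------+------+-----------------+
| x1   | x2   | x3   | phi(x1, x2, x3) |
+------+------+------+-----------------+
| x1_0 | x2_0 | x3_0 | phi.value(0)    |
+------+------+------+-----------------+
| x1_0 | x2_0 | x3_1 | phi.value(1)    |
+------+------+------+-----------------+
| x1_0 | x2_1 | x3_0 | phi.value(2)    |
+------+------+------+-----------------+
| x1_0 | x2_1 | x3_1 | phi.value(3)    |
+------+------+------+-----------------+
| x1_1 | x2_0 | x3_0 | phi.value(4)    |
+------+------+------+-----------------+
| x1_1 | x2_0 | x3_1 | phi.value(5)    |
+------+------+------+-----------------+
| x1_1 | x2_1 | x3_0 | phi.value(6)    |
+------+------+------+-----------------+
| x1_1 | x2_1 | x3_1 | phi.value(7)    |
+------+------+------+-----------------+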

Parameters:
  • variables (list, array-like) – List of variables on which the factor is to be defined i.e. scope of the factor.

  • cardinality (list, array_like) – List of cardinalities/no.of states of each variable. cardinality array must have a value corresponding to each variable in variables.

  • values (list, array_like) – List of values of the factor. A DiscreteFactor’s values are stored in a row vector, ordered such that the right-most variables as defined in variables cycle through their values the fastest (see the mapping above). Please refer to the examples for usage.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 2, 2], np.ones(8))
>>> phi
<DiscreteFactor representing phi(x1:2, x2:2, x3:2) at 0x7f8188fcaa90>
>>> print(phi)
+------+------+------+-----------------+
| x1   | x2   | x3   |   phi(x1,x2,x3) |
|------+------+------+-----------------|
| x1_0 | x2_0 | x3_0 |          1.0000 |
| x1_0 | x2_0 | x3_1 |          1.0000 |
| x1_0 | x2_1 | x3_0 |          1.0000 |
| x1_0 | x2_1 | x3_1 |          1.0000 |
| x1_1 | x2_0 | x3_0 |          1.0000 |
| x1_1 | x2_0 | x3_1 |          1.0000 |
| x1_1 | x2_1 | x3_0 |          1.0000 |
| x1_1 | x2_1 | x3_1 |          1.0000 |
+------+------+------+-----------------+
assignment(index)[source]

Returns a list of assignments (variable and state) for the corresponding index.

Parameters:

index (list, array-like) – List of indices whose assignment is to be computed

Returns:

Full assignments – Returns a list of full assignments of all the variables of the factor.

Return type:

list

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['diff', 'intel'], [2, 2], np.ones(4))
>>> phi.assignment([1, 2])
[[('diff', 0), ('intel', 1)], [('diff', 1), ('intel', 0)]]
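
The conversion from a flat index to an assignment can be sketched in plain Python with repeated divmod, assuming the ordering from the class description in which the last variable varies fastest. This stand-alone function is only an illustration and is not the pgmpy method.

```python
def assignment(index, variables, cardinality):
    """Convert a flat index into [(variable, state), ...] (illustrative sketch).

    Assumes the last variable varies fastest.
    """
    total = 1
    for card in cardinality:
        total *= card
    if not 0 <= index < total:
        raise IndexError(f"index {index} out of range for {total} entries")

    states = []
    for card in reversed(cardinality):
        index, state = divmod(index, card)
        states.append(state)
    return list(zip(variables, reversed(states)))


print([assignment(i, ["diff", "intel"], [2, 2]) for i in (1, 2)])
# [[('diff', 0), ('intel', 1)], [('diff', 1), ('intel', 0)]]
```
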
copy()[source]

Returns a copy of the factor.

Returns:

Copy of self – A copy of the original discrete factor.

Return type:

pgmpy.factors.discrete.DiscreteFactor

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 3], np.arange(18))
>>> phi_copy = phi.copy()
>>> phi_copy.variables
['x1', 'x2', 'x3']
>>> phi_copy.cardinality
array([2, 3, 3])
>>> phi_copy.values
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],
       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]]])
divide(phi1, inplace=True)[source]

DiscreteFactor division by phi1.

Parameters:
  • phi1 (DiscreteFactor instance) – The denominator for division.

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns:

Divided factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x1'], [2, 2], range(1, 5))
>>> phi1.divide(phi2)
>>> phi1.variables
['x1', 'x2', 'x3']
>>> phi1.cardinality
array([2, 3, 2])
>>> phi1.values
array([[[ 0.        ,  0.33333333],
        [ 2.        ,  1.        ],
        [ 4.        ,  1.66666667]],
       [[ 3.        ,  1.75      ],
        [ 4.        ,  2.25      ],
        [ 5.        ,  2.75      ]]])
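
Division matches each entry of the numerator with the denominator entry that agrees on the shared variables. The following plain-Python sketch reproduces the example above for the case where the denominator's scope is a subset of the numerator's. The helper name divide and the choice to return 0 for 0/0 are assumptions of this sketch, not statements about pgmpy's implementation.

```python
from itertools import product


def divide(num_vars, num_card, num_values, den_vars, den_card, den_values):
    """Divide a factor by another whose scope is a subset of the numerator's.

    Hypothetical helper. Values are flat lists with the last variable
    varying fastest; 0/0 is returned as 0 here by assumption.
    """
    den_table = dict(zip(product(*(range(c) for c in den_card)), den_values))
    position = [num_vars.index(v) for v in den_vars]

    result = []
    for states, value in zip(product(*(range(c) for c in num_card)), num_values):
        denominator = den_table[tuple(states[p] for p in position)]
        if denominator == 0:
            if value != 0:
                raise ZeroDivisionError(f"nonzero entry divided by 0 at {states}")
            result.append(0.0)
        else:
            result.append(value / denominator)
    return result


out = divide(["x1", "x2", "x3"], [2, 3, 2], list(range(12)),
             ["x3", "x1"], [2, 2], list(range(1, 5)))
print([round(v, 4) for v in out])
# [0.0, 0.3333, 2.0, 1.0, 4.0, 1.6667, 3.0, 1.75, 4.0, 2.25, 5.0, 2.75]
```
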
get_cardinality(variables)[source]

Returns the cardinality/no.of states of each variable in variables.

Parameters:

variables (list, array-like) – A list of variable names.

Returns:

Cardinality of variables – Dictionary of the form {variable: variable_cardinality}

Return type:

dict

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.get_cardinality(['x1'])
{'x1': 2}
>>> phi.get_cardinality(['x1', 'x2'])
{'x1': 2, 'x2': 3}
get_value(**kwargs)[source]

Returns the value of the given variable states. The arguments are assumed to be state names; if a state name cannot be found, the argument is treated as a state number.

Parameters:

kwargs (named arguments of the form variable=state_name) – Specifies the state of each variable for which to get the value.

Returns:

value of kwargs – The value of specified states.

Return type:

float

Examples

>>> from pgmpy.utils import get_example_model
>>> model = get_example_model("asia")
>>> phi = model.get_cpds("either").to_factor()
>>> phi.get_value(lung="yes", tub="no", either="yes")
1.0
identity_factor()[source]

Returns the identity factor.

Def: The identity factor of a factor has the same scope and cardinality as the original factor, but the values for all the assignments are 1. When the identity factor is multiplied with the factor, it returns the factor itself.

Returns:

Identity factor – Returns a factor with all values set to 1.

Return type:

pgmpy.factors.discrete.DiscreteFactor

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi_identity = phi.identity_factor()
>>> phi_identity.variables
['x1', 'x2', 'x3']
>>> phi_identity.values
array([[[ 1.,  1.],
        [ 1.,  1.],
        [ 1.,  1.]],
       [[ 1.,  1.],
        [ 1.,  1.],
        [ 1.,  1.]]])
is_valid_cpd()[source]

Checks if the factor’s values can be used for a valid CPD.

marginalize(variables, inplace=True)[source]

Modifies the factor with marginalized values.

Parameters:
  • variables (list, array-like) – List of variables over which to marginalize.

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns:

Marginalized factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.marginalize(['x1', 'x3'])
>>> phi.values
array([14., 22., 30.])
>>> phi.variables
['x2']
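
Marginalizing a factor sums its values over every state of the removed variables. The following is a plain-Python sketch of that operation on flat value lists. The stand-alone function marginalize below is hypothetical and only illustrates the arithmetic.

```python
from itertools import product


def marginalize(variables, cardinality, values, to_remove):
    """Sum a factor over `to_remove`; returns (kept_variables, kept_card, values).

    Hypothetical helper. Values are flat lists with the last variable
    varying fastest.
    """
    keep = [i for i, v in enumerate(variables) if v not in to_remove]
    kept_card = [cardinality[i] for i in keep]
    sums = {states: 0.0 for states in product(*(range(c) for c in kept_card))}
    for states, value in zip(product(*(range(c) for c in cardinality)), values):
        sums[tuple(states[i] for i in keep)] += value
    # Dicts preserve insertion order, so the result is already in flat order.
    return [variables[i] for i in keep], kept_card, list(sums.values())


print(marginalize(["x1", "x2", "x3"], [2, 3, 2], list(range(12)), ["x1", "x3"]))
# (['x2'], [3], [14.0, 22.0, 30.0])
```
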
maximize(variables, inplace=True)[source]

Maximizes the factor with respect to variables.

Parameters:
  • variables (list, array-like) – List of variables with respect to which factor is to be maximized

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns:

Maximized factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [3, 2, 2], [0.25, 0.35, 0.08, 0.16, 0.05, 0.07,
...                                              0.00, 0.00, 0.15, 0.21, 0.09, 0.18])
>>> phi.variables
['x1', 'x2', 'x3']
>>> phi.maximize(['x2'])
>>> phi.variables
['x1', 'x3']
>>> phi.cardinality
array([3, 2])
>>> phi.values
array([[ 0.25,  0.35],
       [ 0.05,  0.07],
       [ 0.15,  0.21]])
normalize(inplace=True)[source]

Normalizes the values of factor so that they sum to 1.

Parameters:

inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor

Returns:

Normalized factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.values
array([[[ 0.,  1.],
        [ 2.,  3.],
        [ 4.,  5.]],
       [[ 6.,  7.],
        [ 8.,  9.],
        [10., 11.]]])
>>> phi.normalize()
>>> phi.variables
['x1', 'x2', 'x3']
>>> phi.cardinality
array([2, 3, 2])
>>> phi.values
array([[[ 0.        ,  0.01515152],
        [ 0.03030303,  0.04545455],
        [ 0.06060606,  0.07575758]],
       [[ 0.09090909,  0.10606061],
        [ 0.12121212,  0.13636364],
        [ 0.15151515,  0.16666667]]])
product(phi1, inplace=True)[source]

DiscreteFactor product with phi1.

Parameters:
  • phi1 (float or DiscreteFactor instance) – If float, all the values are multiplied by phi1. If DiscreteFactor instance, the values are multiplied based on matching rows.

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns:

Multiplied factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> phi1.product(phi2, inplace=True)
>>> phi1.variables
['x1', 'x2', 'x3', 'x4']
>>> phi1.cardinality
array([2, 3, 2, 2])
>>> phi1.values
array([[[[ 0,  0],
         [ 4,  6]],
        [[ 0,  4],
         [12, 18]],
        [[ 0,  8],
         [20, 30]]],
       [[[ 6, 18],
         [35, 49]],
        [[ 8, 24],
         [45, 63]],
        [[10, 30],
         [55, 77]]]])
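
The "matching rows" rule can be sketched in plain Python: the result is defined over the union of both scopes, and each entry is the product of the two entries that agree with it on their own variables. The helper name factor_product is hypothetical, and this is a sketch rather than pgmpy's implementation.

```python
from itertools import product


def factor_product(vars1, card1, values1, vars2, card2, values2):
    """Multiply two factors given as flat value lists (last variable fastest).

    Hypothetical helper. The result's scope is the first factor's
    variables followed by the new variables of the second.
    """
    card = dict(zip(vars1, card1))
    for name, c in zip(vars2, card2):
        if card.setdefault(name, c) != c:
            raise ValueError(f"cardinality mismatch for {name}")
    scope = list(vars1) + [v for v in vars2 if v not in vars1]

    table1 = dict(zip(product(*(range(c) for c in card1)), values1))
    table2 = dict(zip(product(*(range(c) for c in card2)), values2))

    values = []
    for states in product(*(range(card[v]) for v in scope)):
        assignment = dict(zip(scope, states))
        values.append(table1[tuple(assignment[v] for v in vars1)]
                      * table2[tuple(assignment[v] for v in vars2)])
    return scope, [card[v] for v in scope], values


scope, card, values = factor_product(
    ["x1", "x2", "x3"], [2, 3, 2], list(range(12)),
    ["x3", "x4", "x1"], [2, 2, 2], list(range(8)))
print(scope, card)  # ['x1', 'x2', 'x3', 'x4'] [2, 3, 2, 2]
print(values)
# [0, 0, 4, 6, 0, 4, 12, 18, 0, 8, 20, 30, 6, 18, 35, 49, 8, 24, 45, 63, 10, 30, 55, 77]
```
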
reduce(values, inplace=True, show_warnings=True)[source]

Reduces the factor to the context of given variable values. The variables which are reduced would be removed from the factor.

Parameters:
  • values (list, array-like) – A list of tuples of the form (variable_name, variable_state).

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

  • show_warnings (boolean) – Whether to show warning when state name not found.

Returns:

Reduced factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.reduce([('x1', 0), ('x2', 0)])
>>> phi.variables
['x3']
>>> phi.cardinality
array([2])
>>> phi.values
array([0., 1.])
sample(n, seed=None)[source]

Normalizes the factor and samples state combinations from it.

Parameters:
  • n (int) – Number of samples to generate.

  • seed (int (default: None)) – The seed value for the random number generator.

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi1.sample(5)
    x1  x2  x3
0    1   0   0
1    0   2   0
2    1   2   0
3    1   1   1
4    1   1   1
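
Sampling from a factor can be sketched in plain Python by drawing joint assignments with probability proportional to the factor's values. The stand-alone function sample below is hypothetical; it returns a list of dicts rather than the tabular output shown above, and it uses Python's random module, so it will not reproduce pgmpy's samples for a given seed.

```python
import random
from itertools import product


def sample(variables, cardinality, values, n, seed=None):
    """Draw n joint assignments with probability proportional to `values`.

    Hypothetical helper. Values are flat with the last variable varying
    fastest.
    """
    if min(values) < 0 or sum(values) <= 0:
        raise ValueError("values must be non-negative with a positive sum")
    rng = random.Random(seed)
    assignments = list(product(*(range(c) for c in cardinality)))
    # random.choices normalizes the weights internally.
    drawn = rng.choices(assignments, weights=values, k=n)
    return [dict(zip(variables, states)) for states in drawn]


rows = sample(["x1", "x2", "x3"], [2, 3, 2], list(range(12)), n=5, seed=0)
for row in rows:
    print(row)
```
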
scope()[source]

Returns the scope of the factor i.e. the variables on which the factor is defined.

Returns:

Scope of the factor – List of variables on which the factor is defined.

Return type:

list

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12))
>>> phi.scope()
['x1', 'x2', 'x3']
set_value(value, **kwargs)[source]

Sets the probability value of the given variable states.

Parameters:
  • value (float) – The value for the specified state.

  • kwargs (named arguments of the form variable=state_name) – Specifies the state of each variable for which to set the value.

Return type:

None

Examples

>>> from pgmpy.utils import get_example_model
>>> model = get_example_model("asia")
>>> phi = model.get_cpds("either").to_factor()
>>> phi.set_value(value=0.1, lung="yes", tub="no", either="yes")
>>> phi.get_value(lung='yes', tub='no', either='yes')
0.1
sum(phi1, inplace=True)[source]

DiscreteFactor sum with phi1.

Parameters:
  • phi1 (float or DiscreteFactor instance) – If float, the value is added to each value in the factor. If DiscreteFactor instance, the factor to be added.

  • inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns:

Summed factor – If inplace=True (default) returns None else returns a new DiscreteFactor instance.

Return type:

pgmpy.factors.discrete.DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> phi1.sum(phi2, inplace=True)
>>> phi1.variables
['x1', 'x2', 'x3', 'x4']
>>> phi1.cardinality
array([2, 3, 2, 2])
>>> phi1.values
array([[[[ 0.,  2.],
         [ 5.,  7.]],
        [[ 2.,  4.],
         [ 7.,  9.]],
        [[ 4.,  6.],
         [ 9., 11.]]],
       [[[ 7.,  9.],
         [12., 14.]],
        [[ 9., 11.],
         [14., 16.]],
        [[11., 13.],
         [16., 18.]]]])
class pgmpy.factors.discrete.DiscreteFactor.State(var, state)
state

Alias for field number 1

var

Alias for field number 0

Joint Probability Distribution

class pgmpy.factors.discrete.JointProbabilityDistribution.JointProbabilityDistribution(variables, cardinality, values)[source]

Base class for Joint Probability Distribution

check_independence(event1, event2, event3=None, condition_random_variable=False)[source]

Check if the Joint Probability Distribution satisfies the given independence condition.

Parameters:
  • event1 (list) – Random variable whose independence is to be checked.

  • event2 (list) – Random variable from which event1 is independent.

  • event3 (2D array or list like or 1D array or list like) – The values on which to condition the Joint Probability Distribution. Either a 2D list of tuples of the form (variable_name, variable_state), or a 1D list or array-like of random variables to condition over (condition_random_variable must be True).

  • condition_random_variable (Boolean (default False)) – If True and event3 is not None, then the independence condition is checked over the random variable.

For random variables, say X, Y, Z, to check if X is independent of Y given Z: event1 should be either X or Y, event2 should be either Y or X, and event3 should be Z.

Examples

>>> from pgmpy.factors.discrete import JointProbabilityDistribution as JPD
>>> prob = JPD(['I','D','G'],[2,2,3],
...            [0.126,0.168,0.126,0.009,0.045,0.126,0.252,0.0224,0.0056,0.06,0.036,0.024])
>>> prob.check_independence(['I'], ['D'])
True
>>> prob.check_independence(['I'], ['D'], [('G', 1)])  # Conditioning over G_1
False
>>> # Conditioning over random variable G
>>> prob.check_independence(['I'], ['D'], ('G',), condition_random_variable=True)
False
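
The first two results above can be checked by hand: I and D are independent if P(I, D) equals P(I) * P(D) for every pair of states, optionally after conditioning on fixed states of other variables. The following plain-Python sketch does this for single variables. The helper name is_independent and the numerical tolerance are assumptions of this sketch, not pgmpy's implementation.

```python
from itertools import product
from math import isclose


def is_independent(variables, cardinality, values, a, b, given=None):
    """Check P(a, b | given) == P(a | given) * P(b | given) for single variables.

    Hypothetical helper. `given` is an optional dict {variable: state};
    values are flat with the last variable varying fastest.
    """
    given = given or {}
    ia, ib = variables.index(a), variables.index(b)
    joint = {}
    for states, p in zip(product(*(range(c) for c in cardinality)), values):
        if all(states[variables.index(v)] == s for v, s in given.items()):
            key = (states[ia], states[ib])
            joint[key] = joint.get(key, 0.0) + p
    total = sum(joint.values())
    pa, pb = {}, {}
    for (sa, sb), p in joint.items():
        pa[sa] = pa.get(sa, 0.0) + p / total
        pb[sb] = pb.get(sb, 0.0) + p / total
    return all(isclose(p / total, pa[sa] * pb[sb], abs_tol=1e-9)
               for (sa, sb), p in joint.items())


values = [0.126, 0.168, 0.126, 0.009, 0.045, 0.126,
          0.252, 0.0224, 0.0056, 0.06, 0.036, 0.024]
print(is_independent(["I", "D", "G"], [2, 2, 3], values, "I", "D"))            # True
print(is_independent(["I", "D", "G"], [2, 2, 3], values, "I", "D", {"G": 1}))  # False
```
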
conditional_distribution(values, inplace=True)[source]

Returns Conditional Probability Distribution after setting values to 1.

Parameters:
  • values (list or array_like) – A list of tuples of the form (variable_name, variable_state). The values on which to condition the Joint Probability Distribution.

  • inplace (Boolean (default True)) – If False returns a new instance of JointProbabilityDistribution

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 2, 2], np.ones(8)/8)
>>> prob.conditional_distribution([('x1', 1)])
>>> print(prob)
x2    x3      P(x2,x3)
----  ----  ----------
x2_0  x3_0      0.2500
x2_0  x3_1      0.2500
x2_1  x3_0      0.2500
x2_1  x3_1      0.2500
copy()[source]

Returns a copy of the JointProbabilityDistribution object.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> prob_copy = prob.copy()
>>> np.array_equal(prob_copy.values, prob.values)
True
>>> prob_copy.variables == prob.variables
True
>>> prob_copy.variables[1] = 'y'
>>> prob_copy.variables == prob.variables
False
get_independencies(condition=None)[source]

Returns the independent variables in the joint probability distribution. Returns marginally independent variables if condition=None. Returns conditionally independent variables if condition!=None

Parameters:

condition (array_like) – Random Variable on which to condition the Joint Probability Distribution.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> prob.get_independencies()
(x1 ⟂ x2)
(x1 ⟂ x3)
(x2 ⟂ x3)
is_imap(model)[source]

Checks whether the given BayesianNetwork is an I-map of the JointProbabilityDistribution.

Parameters:

model (BayesianNetwork) – The BayesianNetwork instance to check as an I-map.

Returns:

Is IMAP – True if the given Bayesian Network is an I-map of the Joint Probability Distribution, False otherwise.

Return type:

bool

Examples

>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> bm = BayesianNetwork([('diff', 'grade'), ('intel', 'grade')])
>>> diff_cpd = TabularCPD('diff', 2, [[0.2], [0.8]])
>>> intel_cpd = TabularCPD('intel', 3, [[0.5], [0.3], [0.2]])
>>> grade_cpd = TabularCPD('grade', 3,
...                        [[0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.8,0.8,0.8,0.8,0.8,0.8]],
...                        evidence=['diff', 'intel'],
...                        evidence_card=[2, 3])
>>> bm.add_cpds(diff_cpd, intel_cpd, grade_cpd)
>>> val = [0.01, 0.01, 0.08, 0.006, 0.006, 0.048, 0.004, 0.004, 0.032,
...        0.04, 0.04, 0.32, 0.024, 0.024, 0.192, 0.016, 0.016, 0.128]
>>> JPD = JointProbabilityDistribution(['diff', 'intel', 'grade'], [2, 3, 3], val)
>>> JPD.is_imap(bm)
True
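
The val list in the example above is the joint distribution obtained from the three CPDs by the chain rule, which can be verified in plain Python. This is a related sanity check on the example data, not the algorithm used by is_imap.

```python
from itertools import product
from math import isclose

# CPDs from the example above, written as plain Python lists.
p_diff = [0.2, 0.8]
p_intel = [0.5, 0.3, 0.2]
# Rows are grade states; columns are (diff, intel) with intel varying fastest.
p_grade = [[0.1] * 6, [0.1] * 6, [0.8] * 6]

# Chain rule for the network diff -> grade <- intel:
# P(diff, intel, grade) = P(diff) * P(intel) * P(grade | diff, intel)
joint = [
    p_diff[d] * p_intel[i] * p_grade[g][d * 3 + i]
    for d, i, g in product(range(2), range(3), range(3))
]

val = [0.01, 0.01, 0.08, 0.006, 0.006, 0.048, 0.004, 0.004, 0.032,
       0.04, 0.04, 0.32, 0.024, 0.024, 0.192, 0.016, 0.016, 0.128]
print(all(isclose(a, b, abs_tol=1e-12) for a, b in zip(joint, val)))  # True
```
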
marginal_distribution(variables, inplace=True)[source]

Returns the marginal distribution over variables.

Parameters:
  • variables (string, list, tuple, set, dict) – Variable or list of variables over which marginal distribution needs to be calculated

  • inplace (Boolean (default True)) – If False return a new instance of JointProbabilityDistribution

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> values = np.random.rand(12)
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], values/np.sum(values))
>>> prob.marginal_distribution(['x1', 'x2'])
>>> print(prob)
x1    x2      P(x1,x2)
----  ----  ----------
x1_0  x2_0      0.1502
x1_0  x2_1      0.1626
x1_0  x2_2      0.1197
x1_1  x2_0      0.2339
x1_1  x2_1      0.1996
x1_1  x2_2      0.1340
minimal_imap(order)[source]

Returns a Bayesian Model that is a minimal I-map of the Joint Probability Distribution for the given order of the variables.

Parameters:

order (array-like) – The order of the random variables.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> bayesian_model = prob.minimal_imap(order=['x2', 'x1', 'x3'])
>>> bayesian_model
<pgmpy.models.models.models at 0x7fd7440a9320>
>>> bayesian_model.edges()
[('x1', 'x3'), ('x2', 'x3')]
to_factor()[source]

Returns JointProbabilityDistribution as a DiscreteFactor object

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> phi = prob.to_factor()
>>> type(phi)
pgmpy.factors.discrete.DiscreteFactor.DiscreteFactor