# Factor

## Discrete

### TabularCPD

Contains the different formats of CPDs used in PGMs.

class pgmpy.factors.discrete.CPD.TabularCPD(variable, variable_card, values, evidence=None, evidence_card=None, state_names={})

Defines the conditional probability distribution table (CPD table).

Parameters
• variable (int, string (any hashable python object)) – The variable whose CPD is defined.

• variable_card (integer) – cardinality of variable

• values (2d array, 2d list or 2d tuple) – values of the cpd table

• evidence (array-like) – evidence variables (if any) with respect to which the CPD is defined

• evidence_card (integer, array-like) – cardinality of the evidence variables (if any)

Examples

For a distribution of P(grade|diff, intel)

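| diff   | easy | easy | easy  | hard | hard | hard  |
|--------|------|------|-------|------|------|-------|
| intel  | dumb | avg  | smart | dumb | avg  | smart |
| gradeA | 0.1  | 0.1  | 0.1   | 0.1  | 0.1  | 0.1   |
| gradeB | 0.1  | 0.1  | 0.1   | 0.1  | 0.1  | 0.1   |
| gradeC | 0.8  | 0.8  | 0.8   | 0.8  | 0.8  | 0.8   |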

values should be [[0.1,0.1,0.1,0.1,0.1,0.1], [0.1,0.1,0.1,0.1,0.1,0.1], [0.8,0.8,0.8,0.8,0.8,0.8]]

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 3, [[0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
...                               [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
...                               [0.8, 0.8, 0.8, 0.8, 0.8, 0.8]],
...                  evidence=['diff', 'intel'], evidence_card=[2, 3])
>>> print(cpd)
+---------+---------+---------+---------+---------+---------+---------+
| diff    | diff_0  | diff_0  | diff_0  | diff_1  | diff_1  | diff_1  |
+---------+---------+---------+---------+---------+---------+---------+
| intel   | intel_0 | intel_1 | intel_2 | intel_0 | intel_1 | intel_2 |
+---------+---------+---------+---------+---------+---------+---------+
| grade_0 | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     |
+---------+---------+---------+---------+---------+---------+---------+
| grade_1 | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     |
+---------+---------+---------+---------+---------+---------+---------+
| grade_2 | 0.8     | 0.8     | 0.8     | 0.8     | 0.8     | 0.8     |
+---------+---------+---------+---------+---------+---------+---------+
>>> cpd.values
array([[[ 0.1,  0.1,  0.1],
        [ 0.1,  0.1,  0.1]],
       [[ 0.1,  0.1,  0.1],
        [ 0.1,  0.1,  0.1]],
       [[ 0.8,  0.8,  0.8],
        [ 0.8,  0.8,  0.8]]])
>>> cpd.variables
['grade', 'diff', 'intel']
>>> cpd.cardinality
array([3, 2, 3])
>>> cpd.variable
'grade'
>>> cpd.variable_card
3
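
The relation between the flat 2D `values` argument and the 3D `cpd.values` array can be reproduced with plain NumPy. This is only a sketch of the layout, assuming row-major ordering in which the last evidence variable varies fastest; it does not call pgmpy.

```python
import numpy as np

# 2D table as passed to the constructor: one row per state of 'grade',
# one column per joint state of the evidence ('diff', 'intel').
flat = np.array([[0.1] * 6,
                 [0.1] * 6,
                 [0.8] * 6])

variable_card = 3
evidence_card = [2, 3]  # cardinalities of 'diff' and 'intel'

# One axis per variable: (grade, diff, intel).
nd = flat.reshape([variable_card] + evidence_card)

# Column j of the flat table corresponds to (diff, intel) = divmod(j, 3),
# so the last evidence variable ('intel') changes fastest.
column = 4
diff_state, intel_state = divmod(column, evidence_card[1])
print(nd.shape, (diff_state, intel_state))
```

Column 4 of the printed table is the `diff_1`, `intel_1` column, which agrees with the `divmod` result.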

copy()

Returns a copy of the TabularCPD object.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 2,
...                  [[0.7, 0.6, 0.6, 0.2],[0.3, 0.4, 0.4, 0.8]],
...                  ['intel', 'diff'], [2, 2])
>>> copy = cpd.copy()
>>> copy.variable
'grade'
>>> copy.variable_card
2
>>> copy.evidence
['intel', 'diff']
>>> copy.values
array([[[ 0.7,  0.6],
        [ 0.6,  0.2]],
       [[ 0.3,  0.4],
        [ 0.4,  0.8]]])

get_values()

Returns the values of the CPD as a 2D array.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 3, [[0.1, 0.1],
...                               [0.1, 0.1],
...                               [0.8, 0.8]],
...                  evidence='evi1', evidence_card=2)
>>> cpd.get_values()
array([[ 0.1,  0.1],
       [ 0.1,  0.1],
       [ 0.8,  0.8]])

marginalize(variables, inplace=True)

Modifies the cpd table with marginalized values.

Parameters
• variables (list, array-like) – list of variables to be marginalized

• inplace (boolean) – If inplace=True it will modify the CPD itself, else would return a new CPD

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd_table = TabularCPD('grade', 2,
...                        [[0.7, 0.6, 0.6, 0.2],[0.3, 0.4, 0.4, 0.8]],
...                        ['intel', 'diff'], [2, 2])
>>> cpd_table.marginalize(['diff'])
>>> cpd_table.get_values()
array([[ 0.65,  0.4 ],
       [ 0.35,  0.6 ]])
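
The numbers in this example can be checked by hand with NumPy. The sketch below assumes that marginalizing a CPD means summing out the evidence axis and then rescaling each column to sum to 1, which is consistent with the output shown; it is not pgmpy's implementation.

```python
import numpy as np

# Same table as in the example, with axes ordered as (grade, intel, diff).
values = np.array([[0.7, 0.6, 0.6, 0.2],
                   [0.3, 0.4, 0.4, 0.8]]).reshape(2, 2, 2)

# Sum out 'diff' (the last axis) ...
summed = values.sum(axis=2)

# ... then rescale each column so that it sums to 1 over 'grade'.
marginal = summed / summed.sum(axis=0, keepdims=True)
print(marginal)
```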

normalize(inplace=True)

Normalizes the cpd table.

Parameters

inplace (boolean) – If inplace=True it will modify the CPD itself, else would return a new CPD

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd_table = TabularCPD('grade', 2,
...                        [[0.7, 0.2, 0.6, 0.2],[0.4, 0.4, 0.4, 0.8]],
...                        ['intel', 'diff'], [2, 2])
>>> cpd_table.normalize()
>>> cpd_table.get_values()
array([[ 0.63636364,  0.33333333,  0.6       ,  0.2       ],
       [ 0.36363636,  0.66666667,  0.4       ,  0.8       ]])
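
For a CPD, normalization acts per column: each column (one joint state of the evidence) is divided by its own sum. A minimal NumPy sketch of that arithmetic, independent of pgmpy:

```python
import numpy as np

# Unnormalized table from the example: rows are states of 'grade'.
table = np.array([[0.7, 0.2, 0.6, 0.2],
                  [0.4, 0.4, 0.4, 0.8]])

# Divide every column by its sum so that each column is a distribution.
normalized = table / table.sum(axis=0, keepdims=True)
print(normalized)
```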

reduce(values, inplace=True)

Reduces the cpd table to the context of given variable values.

Parameters
• values (list, array-like) – A list of tuples of the form (variable_name, variable_state).

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd_table = TabularCPD('grade', 2,
...                        [[0.7, 0.6, 0.6, 0.2],[0.3, 0.4, 0.4, 0.8]],
...                        ['intel', 'diff'], [2, 2])
>>> cpd_table.reduce([('diff', 0)])
>>> cpd_table.get_values()
array([[ 0.7,  0.6],
       [ 0.3,  0.4]])
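
Reduction amounts to fixing one index along the axis of each observed variable. The sketch below reproduces the example with NumPy indexing; the `axes` list and `evidence` dictionary are illustrative names, not part of the pgmpy API.

```python
import numpy as np

# Table from the example, with axes ordered as (grade, intel, diff).
values = np.array([[0.7, 0.6, 0.6, 0.2],
                   [0.3, 0.4, 0.4, 0.8]]).reshape(2, 2, 2)
axes = ['grade', 'intel', 'diff']
evidence = {'diff': 0}  # observed variable -> observed state

# Use the observed state for observed axes and a full slice elsewhere.
index = tuple(evidence.get(name, slice(None)) for name in axes)
reduced = values[index]
print(reduced)
```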

reorder_parents(new_order, inplace=True)

Returns a new cpd table according to provided order.

Parameters
• new_order (list) – list of new ordering of variables

• inplace (boolean) – If inplace=True the CPD itself is modified; otherwise the new values are returned and the original CPD is left unchanged.

Examples

Consider a CPD P(grade| diff, intel)

>>> cpd = TabularCPD('grade',3,[[0.1,0.1,0.1,0.1,0.1,0.1],
...                             [0.1,0.1,0.1,0.1,0.1,0.1],
...                             [0.8,0.8,0.8,0.8,0.8,0.8]],
...                  evidence=['diff', 'intel'], evidence_card=[2,3])
>>> print(cpd)
+---------+---------+---------+---------+---------+---------+---------+
| diff    | diff_0  | diff_0  | diff_0  | diff_1  | diff_1  | diff_1  |
+---------+---------+---------+---------+---------+---------+---------+
| intel   | intel_0 | intel_1 | intel_2 | intel_0 | intel_1 | intel_2 |
+---------+---------+---------+---------+---------+---------+---------+
| grade_0 | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     |
+---------+---------+---------+---------+---------+---------+---------+
| grade_1 | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     |
+---------+---------+---------+---------+---------+---------+---------+
| grade_2 | 0.8     | 0.8     | 0.8     | 0.8     | 0.8     | 0.8     |
+---------+---------+---------+---------+---------+---------+---------+

>>> cpd.values
array([[[ 0.1,  0.1,  0.1],
        [ 0.1,  0.1,  0.1]],
       [[ 0.1,  0.1,  0.1],
        [ 0.1,  0.1,  0.1]],
       [[ 0.8,  0.8,  0.8],
        [ 0.8,  0.8,  0.8]]])
>>> cpd.variables
['grade', 'diff', 'intel']
>>> cpd.cardinality
array([3, 2, 3])
>>> cpd.variable
'grade'
>>> cpd.variable_card
3
>>> cpd.reorder_parents(['intel', 'diff'])
array([[ 0.1,  0.1,  0.1,  0.1,  0.1,  0.1],
       [ 0.1,  0.1,  0.1,  0.1,  0.1,  0.1],
       [ 0.8,  0.8,  0.8,  0.8,  0.8,  0.8]])
>>> print(cpd)
+---------+---------+---------+---------+---------+---------+---------+
| intel   | intel_0 | intel_0 | intel_1 | intel_1 | intel_2 | intel_2 |
+---------+---------+---------+---------+---------+---------+---------+
| diff    | diff_0  | diff_1  | diff_0  | diff_1  | diff_0  | diff_1  |
+---------+---------+---------+---------+---------+---------+---------+
| grade_0 | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     |
+---------+---------+---------+---------+---------+---------+---------+
| grade_1 | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     | 0.1     |
+---------+---------+---------+---------+---------+---------+---------+
| grade_2 | 0.8     | 0.8     | 0.8     | 0.8     | 0.8     | 0.8     |
+---------+---------+---------+---------+---------+---------+---------+

>>> cpd.values
array([[[ 0.1,  0.1],
        [ 0.1,  0.1],
        [ 0.1,  0.1]],
       [[ 0.1,  0.1],
        [ 0.1,  0.1],
        [ 0.1,  0.1]],
       [[ 0.8,  0.8],
        [ 0.8,  0.8],
        [ 0.8,  0.8]]])

>>> cpd.variables
['grade', 'intel', 'diff']
>>> cpd.cardinality
array([3, 3, 2])
>>> cpd.variable
'grade'
>>> cpd.variable_card
3
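
Because the table in this example is constant along each row, the effect of reordering is easier to see on a table with distinct entries. The sketch below uses a hypothetical table (`np.arange(18)`) and plain NumPy axis permutation; it illustrates the idea and is not pgmpy's implementation.

```python
import numpy as np

# Hypothetical table with distinct entries so the permutation is visible.
# Axes: (grade, diff, intel) with cardinalities (3, 2, 3).
values = np.arange(18).reshape(3, 2, 3)
old_order = ['grade', 'diff', 'intel']
new_order = ['grade', 'intel', 'diff']

# Permute the axes to the new parent order.
perm = [old_order.index(name) for name in new_order]
reordered = values.transpose(perm)

# Flatten back to the 2D layout: one row per state of 'grade'.
table = reordered.reshape(3, -1)
print(table)
```

In the first row, the original column order 0..5 becomes 0, 3, 1, 4, 2, 5, since `diff` now varies fastest.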

to_factor()

Returns an equivalent factor with the same variables, cardinality, values as that of the cpd

Examples

>>> from pgmpy.factors.discrete import TabularCPD
>>> cpd = TabularCPD('grade', 3, [[0.1, 0.1],
...                               [0.1, 0.1],
...                               [0.8, 0.8]],
...                  evidence='evi1', evidence_card=2)
>>> factor = cpd.to_factor()
>>> factor
<DiscreteFactor representing phi(grade:3, evi1:2) at 0x7f847a4f2d68>


### Discrete Factor

class pgmpy.factors.discrete.DiscreteFactor.DiscreteFactor(variables, cardinality, values, state_names={})

Base class for DiscreteFactor.

assignment(index)

Returns a list of assignments for the corresponding index.

Parameters

index (list, array-like) – List of indices whose assignment is to be computed

Returns

A list of full assignments of all the variables of the factor.

Return type

list

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['diff', 'intel'], [2, 2], np.ones(4))
>>> phi.assignment([1, 2])
[[('diff', 0), ('intel', 1)], [('diff', 1), ('intel', 0)]]
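
The mapping from a flat index to a full assignment is the usual row-major "unravel" operation. A sketch with `numpy.unravel_index`, where the helper name `assignments` is made up for illustration:

```python
import numpy as np

variables = ['diff', 'intel']
cardinality = (2, 2)

def assignments(indices):
    # Convert each flat index into one state per variable (row-major order).
    states = np.array(np.unravel_index(indices, cardinality)).T
    return [[(var, int(s)) for var, s in zip(variables, row)] for row in states]

result = assignments([1, 2])
print(result)
```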

copy()

Returns a copy of the factor.

Returns

A copy of the factor.

Return type

DiscreteFactor

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 3], np.arange(18))
>>> phi_copy = phi.copy()
>>> phi_copy.variables
['x1', 'x2', 'x3']
>>> phi_copy.cardinality
array([2, 3, 3])
>>> phi_copy.values
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],
       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]]])

divide(phi1, inplace=True)

DiscreteFactor division by phi1.

Parameters
• phi1 (DiscreteFactor instance) – The denominator for division.

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x1'], [2, 2], range(1, 5))
>>> phi1.divide(phi2)
>>> phi1.variables
['x1', 'x2', 'x3']
>>> phi1.cardinality
array([2, 3, 2])
>>> phi1.values
array([[[ 0.        ,  0.33333333],
        [ 2.        ,  1.        ],
        [ 4.        ,  1.66666667]],
       [[ 3.        ,  1.75      ],
        [ 4.        ,  2.25      ],
        [ 5.        ,  2.75      ]]])
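
The division aligns the two factors on their shared variables and divides entry by entry. A NumPy broadcasting sketch that reproduces the numbers of this example (not pgmpy's own code):

```python
import numpy as np

# phi1 has axes (x1, x2, x3); phi2 has axes (x3, x1).
phi1 = np.arange(12, dtype=float).reshape(2, 3, 2)
phi2 = np.arange(1, 5, dtype=float).reshape(2, 2)

# Reorder phi2 to (x1, x3) and add a length-1 axis for x2 so it broadcasts.
denominator = phi2.T[:, np.newaxis, :]
quotient = phi1 / denominator
print(quotient)
```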

get_cardinality(variables)

Returns the cardinality of the given variables.

Parameters

variables (list, array-like) – A list of variable names.

Returns

Dictionary of the form {variable: variable_cardinality}.

Return type

dict

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.get_cardinality(['x1'])
{'x1': 2}
>>> phi.get_cardinality(['x1', 'x2'])
{'x1': 2, 'x2': 3}

identity_factor()

Returns the identity factor.

Def: The identity factor of a factor has the same scope and cardinality as the original factor, but the value for every assignment is 1. When the identity factor is multiplied with the factor, it returns the factor itself.

Returns

The identity factor.

Return type

DiscreteFactor

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi_identity = phi.identity_factor()
>>> phi_identity.variables
['x1', 'x2', 'x3']
>>> phi_identity.values
array([[[ 1.,  1.],
        [ 1.,  1.],
        [ 1.,  1.]],
       [[ 1.,  1.],
        [ 1.,  1.],
        [ 1.,  1.]]])

marginalize(variables, inplace=True)

Modifies the factor with marginalized values.

Parameters
• variables (list, array-like) – List of variables over which to marginalize.

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.marginalize(['x1', 'x3'])
>>> phi.values
array([ 14.,  22.,  30.])
>>> phi.variables
['x2']
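
Marginalizing a factor is a sum over the axes of the removed variables (unlike a CPD, the result is not renormalized). A NumPy sketch of the example, with illustrative variable names:

```python
import numpy as np

variables = ['x1', 'x2', 'x3']
values = np.arange(12).reshape(2, 3, 2)

# Sum over the axes that belong to the variables being removed.
to_sum = ['x1', 'x3']
axes = tuple(variables.index(v) for v in to_sum)
marginal = values.sum(axis=axes)
remaining = [v for v in variables if v not in to_sum]
print(remaining, marginal)
```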

maximize(variables, inplace=True)

Maximizes the factor with respect to variables.

Parameters
• variables (list, array-like) – List of variables with respect to which factor is to be maximized

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [3, 2, 2], [0.25, 0.35, 0.08, 0.16, 0.05, 0.07,
...                                              0.00, 0.00, 0.15, 0.21, 0.09, 0.18])
>>> phi.variables
['x1','x2','x3']
>>> phi.maximize(['x2'])
>>> phi.variables
['x1', 'x3']
>>> phi.cardinality
array([3, 2])
>>> phi.values
array([[ 0.25,  0.35],
       [ 0.05,  0.07],
       [ 0.15,  0.21]])
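
Maximization has the same shape as marginalization, with `max` in place of `sum`. The following NumPy sketch reproduces the example values:

```python
import numpy as np

# Axes: (x1, x2, x3) with cardinalities (3, 2, 2).
values = np.array([0.25, 0.35, 0.08, 0.16, 0.05, 0.07,
                   0.00, 0.00, 0.15, 0.21, 0.09, 0.18]).reshape(3, 2, 2)

# Take the maximum over the x2 axis; the remaining axes are (x1, x3).
maximized = values.max(axis=1)
print(maximized)
```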

normalize(inplace=True)

Normalizes the values of factor so that they sum to 1.

Parameters

inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.values
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],
       [[ 6,  7],
        [ 8,  9],
        [10, 11]]])
>>> phi.normalize()
>>> phi.variables
['x1', 'x2', 'x3']
>>> phi.cardinality
array([2, 3, 2])
>>> phi.values
array([[[ 0.        ,  0.01515152],
        [ 0.03030303,  0.04545455],
        [ 0.06060606,  0.07575758]],
       [[ 0.09090909,  0.10606061],
        [ 0.12121212,  0.13636364],
        [ 0.15151515,  0.16666667]]])

product(phi1, inplace=True)

DiscreteFactor product with phi1.

Parameters
• phi1 (DiscreteFactor instance) – DiscreteFactor to be multiplied.

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> phi1.product(phi2, inplace=True)
>>> phi1.variables
['x1', 'x2', 'x3', 'x4']
>>> phi1.cardinality
array([2, 3, 2, 2])
>>> phi1.values
array([[[[ 0,  0],
         [ 4,  6]],
        [[ 0,  4],
         [12, 18]],
        [[ 0,  8],
         [20, 30]]],
       [[[ 6, 18],
         [35, 49]],
        [[ 8, 24],
         [45, 63]],
        [[10, 30],
         [55, 77]]]])
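
The factor product multiplies entries that agree on the shared variables and keeps one axis for every variable in the union of the scopes. One way to sketch this outside pgmpy is `numpy.einsum`, with one index letter per variable:

```python
import numpy as np

phi1 = np.arange(12).reshape(2, 3, 2)  # axes (x1, x2, x3)
phi2 = np.arange(8).reshape(2, 2, 2)   # axes (x3, x4, x1)

# Index letters: a = x1, b = x2, c = x3, d = x4.
# Shared letters (a, c) are matched; no letter is summed out.
product = np.einsum('abc,cda->abcd', phi1, phi2)
print(product.shape)
```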

reduce(values, inplace=True)

Reduces the factor to the context of given variable values.

Parameters
• values (list, array-like) – A list of tuples of the form (variable_name, variable_state).

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi.reduce([('x1', 0), ('x2', 0)])
>>> phi.variables
['x3']
>>> phi.cardinality
array([2])
>>> phi.values
array([0., 1.])

scope()

Returns the scope of the factor.

Returns

List of variable names in the scope of the factor.

Return type

list

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12))
>>> phi.scope()
['x1', 'x2', 'x3']

sum(phi1, inplace=True)

DiscreteFactor sum with phi1.

Parameters
• phi1 (DiscreteFactor instance.) – DiscreteFactor to be added.

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new DiscreteFactor instance if inplace=False.

Return type

DiscreteFactor or None

Examples

>>> from pgmpy.factors.discrete import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> phi1.sum(phi2, inplace=True)
>>> phi1.variables
['x1', 'x2', 'x3', 'x4']
>>> phi1.cardinality
array([2, 3, 2, 2])
>>> phi1.values
array([[[[ 0,  2],
         [ 5,  7]],
        [[ 2,  4],
         [ 7,  9]],
        [[ 4,  6],
         [ 9, 11]]],
       [[[ 7,  9],
         [12, 14]],
        [[ 9, 11],
         [14, 16]],
        [[11, 13],
         [16, 18]]]])

class pgmpy.factors.discrete.DiscreteFactor.State(var, state)
property state

Alias for field number 1

property var

Alias for field number 0

### Joint Probability Distribution

class pgmpy.factors.discrete.JointProbabilityDistribution.JointProbabilityDistribution(variables, cardinality, values)

Base class for Joint Probability Distribution

check_independence(event1, event2, event3=None, condition_random_variable=False)

Check if the Joint Probability Distribution satisfies the given independence condition.

Parameters
• event1 (list) – random variable whose independence is to be checked.

• event2 (list) – random variable from which event1 is independent.

• event3 (2D array or list like or 1D array or list like) – The values on which to condition the Joint Probability Distribution: either a 2D list of tuples of the form (variable_name, variable_state), or a 1D list or array-like of random variables to condition over (condition_random_variable must be True).

• condition_random_variable (Boolean (default False)) – If True and event3 is not None, the independence condition is checked over the random variable.

For random variables X, Y, Z, to check whether X is independent of Y given Z: event1 should be either X or Y, event2 should be either Y or X, and event3 should be Z.

Examples

>>> from pgmpy.factors.discrete import JointProbabilityDistribution as JPD
>>> prob = JPD(['I','D','G'],[2,2,3],
...            [0.126,0.168,0.126,0.009,0.045,0.126,0.252,0.0224,0.0056,0.06,0.036,0.024])
>>> prob.check_independence(['I'], ['D'])
True
>>> prob.check_independence(['I'], ['D'], [('G', 1)])  # Conditioning over G_1
False
>>> # Conditioning over random variable G
>>> prob.check_independence(['I'], ['D'], ('G',), condition_random_variable=True)
False
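
The first two results can be verified numerically: two variables are independent exactly when their joint table equals the outer product of its marginals. The sketch below applies that test with NumPy; the helper `independent` and its use of `np.allclose` tolerances are assumptions for illustration, not pgmpy's implementation.

```python
import numpy as np

# Joint table from the example, axes ordered as (I, D, G).
joint = np.array([0.126, 0.168, 0.126, 0.009, 0.045, 0.126,
                  0.252, 0.0224, 0.0056, 0.06, 0.036, 0.024]).reshape(2, 2, 3)

def independent(p_xy):
    # X and Y are independent iff P(X, Y) equals the outer product P(X) P(Y).
    p_xy = p_xy / p_xy.sum()
    return bool(np.allclose(p_xy, np.outer(p_xy.sum(axis=1), p_xy.sum(axis=0))))

marginal_check = independent(joint.sum(axis=2))   # I vs D, with G summed out
conditional_check = independent(joint[:, :, 1])   # I vs D given G = 1
print(marginal_check, conditional_check)
```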

conditional_distribution(values, inplace=True)

Returns Conditional Probability Distribution after setting values to 1.

Parameters
• values (list or array_like) – A list of tuples of the form (variable_name, variable_state). The values on which to condition the Joint Probability Distribution.

• inplace (Boolean (default True)) – If False returns a new instance of JointProbabilityDistribution

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 2, 2], np.ones(8)/8)
>>> prob.conditional_distribution([('x1', 1)])
>>> print(prob)
x2    x3      P(x2,x3)
----  ----  ----------
x2_0  x3_0      0.2500
x2_0  x3_1      0.2500
x2_1  x3_0      0.2500
x2_1  x3_1      0.2500

copy()

Returns a copy of the JointProbabilityDistribution object.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> prob_copy = prob.copy()
>>> np.array_equal(prob_copy.values, prob.values)
True
>>> prob_copy.variables == prob.variables
True
>>> prob_copy.variables = 'y'
>>> prob_copy.variables == prob.variables
False

get_independencies(condition=None)

Returns the independent variables in the joint probability distribution. Returns marginally independent variables if condition=None. Returns conditionally independent variables if condition!=None

Parameters

condition (array_like) – Random Variable on which to condition the Joint Probability Distribution.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> prob.get_independencies()
(x1 _|_ x2)
(x1 _|_ x3)
(x2 _|_ x3)

is_imap(model)

Checks whether the given BayesianModel is an Imap of the JointProbabilityDistribution.

Parameters

model (An instance of BayesianModel class) – The model for which the Imap is to be checked.

Returns

True if the given Bayesian model is an Imap of the Joint Probability Distribution, False otherwise.

Return type

boolean

Examples

>>> from pgmpy.models import BayesianModel
>>> from pgmpy.factors.discrete import TabularCPD
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> bm = BayesianModel([('diff', 'grade'), ('intel', 'grade')])
>>> diff_cpd = TabularCPD('diff', 2, [[0.2], [0.8]])
>>> intel_cpd = TabularCPD('intel', 3, [[0.5], [0.3], [0.2]])
>>> grade_cpd = TabularCPD('grade', 3,
...                        [[0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.8,0.8,0.8,0.8,0.8,0.8]],
...                        evidence=['diff', 'intel'],
...                        evidence_card=[2, 3])
>>> bm.add_cpds(diff_cpd, intel_cpd, grade_cpd)
>>> val = [0.01, 0.01, 0.08, 0.006, 0.006, 0.048, 0.004, 0.004, 0.032,
...        0.04, 0.04, 0.32, 0.024, 0.024, 0.192, 0.016, 0.016, 0.128]
>>> JPD = JointProbabilityDistribution(['diff', 'intel', 'grade'], [2, 3, 3], val)
>>> JPD.is_imap(bm)
True

marginal_distribution(variables, inplace=True)

Returns the marginal distribution over variables.

Parameters
• variables (string, list, tuple, set, dict) – Variable or list of variables over which marginal distribution needs to be calculated

• inplace (Boolean (default True)) – If False return a new instance of JointProbabilityDistribution

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> values = np.random.rand(12)
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], values/np.sum(values))
>>> prob.marginal_distribution(['x1', 'x2'])
>>> print(prob)
x1    x2      P(x1,x2)
----  ----  ----------
x1_0  x2_0      0.1502
x1_0  x2_1      0.1626
x1_0  x2_2      0.1197
x1_1  x2_0      0.2339
x1_1  x2_1      0.1996
x1_1  x2_2      0.1340

minimal_imap(order)

Returns a Bayesian Model which is minimal IMap of the Joint Probability Distribution considering the order of the variables.

Parameters

order (array-like) – The order of the random variables.

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> bayesian_model = prob.minimal_imap(order=['x2', 'x1', 'x3'])
>>> bayesian_model
<pgmpy.models.models.models at 0x7fd7440a9320>
>>> bayesian_model.edges()
[('x1', 'x3'), ('x2', 'x3')]

to_factor()

Returns JointProbabilityDistribution as a DiscreteFactor object

Examples

>>> import numpy as np
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> prob = JointProbabilityDistribution(['x1', 'x2', 'x3'], [2, 3, 2], np.ones(12)/12)
>>> phi = prob.to_factor()
>>> type(phi)
pgmpy.factors.DiscreteFactor.DiscreteFactor


## Continuous

### Canonical Factor

The intermediate factors in a Gaussian network can be described compactly using a simple parametric representation called the canonical form. This representation is closed under the basic operations used in inference: factor product, factor division, factor reduction, and marginalization. Thus, we define this CanonicalDistribution class that allows the inference process to be performed on joint Gaussian networks.

A canonical form C (X; K,h,g) is defined as

C (X; K,h,g) = exp( ((-1/2) * X.T * K * X) + (h.T * X) + g)
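
The canonical form can be evaluated directly from its definition. The sketch below also uses the standard correspondence between a Gaussian N(mean, cov) and its canonical parameters (K is the inverse covariance, h = K mean, and g absorbs the normalizing constant). The function names are illustrative and are not the `CanonicalDistribution` API.

```python
import numpy as np

def canonical_form(x, K, h, g):
    """Evaluate C(x; K, h, g) = exp(-1/2 x^T K x + h^T x + g)."""
    x = np.asarray(x, dtype=float)
    return float(np.exp(-0.5 * x @ K @ x + h @ x + g))

def gaussian_to_canonical(mean, cov):
    """Canonical parameters of N(mean, cov): K = cov^-1, h = K mean."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    K = np.linalg.inv(cov)
    h = K @ mean
    n = mean.shape[0]
    # g collects the terms that do not depend on x, including the normalizer.
    g = -0.5 * mean @ K @ mean - 0.5 * (n * np.log(2 * np.pi) + np.log(np.linalg.det(cov)))
    return K, h, g

# Standard bivariate normal evaluated at (1, 1).
K, h, g = gaussian_to_canonical([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
density = canonical_form([1.0, 1.0], K, h, g)
print(density)
```

For the standard bivariate normal this evaluates to exp(-1) / (2π), the same density that a direct Gaussian pdf gives at (1, 1).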

References

Probabilistic Graphical Models, Principles and Techniques, Daphne Koller and Nir Friedman, Section 14.2, Chapter 14.

### Continuous Factor

class pgmpy.factors.continuous.ContinuousFactor.ContinuousFactor(variables, pdf, *args, **kwargs)

Base class for factors representing various multivariate representations.

assignment(*args)

Returns a list of pdf assignments for the corresponding values.

Parameters

*args (values) – Values whose assignment is to be computed.

Examples

>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from scipy.stats import multivariate_normal
>>> normal_pdf = lambda x1, x2: multivariate_normal.pdf((x1, x2), [0, 0], [[1, 0], [0, 1]])
>>> phi = ContinuousFactor(['x1', 'x2'], normal_pdf)
>>> phi.assignment(1, 2)
0.013064233284684921

copy()

Return a copy of the distribution.

Returns

A copy of the distribution.

Return type

ContinuousFactor

Examples

>>> import numpy as np
>>> from scipy.special import beta
>>> from pgmpy.factors.continuous import ContinuousFactor
>>> # Two variable dirichlet distribution with alpha = (1,2)
>>> def dirichlet_pdf(x, y):
...     return (np.power(x, 1) * np.power(y, 2)) / beta(x, y)
>>> dirichlet_factor = ContinuousFactor(['x', 'y'], dirichlet_pdf)
>>> dirichlet_factor.variables
['x', 'y']
>>> copy_factor = dirichlet_factor.copy()
>>> copy_factor.variables
['x', 'y']

discretize(method, *args, **kwargs)

Discretizes the continuous distribution into discrete probability masses using various methods.

Parameters
• method (A Discretizer class from pgmpy.discretize) – The discretization method to use.

• *args, **kwargs – The parameters to be given to the Discretizer class.

Returns

An n-D array or a DiscreteFactor object, according to the discretization method used.

Examples

>>> import numpy as np
>>> from scipy.special import beta
>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from pgmpy.factors.continuous import RoundingDiscretizer
>>> def dirichlet_pdf(x, y):
...     return (np.power(x, 1) * np.power(y, 2)) / beta(x, y)
>>> dirichlet_factor = ContinuousFactor(['x', 'y'], dirichlet_pdf)
>>> dirichlet_factor.discretize(RoundingDiscretizer, low=1, high=2, cardinality=5)
# TODO: finish this

divide(other, inplace=True)

Divides the ContinuousFactor by the other factor.

Parameters

other (ContinuousFactor) – The ContinuousFactor to divide by.

Returns

None if inplace=True (default); a new ContinuousFactor instance if inplace=False.

Return type

ContinuousFactor or None

Examples

>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from scipy.stats import multivariate_normal
>>> sn_pdf1 = lambda x: multivariate_normal.pdf([x], [0], [[1]])
>>> sn_pdf2 = lambda x1,x2: multivariate_normal.pdf([x1, x2], [0, 0], [[1, 0], [0, 1]])
>>> sn1 = ContinuousFactor(['x2'], sn_pdf1)
>>> sn2 = ContinuousFactor(['x1', 'x2'], sn_pdf2)

>>> sn4 = sn2.divide(sn1, inplace=False)
>>> sn4.assignment(0, 0)
0.3989422804014327

>>> sn4 = sn2 / sn1
>>> sn4.assignment(0, 0)
0.3989422804014327

marginalize(variables, inplace=True)

Marginalize the factor with respect to the given variables.

Parameters
• variables (list, array-like) – List of variables with respect to which factor is to be marginalized.

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new ContinuousFactor instance.

Returns

None if inplace=True (default); a new ContinuousFactor instance if inplace=False.

Return type

ContinuousFactor or None

Examples

>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from scipy.stats import multivariate_normal
>>> std_normal_pdf = lambda *x: multivariate_normal.pdf(x, [0, 0], [[1, 0], [0, 1]])
>>> std_normal = ContinuousFactor(['x1', 'x2'], std_normal_pdf)
>>> std_normal.scope()
['x1', 'x2']
>>> std_normal.assignment(1, 1)
0.058549831524319168
>>> std_normal.marginalize(['x2'])
>>> std_normal.scope()
['x1']
>>> std_normal.assignment(1)

normalize(inplace=True)

Normalizes the pdf of the continuous factor so that it integrates to 1 over all the variables.

Parameters

inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new factor.

Returns

None if inplace=True (default); a new ContinuousFactor instance if inplace=False.

Return type

ContinuousFactor or None

Examples

>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from scipy.stats import multivariate_normal
>>> std_normal_pdf = lambda x1, x2: 2 * multivariate_normal.pdf([x1, x2], [0, 0], [[1, 0], [0, 1]])
>>> std_normal = ContinuousFactor(['x1', 'x2'], std_normal_pdf)
>>> std_normal.assignment(1, 1)
0.117099663049
>>> std_normal.normalize()
>>> std_normal.assignment(1, 1)
0.0585498315243

property pdf

Returns the pdf of the ContinuousFactor.

product(other, inplace=True)

Gives the ContinuousFactor product with the other factor.

Parameters

other (ContinuousFactor) – The ContinuousFactor to be multiplied.

Returns

None if inplace=True (default); a new ContinuousFactor instance if inplace=False.

Return type

ContinuousFactor or None

Examples

>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from scipy.stats import multivariate_normal
>>> sn_pdf1 = lambda x: multivariate_normal.pdf([x], [0], [[1]])
>>> sn_pdf2 = lambda x1,x2: multivariate_normal.pdf([x1, x2], [0, 0], [[1, 0], [0, 1]])
>>> sn1 = ContinuousFactor(['x2'], sn_pdf1)
>>> sn2 = ContinuousFactor(['x1', 'x2'], sn_pdf2)

>>> sn3 = sn1.product(sn2, inplace=False)
>>> sn3.assignment(0, 0)
0.063493635934240983

>>> sn3 = sn1 * sn2
>>> sn3.assignment(0, 0)
0.063493635934240983

reduce(values, inplace=True)

Reduces the factor to the context of the given variable values.

Parameters
• values (list, array-like) – A list of tuples of the form (variable_name, variable_value).

• inplace (boolean) – If inplace=True it will modify the factor itself, else would return a new ContinuousFactor object.

Returns

None if inplace=True (default); a new ContinuousFactor instance if inplace=False.

Return type

ContinuousFactor or None

Examples

>>> import numpy as np
>>> from scipy.special import beta
>>> from pgmpy.factors.continuous import ContinuousFactor
>>> def custom_pdf(x, y, z):
...     return z*(np.power(x, 1) * np.power(y, 2)) / beta(x, y)
>>> custom_factor = ContinuousFactor(['x', 'y', 'z'], custom_pdf)
>>> custom_factor.variables
['x', 'y', 'z']
>>> custom_factor.assignment(1, 2, 3)
24.0

>>> custom_factor.reduce([('y', 2)])
>>> custom_factor.variables
['x', 'z']
>>> custom_factor.assignment(1, 3)
24.0

scope()

Returns the scope of the factor.

Returns

List of variable names in the scope of the factor.

Return type

list

Examples

>>> from pgmpy.factors.continuous import ContinuousFactor
>>> from scipy.stats import multivariate_normal
>>> normal_pdf = lambda x1, x2: multivariate_normal.pdf([x1, x2], [0, 0], [[1, 0], [0, 1]])
>>> phi = ContinuousFactor(['x1', 'x2'], normal_pdf)
>>> phi.scope()
['x1', 'x2']


### Linear Gaussian CPD

class pgmpy.factors.continuous.LinearGaussianCPD.LinearGaussianCPD(variable, evidence_mean, evidence_variance, evidence=[], beta=None)

For, X -> Y the Linear Gaussian model assumes that the mean of Y is a linear function of mean of X and the variance of Y does not depend on X.

For example, Here, is the mean of the variable .

Let be a continuous variable with continuous parents . We say that has a linear Gaussian CPD if there are parameters

$\beta_0, \beta_1, \ldots, \beta_k$

and such that,

$p(Y \mid x_1, x_2, \ldots, x_k) = \mathcal{N}(\beta_0 + x_1 \beta_1 + \cdots + x_k \beta_k ;\ \sigma^2)$

In vector notation,

$p(Y \mid \mathbf{x}) = \mathcal{N}(\beta_0 + \boldsymbol{\beta}^T \mathbf{x} ;\ \sigma^2)$
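As a rough illustration of the linear Gaussian form, the sketch below evaluates the conditional mean and density in plain Python. It does not call pgmpy; the function names `linear_gaussian_mean` and `linear_gaussian_pdf` are hypothetical, and the parameter values (betas `[0.2, -2, 3, 7]`, variance `9.6`) are borrowed from the `copy()` example in this section.

```python
import math


def linear_gaussian_mean(betas, x):
    """Return beta_0 + beta_1*x_1 + ... + beta_k*x_k (hypothetical helper)."""
    beta_0, weights = betas[0], betas[1:]
    if len(weights) != len(x):
        raise ValueError("need exactly one parent value per beta_1..beta_k")
    return beta_0 + sum(b * xi for b, xi in zip(weights, x))


def linear_gaussian_pdf(y, betas, variance, x):
    """Density of N(beta_0 + beta^T x ; variance) evaluated at y."""
    mean = linear_gaussian_mean(betas, x)
    norm = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return norm * math.exp(-((y - mean) ** 2) / (2.0 * variance))


betas = [0.2, -2, 3, 7]      # beta_0 followed by one weight per parent
variance = 9.6               # does not depend on the parent values
parents = [1.0, 1.0, 1.0]    # assumed values for X1, X2, X3

mean = linear_gaussian_mean(betas, parents)  # 0.2 - 2 + 3 + 7
density_at_mean = linear_gaussian_pdf(mean, betas, variance, parents)
```

Only the mean moves with the parent values; the variance stays fixed, which is the defining assumption of the model.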

References

1

https://cedar.buffalo.edu/~srihari/CSE574/Chap8/Ch8-PGM-GaussianBNs/8.5%20GaussianBNs.pdf

copy()[source]

Returns a copy of the distribution.

Returns

copy of the distribution

Return type

LinearGaussianCPD

Examples

>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> cpd = LinearGaussianCPD('Y',  [0.2, -2, 3, 7], 9.6, ['X1', 'X2', 'X3'])
>>> copy_cpd = cpd.copy()
>>> copy_cpd.variable
'Y'
>>> copy_cpd.evidence
['X1', 'X2', 'X3']

fit(data, states, estimator=None, complete_samples_only=True, **kwargs)[source]

Determine the βs from data.

Parameters
• data (pandas.DataFrame) – Dataframe containing samples from the conditional distribution, p(Y|X).

• estimator – ‘MLE’ or ‘MAP’.

• complete_samples_only (boolean (True or False)) – whether the samples are complete rather than downsampled. Defaults to True.

maximum_likelihood_estimator(data, states)[source]

Fit using MLE method.

Parameters
• data (pandas.DataFrame or 2D array) – Dataframe of values containing samples from the conditional distribution, p(Y|X), and the corresponding X values.

• states – All the input states that are jointly Gaussian.

Returns

The estimated betas and the variance.

Return type

beta, variance (tuple)
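To make the idea of maximum likelihood fitting concrete, here is a minimal sketch for the special case of a single parent, written in plain Python. It is not pgmpy's implementation: the function name `fit_single_parent` and the toy data are assumptions made for illustration. It relies on the standard result that, for Gaussian noise, the maximum likelihood betas are the least-squares coefficients and the maximum likelihood variance is the mean squared residual (dividing by n, not n - 1).

```python
def fit_single_parent(xs, ys):
    """MLE of (beta_0, beta_1) and the variance for p(Y|X) with one parent.

    Hypothetical helper; the betas are the least-squares coefficients and
    the variance is the mean squared residual (divides by n).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta_1 = sxy / sxx
    beta_0 = mean_y - beta_1 * mean_x
    residuals = [y - (beta_0 + beta_1 * x) for x, y in zip(xs, ys)]
    variance = sum(r * r for r in residuals) / n
    return (beta_0, beta_1), variance


# Toy samples (made up for this sketch): X values and the observed Y values.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.5, 2.5, 5.5, 6.5]
(beta_0, beta_1), variance = fit_single_parent(xs, ys)
```

With several parents the same estimate comes from solving the normal equations of the least-squares problem instead of the scalar formulas used here.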

## Discretizing Methods¶

class pgmpy.factors.continuous.discretize.BaseDiscretizer(factor, low, high, cardinality)[source]

Base class for the discretizer classes in pgmpy. The discretizer classes are used to discretize a continuous random variable distribution into discrete probability masses.

Parameters
• factor (A ContinuousNode or a ContinuousFactor object) – the continuous node or factor representing the distribution to be discretized.

• low, high – the range over which the function will be discretized.

• cardinality (int) – the number of states required in the discretized output.

Examples

>>> from scipy.stats import norm
>>> from pgmpy.factors.continuous import ContinuousNode
>>> normal = ContinuousNode(norm(0, 1).pdf)
>>> from pgmpy.factors.continuous.discretize import BaseDiscretizer
>>> class ChildDiscretizer(BaseDiscretizer):
...     def get_discrete_values(self):
...         pass
>>> discretizer = ChildDiscretizer(normal, -3, 3, 10)
>>> discretizer.factor
<pgmpy.factors.continuous.ContinuousNode.ContinuousNode object at 0x04C98190>
>>> discretizer.cardinality
10
>>> discretizer.get_labels()
['x=-3.0', 'x=-2.4', 'x=-1.8', 'x=-1.2', 'x=-0.6', 'x=0.0', 'x=0.6', 'x=1.2', 'x=1.8', 'x=2.4']

abstract get_discrete_values()[source]

This method implements the algorithm to discretize the given continuous distribution.

It must be implemented by all the subclasses of BaseDiscretizer.

Returns

A list of discrete values or a DiscreteFactor object.

get_labels()[source]

Returns a list of strings representing the values about which the discretization method calculates the probability masses.

The default value is the list of points [low, low+step, low+2*step, ..., high-step], unless the method is overridden by a subclass.

Examples

>>> from scipy.stats import norm
>>> from pgmpy.factors.continuous import ContinuousNode
>>> from pgmpy.factors.continuous.discretize import BaseDiscretizer
>>> class ChildDiscretizer(BaseDiscretizer):
...     def get_discrete_values(self):
...         pass
>>> node = ContinuousNode(norm(0).pdf)
>>> child = ChildDiscretizer(node, -5, 5, 20)
>>> child.get_labels()
['x=-5.0', 'x=-4.5', 'x=-4.0', 'x=-3.5', 'x=-3.0', 'x=-2.5',
'x=-2.0', 'x=-1.5', 'x=-1.0', 'x=-0.5', 'x=0.0', 'x=0.5', 'x=1.0',
'x=1.5', 'x=2.0', 'x=2.5', 'x=3.0', 'x=3.5', 'x=4.0', 'x=4.5']
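The default labels described above can be reproduced without pgmpy. The sketch below is a plain-Python approximation, not the library's code: the function name `default_labels` is hypothetical, and rounding each point to three decimals is an assumption made so that floating-point noise does not leak into the labels.

```python
def default_labels(low, high, cardinality):
    """Labels for the points low, low+step, ..., high-step (hypothetical helper)."""
    step = (high - low) / cardinality
    # Rounding to 3 decimals is an assumption; it hides floating-point noise
    # such as -1.2000000000000002.
    return [f"x={round(low + i * step, 3)}" for i in range(cardinality)]


labels_10 = default_labels(-3, 3, 10)
labels_20 = default_labels(-5, 5, 20)
```

Note that `high` itself never appears: with `cardinality` points spaced `step` apart, the last label is `high - step`.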

class pgmpy.factors.continuous.discretize.RoundingDiscretizer(factor, low, high, cardinality)[source]

This class uses the rounding method for discretizing the given continuous distribution.

For the rounding method, the probability mass is

cdf(x+step/2) - cdf(x), for x = low

cdf(x+step/2) - cdf(x-step/2), for low < x <= high

where cdf is the cumulative distribution function of the distribution and step = (high-low)/cardinality.

Examples

>>> import numpy as np
>>> from pgmpy.factors.continuous import ContinuousNode
>>> from pgmpy.factors.continuous import RoundingDiscretizer
>>> std_normal_pdf = lambda x : np.exp(-x*x/2) / (np.sqrt(2*np.pi))
>>> std_normal = ContinuousNode(std_normal_pdf)
>>> std_normal.discretize(RoundingDiscretizer, low=-3, high=3,
...                       cardinality=12)
[0.001629865203424451, 0.009244709419989363, 0.027834684208773178,
0.065590616803038182, 0.120977578710013, 0.17466632194020804,
0.19741265136584729, 0.17466632194020937, 0.12097757871001302,
0.065590616803036905, 0.027834684208772664, 0.0092447094199902269]
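The rounding formulas can be checked by hand for the standard normal example above. The following sketch is a plain-Python approximation rather than pgmpy's code: `std_normal_cdf` and `rounding_masses` are hypothetical names, the cdf is computed from `math.erf` instead of numerical integration, and the points are assumed to be low, low+step, ..., high-step so that exactly `cardinality` masses are produced.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rounding_masses(cdf, low, high, cardinality):
    """Probability masses of the rounding method (hypothetical helper)."""
    step = (high - low) / cardinality
    # First point: only the half-interval to the right of low is counted.
    masses = [cdf(low + step / 2) - cdf(low)]
    # Remaining points: the interval of width step centred on x.
    for i in range(1, cardinality):
        x = low + i * step
        masses.append(cdf(x + step / 2) - cdf(x - step / 2))
    return masses


masses = rounding_masses(std_normal_cdf, -3, 3, 12)
```

Up to numerical precision, the values agree with the list printed in the example, for instance about 0.00163 for the first point and about 0.19741 for the point x = 0.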

get_discrete_values()[source]

This method implements the algorithm to discretize the given continuous distribution.

It must be implemented by all the subclasses of BaseDiscretizer.

Returns

A list of discrete values or a DiscreteFactor object.

class pgmpy.factors.continuous.discretize.UnbiasedDiscretizer(factor, low, high, cardinality)[source]

This class uses the unbiased method for discretizing the given continuous distribution.

The unbiased method for discretization matches the first moment of the distribution. It relies on the first-order limited moment of the distribution, which is computed by the _lim_moment method.

For this method, the probability mass is

(E(x) - E(x + step))/step + 1 - cdf(x), for x = low

(2 * E(x) - E(x - step) - E(x + step))/step, for low < x < high

(E(x) - E(x - step))/step - 1 + cdf(x), for x = high

where E(x) is the first limited moment of the distribution about the point x, cdf is the cumulative distribution function and step = (high-low)/(cardinality-1), so that both low and high are among the points.

References

Klugman, S. A., Panjer, H. H. and Willmot, G. E., Loss Models, From Data to Decisions, Fourth Edition, Wiley, section 9.6.5.2 (Method of local moment matching) and exercise 9.41.

Examples

>>> import numpy as np
>>> from pgmpy.factors.continuous import ContinuousNode
>>> from pgmpy.factors.continuous import UnbiasedDiscretizer
>>> # exponential distribution with rate = 2
>>> exp_pdf = lambda x: 2*np.exp(-2*x) if x>=0 else 0
>>> exp_node = ContinuousNode(exp_pdf)
>>> exp_node.discretize(UnbiasedDiscretizer, low=0, high=5, cardinality=10)
[0.39627368905806137, 0.4049838434034298, 0.13331784003148325,
0.043887287876647259, 0.014447413395300212, 0.0047559685431339703,
0.0015656350182896128, 0.00051540201980112557, 0.00016965346326140994,
3.7867260839208328e-05]
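The three-case formula for the unbiased method can be traced through for the exponential example above. The sketch below is a plain-Python approximation, not pgmpy's code, and rests on two assumptions: E(x) is read as the limited expected value E[min(X, x)], as in the cited Klugman et al. reference, and the points are spaced (high - low)/(cardinality - 1) apart so that both low and high are included, which is the spacing that reproduces the printed output. The names `exp_cdf`, `exp_limited_moment` and `unbiased_masses` are hypothetical.

```python
import math

RATE = 2.0  # rate of the exponential distribution used in the example


def exp_cdf(x):
    """Cumulative distribution function of the exponential distribution."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def exp_limited_moment(u):
    """E[min(X, u)] in closed form; for u < 0 the minimum is always u."""
    return (1.0 - math.exp(-RATE * u)) / RATE if u >= 0 else u


def unbiased_masses(cdf, lim_moment, low, high, cardinality):
    """Probability masses of the unbiased method (hypothetical helper)."""
    # Assumed spacing: cardinality points including both low and high.
    step = (high - low) / (cardinality - 1)
    E = lim_moment
    # x = low
    masses = [(E(low) - E(low + step)) / step + 1.0 - cdf(low)]
    # low < x < high
    for i in range(1, cardinality - 1):
        x = low + i * step
        masses.append((2.0 * E(x) - E(x - step) - E(x + step)) / step)
    # x = high
    masses.append((E(high) - E(high - step)) / step - 1.0 + cdf(high))
    return masses


masses = unbiased_masses(exp_cdf, exp_limited_moment, 0.0, 5.0, 10)
```

Using the closed-form limited moment avoids numerical integration, so small differences from the printed values in the last digits are expected.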

get_discrete_values()[source]

This method implements the algorithm to discretize the given continuous distribution.

It must be implemented by all the subclasses of BaseDiscretizer.

Returns

A list of discrete values or a DiscreteFactor object.

get_labels()[source]

Returns a list of strings representing the values about which the discretization method calculates the probability masses.

The default value is the list of points [low, low+step, low+2*step, ..., high-step], unless the method is overridden by a subclass.

Examples

>>> from scipy.stats import norm
>>> from pgmpy.factors.continuous import ContinuousNode
>>> from pgmpy.factors.continuous.discretize import BaseDiscretizer
>>> class ChildDiscretizer(BaseDiscretizer):
...     def get_discrete_values(self):
...         pass
>>> node = ContinuousNode(norm(0).pdf)
>>> child = ChildDiscretizer(node, -5, 5, 20)
>>> child.get_labels()
['x=-5.0', 'x=-4.5', 'x=-4.0', 'x=-3.5', 'x=-3.0', 'x=-2.5',
'x=-2.0', 'x=-1.5', 'x=-1.0', 'x=-0.5', 'x=0.0', 'x=0.5', 'x=1.0',
'x=1.5', 'x=2.0', 'x=2.5', 'x=3.0', 'x=3.5', 'x=4.0', 'x=4.5']