Maximum Likelihood Estimator

class pgmpy.estimators.MLE.MaximumLikelihoodEstimator(model, data, **kwargs)
estimate_cpd(node, weighted=False)

Estimates the conditional probability distribution (CPD) of a given variable by maximum likelihood.

Parameters
  • node (int, string (any hashable python object)) – The name of the variable for which the CPD is to be estimated.

  • weighted (bool) – If weighted=True, the data must contain a _weight column specifying the weight of each datapoint (row). If False, assigns an equal weight to each datapoint.

Returns

The estimated CPD for node.

Return type

TabularCPD

Examples

>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> model = BayesianNetwork([('A', 'C'), ('B', 'C')])
>>> cpd_A = MaximumLikelihoodEstimator(model, data).estimate_cpd('A')
>>> print(cpd_A)
╒══════╤══════════╕
│ A(0) │ 0.666667 │
├──────┼──────────┤
│ A(1) │ 0.333333 │
╘══════╧══════════╛
>>> cpd_C = MaximumLikelihoodEstimator(model, data).estimate_cpd('C')
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤══════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1) │
├──────┼──────┼──────┼──────┼──────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1) │
├──────┼──────┼──────┼──────┼──────┤
│ C(0) │ 0.0  │ 0.0  │ 1.0  │ 0.5  │
├──────┼──────┼──────┼──────┼──────┤
│ C(1) │ 1.0  │ 1.0  │ 0.0  │ 0.5  │
╘══════╧══════╧══════╧══════╧══════╛
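
The numbers in the two tables above can be reproduced by hand as relative frequencies. The following is a minimal sketch in plain Python, not pgmpy's implementation: the helper name mle_cpd, the row and state representations, and the uniform fallback for unobserved parent configurations are assumptions chosen to match the example output (note the 0.5 / 0.5 column for A(1), B(1), which never occurs in the data).

```python
from collections import Counter
from itertools import product


def mle_cpd(rows, node, parents, states):
    """Hypothetical helper: relative-frequency estimate of P(node | parents).

    rows   : list of dicts mapping variable name -> observed state
    states : dict mapping variable name -> list of possible states
    Returns a dict mapping each parent configuration (a tuple) to a list
    of probabilities, one per state of `node`.
    """
    # Count how often each (parent configuration, node state) pair occurs.
    counts = Counter(
        (tuple(row[p] for p in parents), row[node]) for row in rows
    )
    cpd = {}
    for config in product(*(states[p] for p in parents)):
        column = [counts[(config, s)] for s in states[node]]
        total = sum(column)
        if total == 0:
            # Assumption: an unobserved parent configuration gets a uniform
            # column, matching the 0.5 / 0.5 column in the example output.
            cpd[config] = [1 / len(column)] * len(column)
        else:
            cpd[config] = [c / total for c in column]
    return cpd


# Same three datapoints as in the example above.
rows = [
    {"A": 0, "B": 0, "C": 1},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 0},
]
states = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}

cpd_A = mle_cpd(rows, "A", [], states)          # single column keyed by ()
cpd_C = mle_cpd(rows, "C", ["A", "B"], states)  # one column per (A, B) pair
```

Here cpd_A[()] holds the marginal of A, and cpd_C[(a, b)] holds the column of the C table for A = a, B = b, listed in the order C(0), C(1).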
get_parameters(n_jobs=-1, weighted=False)

Method to estimate the model parameters (CPDs) using Maximum Likelihood Estimation.

Parameters
  • n_jobs (int (default: -1)) – Number of jobs to run in parallel. The default, -1, uses all available processors.

  • weighted (bool) – If weighted=True, the data must contain a _weight column specifying the weight of each datapoint (row). If False, assigns an equal weight to each datapoint.

Returns

  • parameters (list) – List of TabularCPDs, one for each variable of the model.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D')])
>>> estimator = MaximumLikelihoodEstimator(model, values)
>>> estimator.get_parameters()
[<TabularCPD representing P(C:2) at 0x7f7b534251d0>,
<TabularCPD representing P(B:2 | C:2, A:2) at 0x7f7b4dfd4da0>,
<TabularCPD representing P(A:2) at 0x7f7b4dfd4fd0>,
<TabularCPD representing P(D:2 | C:2) at 0x7f7b4df822b0>]
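
The weighted option described for both methods can be pictured as replacing row counts with sums of row weights. The sketch below is plain Python and only illustrates that idea for a variable without parents; the helper name weighted_marginal is hypothetical, and treating weights as multiplicities is an assumption rather than a statement about pgmpy's internals. Only the _weight column name comes from the parameter description.

```python
from collections import defaultdict


def weighted_marginal(rows, node, weight_key="_weight"):
    """Hypothetical helper: weighted relative frequencies for a root node."""
    totals = defaultdict(float)
    for row in rows:
        # Rows without a weight entry count as weight 1 (the unweighted case).
        totals[row[node]] += row.get(weight_key, 1.0)
    grand_total = sum(totals.values())
    return {state: w / grand_total for state, w in totals.items()}


rows = [
    {"A": 0, "_weight": 1.0},
    {"A": 0, "_weight": 1.0},
    {"A": 1, "_weight": 2.0},
]

# With weights, the single A = 1 row counts twice as much as each A = 0 row.
weighted = weighted_marginal(rows, "A")

# Dropping the weights gives every row equal weight.
unweighted = weighted_marginal([{"A": r["A"]} for r in rows], "A")
```

In this toy data the weights move the estimate for A from (2/3, 1/3) to (1/2, 1/2), which is the kind of shift to expect when some datapoints are given more influence than others.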