Estimators for Parameter and Structure Learning

Bayesian Estimator

class pgmpy.estimators.BayesianEstimator.BayesianEstimator(model, data, **kwargs)[source]
estimate_cpd(node, prior_type='BDeu', pseudo_counts=[], equivalent_sample_size=5)[source]

Method to estimate the CPD for a given variable.

node: int, string (any hashable python object)
The name of the variable for which the CPD is to be estimated.
prior_type: ‘dirichlet’, ‘BDeu’, or ‘K2’
string indicating which type of prior to use for the model parameters.

  • If ‘prior_type’ is ‘dirichlet’, ‘pseudo_counts’ must be provided: the Dirichlet hyperparameters, a list or dict with a “virtual” count for each variable state. The virtual counts are added to the actual state counts found in the data. (If a list is provided, a lexicographic ordering of states is assumed.)

  • If ‘prior_type’ is ‘BDeu’, an ‘equivalent_sample_size’ must be specified instead of ‘pseudo_counts’. This is equivalent to ‘prior_type=dirichlet’ with uniform ‘pseudo_counts’ of equivalent_sample_size/(node_cardinality*np.prod(parents_cardinalities)).

  • A prior_type of ‘K2’ is shorthand for ‘dirichlet’ with every pseudo_count set to 1, regardless of the cardinality of the variable.
CPD: TabularCPD

>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import BayesianEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> estimator = BayesianEstimator(model, data)
>>> cpd_C = estimator.estimate_cpd('C', prior_type="dirichlet", pseudo_counts=[1, 2])
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤════════════════════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(0) │ 0.25 │ 0.25 │ 0.5  │ 0.3333333333333333 │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(1) │ 0.75 │ 0.75 │ 0.5  │ 0.6666666666666666 │
╘══════╧══════╧══════╧══════╧════════════════════╛
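The BDeu pseudo count described above is uniform across all CPD cells; a small worked sketch (cardinalities chosen for illustration):

```python
from math import prod

# For a binary node with two binary parents and equivalent_sample_size = 5,
# every cell of the CPD receives the same uniform pseudo count:
# equivalent_sample_size / (node_cardinality * product of parent cardinalities).
equivalent_sample_size = 5
node_cardinality = 2
parents_cardinalities = [2, 2]

pseudo_count = equivalent_sample_size / (node_cardinality * prod(parents_cardinalities))
print(pseudo_count)  # 5 / (2 * 2 * 2) = 0.625
```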
get_parameters(prior_type='BDeu', equivalent_sample_size=5, pseudo_counts=None)[source]

Method to estimate the model parameters (CPDs).

prior_type: ‘dirichlet’, ‘BDeu’, or ‘K2’
string indicating which type of prior to use for the model parameters.

  • If ‘prior_type’ is ‘dirichlet’, ‘pseudo_counts’ must be provided: the Dirichlet hyperparameters, a dict containing, for each variable, a list with a “virtual” count for each variable state, which is added to the state counts. (A lexicographic ordering of states is assumed.)

  • If ‘prior_type’ is ‘BDeu’, an ‘equivalent_sample_size’ must be specified instead of ‘pseudo_counts’. This is equivalent to ‘prior_type=dirichlet’ with uniform ‘pseudo_counts’ of equivalent_sample_size/(node_cardinality*np.prod(parents_cardinalities)) for each node. ‘equivalent_sample_size’ can either be a numerical value or a dict that specifies the size for each variable separately.

  • A prior_type of ‘K2’ is shorthand for ‘dirichlet’ with every pseudo_count set to 1, regardless of the cardinality of the variable.

parameters: list
List of TabularCPDs, one for each variable of the model
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import BayesianEstimator
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model = BayesianModel([('A', 'B'), ('C', 'B'), ('C', 'D')])
>>> estimator = BayesianEstimator(model, values)
>>> estimator.get_parameters(prior_type='BDeu', equivalent_sample_size=5)
[<TabularCPD representing P(C:2) at 0x7f7b534251d0>,
<TabularCPD representing P(B:2 | C:2, A:2) at 0x7f7b4dfd4da0>,
<TabularCPD representing P(A:2) at 0x7f7b4dfd4fd0>,
<TabularCPD representing P(D:2 | C:2) at 0x7f7b4df822b0>]

Bdeu Score

class pgmpy.estimators.BdeuScore.BdeuScore(data, equivalent_sample_size=10, **kwargs)[source]
local_score(variable, parents)[source]

Computes a score that measures how much a given variable is “influenced” by a given list of potential parents.
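The description above can be made concrete with a self-contained sketch of the standard BDeu local score (the Dirichlet marginal likelihood with uniform pseudo counts). This illustrates the textbook definition, not pgmpy's internal implementation; data is assumed to be a list of dicts:

```python
from math import lgamma
from collections import Counter
from itertools import product

def bdeu_local_score(rows, variable, parents, equivalent_sample_size=10):
    """Illustrative BDeu local score for discrete data given as a list of dicts."""
    var_states = sorted({row[variable] for row in rows})
    parent_states = [sorted({row[p] for row in rows}) for p in parents]
    r = len(var_states)                              # cardinality of the variable
    q = 1
    for states in parent_states:
        q *= len(states)                             # number of parent configurations
    alpha_jk = equivalent_sample_size / (r * q)      # per-cell pseudo count
    alpha_j = equivalent_sample_size / q             # per-configuration pseudo count

    counts = Counter((tuple(row[p] for p in parents), row[variable]) for row in rows)

    score = 0.0
    for config in product(*parent_states):           # one empty config if no parents
        n_j = sum(counts[(config, s)] for s in var_states)
        score += lgamma(alpha_j) - lgamma(alpha_j + n_j)
        for s in var_states:
            score += lgamma(alpha_jk + counts[(config, s)]) - lgamma(alpha_jk)
    return score
```

On data where one variable deterministically follows another, the score with the informative parent exceeds the parentless score, which is what drives score-based structure search.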

Bic Score

class pgmpy.estimators.BicScore.BicScore(data, **kwargs)[source]
local_score(variable, parents)[source]

Computes a score that measures how much a given variable is “influenced” by a given list of potential parents.
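A hedged sketch of the standard BIC local score (an illustration of the definition, not pgmpy's implementation): the maximum-likelihood conditional log-likelihood of the variable given its parents, minus a penalty of (log N)/2 per free CPD parameter. Data is assumed to be a list of dicts:

```python
from math import log
from collections import Counter

def bic_local_score(rows, variable, parents):
    """Illustrative BIC local score: log-likelihood minus a complexity penalty."""
    n = len(rows)
    r = len({row[variable] for row in rows})         # cardinality of the variable
    joint = Counter((tuple(row[p] for p in parents), row[variable]) for row in rows)
    marg = Counter(tuple(row[p] for p in parents) for row in rows)
    # Maximum-likelihood conditional log-likelihood of `variable` given `parents`:
    log_lik = sum(c * log(c / marg[config]) for (config, _), c in joint.items())
    num_params = len(marg) * (r - 1)                 # free parameters of the CPD
    return log_lik - 0.5 * log(n) * num_params
```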

Constraint-Based Estimator

class pgmpy.estimators.ConstraintBasedEstimator.ConstraintBasedEstimator(data, **kwargs)[source]
static build_skeleton(nodes, independencies)[source]

Estimates a graph skeleton (UndirectedGraph) from a set of independencies using (the first part of) the PC algorithm. The independencies can either be provided as an instance of the Independencies-class or by passing a decision function that decides any conditional independency assertion. Returns a tuple (skeleton, separating_sets).

If an Independencies-instance is passed, the contained IndependenceAssertions have to admit a faithful BN representation. This is the case if they are obtained as a set of d-separations of some Bayesian network, or if the independence assertions are closed under the semi-graphoid axioms. Otherwise the procedure may fail to identify the correct structure.

nodes: list, array-like
A list of node/variable names of the network skeleton.
independencies: Independencies-instance or function.
The source of independency information from which to build the skeleton. The provided Independencies should admit a faithful representation. Can either be provided as an Independencies()-instance or by passing a function f(X, Y, Zs) that returns True when X _|_ Y | Zs, otherwise False. (X, Y being individual nodes and Zs a list of nodes).
skeleton: UndirectedGraph
An estimate for the undirected graph skeleton of the BN underlying the data.
separating_sets: dict
A dict containing, for each pair of not directly connected nodes, a separating set (“witnessing set”) of variables that makes them conditionally independent. (Needed for edge orientation procedures.)
[1] Neapolitan, Learning Bayesian Networks, Section 10.1.2, Algorithm 10.2 (page 550)
http://www.cs.technion.ac.il/~dang/books/Learning%20Bayesian%20Networks(Neapolitan,%20Richard).pdf
[2] Koller & Friedman, Probabilistic Graphical Models - Principles and Techniques, 2009
Section 3.4.2.1 (page 85), Algorithm 3.3
>>> from pgmpy.estimators import ConstraintBasedEstimator
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.independencies import Independencies
>>> # build skeleton from list of independencies:
... ind = Independencies(['B', 'C'], ['A', ['B', 'C'], 'D'])
>>> # we need to compute closure, otherwise this set of independencies doesn't
... # admit a faithful representation:
... ind = ind.closure()
>>> skel, sep_sets = ConstraintBasedEstimator.build_skeleton("ABCD", ind)
>>> print(skel.edges())
[('A', 'D'), ('B', 'D'), ('C', 'D')]
>>> # build skeleton from d-seperations of BayesianModel:
... model = BayesianModel([('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'E')])
>>> skel, sep_sets = ConstraintBasedEstimator.build_skeleton(model.nodes(), model.get_independencies())
>>> print(skel.edges())
[('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'E')]
estimate(significance_level=0.01)[source]

Estimates a BayesianModel for the data set, using the PC constraint-based structure learning algorithm. Independencies are identified from the data set using a chi-squared statistic with the acceptance threshold of significance_level. PC identifies a partially directed acyclic graph (PDAG), given that the tested independencies admit a faithful Bayesian network representation. This method returns a BayesianModel that is a completion of this PDAG.

significance_level: float, default: 0.01

The significance level to use for conditional independence tests in the data set.

significance_level is the desired Type 1 error probability of falsely rejecting the null hypothesis that variables are independent, given that they are. The lower significance_level, the less likely we are to accept dependencies, resulting in a sparser graph.
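To illustrate the decision rule, here is a sketch using scipy's chi2_contingency as a stand-in for the test (scipy is an assumed dependency of this sketch; pgmpy's internal test may differ):

```python
import numpy as np
from scipy.stats import chi2_contingency

significance_level = 0.01

# A 2x2 contingency table in which the two variables show no association:
observed = np.array([[25, 25],
                     [25, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)

# A dependency is only accepted when the test rejects independence, i.e. when
# p_value < significance_level; lowering significance_level accepts fewer
# dependencies and thus yields a sparser graph.
edge_kept = p_value < significance_level
print(edge_kept)  # False
```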

model: BayesianModel()-instance
An estimate for the BayesianModel for the data set (not yet parametrized).

Neapolitan, Learning Bayesian Networks, Section 10.1.2, Algorithm 10.2 (page 550) http://www.cs.technion.ac.il/~dang/books/Learning%20Bayesian%20Networks(Neapolitan,%20Richard).pdf

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import ConstraintBasedEstimator
>>> data = pd.DataFrame(np.random.randint(0, 5, size=(2500, 3)), columns=list('XYZ'))
>>> data['sum'] = data.sum(axis=1)
>>> print(data)
      X  Y  Z  sum
0     3  0  1    4
1     1  4  3    8
2     0  0  3    3
3     0  2  3    5
4     2  1  1    4
...  .. .. ..  ...
2495  2  3  0    5
2496  1  1  2    4
2497  0  4  2    6
2498  0  0  0    0
2499  2  4  0    6

[2500 rows x 4 columns]
>>> c = ConstraintBasedEstimator(data)
>>> model = c.estimate()
>>> print(model.edges())
[('Z', 'sum'), ('X', 'sum'), ('Y', 'sum')]

static estimate_from_independencies(nodes, independencies)[source]

Estimates a BayesianModel from an Independencies()-object or a decision function for conditional independencies. This requires that the set of independencies admits a faithful representation (e.g. is a set of d-separations of some BN or is closed under the semi-graphoid axioms). See build_skeleton, skeleton_to_pdag, pdag_to_dag for details.

nodes: list, array-like
A list of node/variable names of the network skeleton.
independencies: Independencies-instance or function.
The source of independency information from which to build the skeleton. The provided Independencies should admit a faithful representation. Can either be provided as an Independencies()-instance or by passing a function f(X, Y, Zs) that returns True when X _|_ Y | Zs, otherwise False. (X, Y being individual nodes and Zs a list of nodes).

model: BayesianModel instance

>>> from pgmpy.estimators import ConstraintBasedEstimator
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.independencies import Independencies
>>> ind = Independencies(['B', 'C'], ['A', ['B', 'C'], 'D'])
>>> ind = ind.closure()
>>> skel = ConstraintBasedEstimator.estimate_from_independencies("ABCD", ind)
>>> print(skel.edges())
[('B', 'D'), ('A', 'D'), ('C', 'D')]
>>> model = BayesianModel([('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'E')])
>>> skel = ConstraintBasedEstimator.estimate_from_independencies(model.nodes(), model.get_independencies())
>>> print(skel.edges())
[('B', 'C'), ('A', 'C'), ('C', 'E'), ('D', 'B')]
>>> # note that ('D', 'B') is flipped compared to the original network;
>>> # Both networks belong to the same PDAG/are I-equivalent
estimate_skeleton(significance_level=0.01)[source]

Estimates a graph skeleton (UndirectedGraph) for the data set. Uses the build_skeleton method (PC algorithm); independencies are determined using a chi-squared statistic with the acceptance threshold of significance_level. Returns a tuple (skeleton, separating_sets).

significance_level: float, default: 0.01

The significance level to use for conditional independence tests in the data set.

significance_level is the desired Type 1 error probability of falsely rejecting the null hypothesis that variables are independent, given that they are. The lower significance_level, the less likely we are to accept dependencies, resulting in a sparser graph.

skeleton: UndirectedGraph
An estimate for the undirected graph skeleton of the BN underlying the data.
separating_sets: dict
A dict containing, for each pair of not directly connected nodes, a separating set of variables that makes them conditionally independent. (Needed for edge orientation procedures.)
[1] Neapolitan, Learning Bayesian Networks, Section 10.1.2, Algorithm 10.2 (page 550)
http://www.cs.technion.ac.il/~dang/books/Learning%20Bayesian%20Networks(Neapolitan,%20Richard).pdf

[2] Chi-square test https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Test_of_independence

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import ConstraintBasedEstimator
>>>
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(5000, 5)), columns=list('ABCDE'))
>>> data['F'] = data['A'] + data['B'] + data['C']
>>> est = ConstraintBasedEstimator(data)
>>> skel, sep_sets = est.estimate_skeleton()
>>> skel.edges()
[('A', 'F'), ('B', 'F'), ('C', 'F')]
>>> # all independencies are unconditional:
>>> sep_sets
{('D', 'A'): (), ('C', 'A'): (), ('C', 'E'): (), ('E', 'F'): (), ('B', 'D'): (),
 ('B', 'E'): (), ('D', 'F'): (), ('D', 'E'): (), ('A', 'E'): (), ('B', 'A'): (),
 ('B', 'C'): (), ('C', 'D'): ()}
>>>
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(5000, 3)), columns=list('XYZ'))
>>> data['X'] += data['Z']
>>> data['Y'] += data['Z']
>>> est = ConstraintBasedEstimator(data)
>>> skel, sep_sets = est.estimate_skeleton()
>>> skel.edges()
[('X', 'Z'), ('Y', 'Z')]
>>> # X, Y dependent, but conditionally independent given Z:
>>> sep_sets
{('X', 'Y'): ('Z',)}
static model_to_pdag(model)[source]

Construct the DAG pattern (representing the I-equivalence class) for a given BayesianModel. This is the “inverse” to pdag_to_dag.
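Since no usage example is given for this method, here is a self-contained sketch of the core idea (names and data format are illustrative): edges participating in a v-structure X -> Z <- Y with X, Y non-adjacent are compelled and stay directed, while the remaining edges are recorded both ways as undirected. The full construction additionally propagates orientations with Meek's rules, which this sketch omits:

```python
from itertools import combinations

def dag_to_pdag_sketch(edges):
    """Mark v-structure edges as compelled; leave the rest undirected (both-way)."""
    parents, adjacent = {}, set()
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
        adjacent.update({(u, v), (v, u)})
    compelled = set()
    for z, ps in parents.items():
        for x, y in combinations(sorted(ps), 2):
            if (x, y) not in adjacent:               # v-structure x -> z <- y
                compelled.update({(x, z), (y, z)})
    pdag = set(compelled)
    for u, v in edges:
        if (u, v) not in compelled:
            pdag.update({(u, v), (v, u)})            # undirected edge: both directions
    return pdag
```

For the DAG A -> C <- B, B -> D, only the v-structure edges into C stay directed; B - D is reversible and appears in both directions.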

static pdag_to_dag(pdag)[source]

Completes a PDAG to a DAG, without adding v-structures, if such a completion exists. If no faithful extension is possible, some fully oriented DAG that corresponds to the PDAG is returned and a warning is generated. This is a static method.

pdag: DirectedGraph
A directed acyclic graph pattern, consisting of (acyclic) directed edges as well as “undirected” edges, represented as both-way edges between nodes.
dag: BayesianModel
A faithful orientation of pdag, if one exists. Otherwise any fully oriented DAG/BayesianModel with the structure of pdag.
[1] Chickering, Learning Equivalence Classes of Bayesian-Network Structures,
2002; See page 454 (last paragraph) for the algorithm pdag_to_dag http://www.jmlr.org/papers/volume2/chickering02a/chickering02a.pdf
[2] Dor & Tarsi, A simple algorithm to construct a consistent extension
of a partially oriented graph, 1992, http://ftp.cs.ucla.edu/pub/stat_ser/r185-dor-tarsi.pdf
>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.base import DirectedGraph
>>> from pgmpy.estimators import ConstraintBasedEstimator
>>> data = pd.DataFrame(np.random.randint(0, 4, size=(5000, 3)), columns=list('ABD'))
>>> data['C'] = data['A'] - data['B']
>>> data['D'] += data['A']
>>> c = ConstraintBasedEstimator(data)
>>> pdag = c.skeleton_to_pdag(*c.estimate_skeleton())
>>> pdag.edges()
[('B', 'C'), ('D', 'A'), ('A', 'D'), ('A', 'C')]
>>> c.pdag_to_dag(pdag).edges()
[('B', 'C'), ('A', 'D'), ('A', 'C')]
>>> # pdag_to_dag is static:
... pdag1 = DirectedGraph([('A', 'B'), ('C', 'B'), ('C', 'D'), ('D', 'C'), ('D', 'A'), ('A', 'D')])
>>> ConstraintBasedEstimator.pdag_to_dag(pdag1).edges()
[('D', 'C'), ('C', 'B'), ('A', 'B'), ('A', 'D')]
>>> # example of a pdag with no faithful extension:
... pdag2 = DirectedGraph([('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'B')])
>>> ConstraintBasedEstimator.pdag_to_dag(pdag2).edges()
UserWarning: PDAG has no faithful extension (= no oriented DAG with the same v-structures as PDAG).
Remaining undirected PDAG edges oriented arbitrarily.
[('B', 'C'), ('A', 'B'), ('A', 'C')]
static skeleton_to_pdag(skel, separating_sets)[source]

Orients the edges of a graph skeleton based on information from separating_sets to form a DAG pattern (DirectedGraph).

skel: UndirectedGraph
An undirected graph skeleton as e.g. produced by the estimate_skeleton method.
separating_sets: dict
A dict containing, for each pair of not directly connected nodes, a separating set (“witnessing set”) of variables that makes them conditionally independent. (Needed for edge orientation.)
pdag: DirectedGraph
An estimate for the DAG pattern of the BN underlying the data. The graph might contain some nodes with both-way edges (X->Y and Y->X). Any completion (by removing one of the both-way edges for each such pair) results in an I-equivalent Bayesian network DAG.

Neapolitan, Learning Bayesian Networks, Section 10.1.2, Algorithm 10.2 (page 550) http://www.cs.technion.ac.il/~dang/books/Learning%20Bayesian%20Networks(Neapolitan,%20Richard).pdf

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import ConstraintBasedEstimator
>>> data = pd.DataFrame(np.random.randint(0, 4, size=(5000, 3)), columns=list('ABD'))
>>> data['C'] = data['A'] - data['B']
>>> data['D'] += data['A']
>>> c = ConstraintBasedEstimator(data)
>>> pdag = c.skeleton_to_pdag(*c.estimate_skeleton())
>>> pdag.edges() # edges: A->C, B->C, A--D (not directed)
[('B', 'C'), ('A', 'C'), ('A', 'D'), ('D', 'A')]

K2 Score

class pgmpy.estimators.K2Score.K2Score(data, **kwargs)[source]
local_score(variable, parents)[source]

Computes a score that measures how much a given variable is “influenced” by a given list of potential parents.
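A hedged sketch of the standard K2 local score (the Dirichlet marginal likelihood with every pseudo count set to 1; an illustration of the definition, not pgmpy's implementation), for data given as a list of dicts:

```python
from math import lgamma
from collections import Counter

def k2_local_score(rows, variable, parents):
    """Illustrative K2 local score: Dirichlet prior with all pseudo counts = 1."""
    var_states = sorted({row[variable] for row in rows})
    r = len(var_states)
    joint = Counter((tuple(row[p] for p in parents), row[variable]) for row in rows)
    marg = Counter(tuple(row[p] for p in parents) for row in rows)
    score = 0.0
    for config, n_j in marg.items():
        score += lgamma(r) - lgamma(r + n_j)         # log (r-1)! / (N_j + r - 1)!
        for s in var_states:
            score += lgamma(1 + joint[(config, s)])  # log N_jk!
    return score
```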

Maximum Likelihood Estimator

class pgmpy.estimators.MLE.MaximumLikelihoodEstimator(model, data, **kwargs)[source]
estimate_cpd(node)[source]

Method to estimate the CPD for a given variable.

node: int, string (any hashable python object)
The name of the variable for which the CPD is to be estimated.

CPD: TabularCPD

>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> cpd_A = MaximumLikelihoodEstimator(model, data).estimate_cpd('A')
>>> print(cpd_A)
╒══════╤══════════╕
│ A(0) │ 0.666667 │
├──────┼──────────┤
│ A(1) │ 0.333333 │
╘══════╧══════════╛
>>> cpd_C = MaximumLikelihoodEstimator(model, data).estimate_cpd('C')
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤══════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1) │
├──────┼──────┼──────┼──────┼──────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1) │
├──────┼──────┼──────┼──────┼──────┤
│ C(0) │ 0.0  │ 0.0  │ 1.0  │ 0.5  │
├──────┼──────┼──────┼──────┼──────┤
│ C(1) │ 1.0  │ 1.0  │ 0.0  │ 0.5  │
╘══════╧══════╧══════╧══════╧══════╛
get_parameters()[source]

Method to estimate the model parameters (CPDs) using Maximum Likelihood Estimation.

parameters: list
List of TabularCPDs, one for each variable of the model
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model = BayesianModel([('A', 'B'), ('C', 'B'), ('C', 'D')])
>>> estimator = MaximumLikelihoodEstimator(model, values)
>>> estimator.get_parameters()
[<TabularCPD representing P(C:2) at 0x7f7b534251d0>,
<TabularCPD representing P(B:2 | C:2, A:2) at 0x7f7b4dfd4da0>,
<TabularCPD representing P(A:2) at 0x7f7b4dfd4fd0>,
<TabularCPD representing P(D:2 | C:2) at 0x7f7b4df822b0>]

Structure Score

class pgmpy.estimators.StructureScore.StructureScore(data, **kwargs)[source]
score(model)[source]

Computes a score to measure how well the given BayesianModel fits to the data set. (This method relies on the local_score-method that is implemented in each subclass.)

model: BayesianModel instance
The Bayesian network that is to be scored. Nodes of the BayesianModel need to coincide with column names of data set.
score: float
A number indicating the degree of fit between data and model
>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import K2Score
>>> from pgmpy.models import BayesianModel
>>> # create random data sample with 3 variables, where B and C are identical:
>>> data = pd.DataFrame(np.random.randint(0, 5, size=(5000, 2)), columns=list('AB'))
>>> data['C'] = data['B']
>>> K2Score(data).score(BayesianModel([['A','B'], ['A','C']]))
-24242.367348745247
>>> K2Score(data).score(BayesianModel([['A','B'], ['B','C']]))
-16273.793897051042
structure_prior(model)[source]

A (log) prior distribution over models. Currently unused (= uniform).