Estimators for Parameter and Structure Learning

Base Estimator Classes

class pgmpy.estimators.base.BaseEstimator(data=None, state_names=None, complete_samples_only=True)[source]
class pgmpy.estimators.base.ParameterEstimator(model, data, **kwargs)[source]
state_counts(variable, **kwargs)[source]

Return counts of how often each state of ‘variable’ occurred in the data. If the variable has parents, counting is done conditionally for each state configuration of the parents.

Parameters
  • variable (string) – Name of the variable for which the state count is to be done.

  • complete_samples_only (bool) – Specifies how to deal with missing data, if present. If set to True, all rows that contain np.NaN anywhere are ignored. If set to False, every row in which neither the variable nor its parents is np.NaN is used. The desired default behavior can be passed to the class constructor.

Returns

state_counts – Table with state counts for ‘variable’

Return type

pandas.DataFrame

Examples

>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import ParameterEstimator
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> data = pd.DataFrame(data={'A': ['a1', 'a1', 'a2'],
...                           'B': ['b1', 'b2', 'b1'],
...                           'C': ['c1', 'c1', 'c2']})
>>> estimator = ParameterEstimator(model, data)
>>> estimator.state_counts('A')
    A
a1  2
a2  1
>>> estimator.state_counts('C')
A  a1      a2
B  b1  b2  b1  b2
C
c1  1   1   0   0
c2  0   0   1   0
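
A minimal sketch of the complete_samples_only behavior described above, reusing the model from the example (data_nan is illustrative; the exact counts depend on which rows are dropped):

>>> import numpy as np
>>> data_nan = pd.DataFrame(data={'A': ['a1', 'a1', np.NaN],
...                               'B': ['b1', np.NaN, 'b1'],
...                               'C': ['c1', 'c1', 'c2']})
>>> estimator = ParameterEstimator(model, data_nan)
>>> # complete_samples_only=True drops any row containing a NaN anywhere,
>>> # so only the first row is counted:
>>> estimator.state_counts('A', complete_samples_only=True)
>>> # complete_samples_only=False keeps every row where neither 'A' nor its
>>> # parents are NaN, so the first two rows are counted:
>>> estimator.state_counts('A', complete_samples_only=False)
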
class pgmpy.estimators.base.StructureEstimator(data=None, independencies=None, **kwargs)[source]

Bayesian Estimator

class pgmpy.estimators.BayesianEstimator(model, data, **kwargs)[source]
estimate_cpd(node, prior_type='BDeu', pseudo_counts=[], equivalent_sample_size=5)[source]

Method to estimate the CPD for a given variable.

Parameters
  • node (int, string (any hashable python object)) – The name of the variable for which the CPD is to be estimated.

  • prior_type ('dirichlet', 'BDeu', or 'K2') –

    String indicating which type of prior to use for the model parameters.

    • If ‘prior_type’ is ‘dirichlet’, ‘pseudo_counts’ must be provided: the Dirichlet hyperparameters, given as a single number or a 2-D array of shape (node_card, product of parents_card), with a “virtual” count for each variable state in the CPD. The virtual counts are added to the actual state counts found in the data. (If a list is provided, a lexicographic ordering of states is assumed.)

    • If ‘prior_type’ is ‘BDeu’, an ‘equivalent_sample_size’ must be specified instead of ‘pseudo_counts’. This is equivalent to ‘prior_type=dirichlet’ with uniform ‘pseudo_counts’ of equivalent_sample_size/(node_cardinality*np.prod(parents_cardinalities)).

    • A prior_type of ‘K2’ is shorthand for ‘dirichlet’ with every pseudo_count set to 1, regardless of the cardinality of the variable.

Returns

CPD

Return type

TabularCPD

Examples

>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import BayesianEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> estimator = BayesianEstimator(model, data)
>>> cpd_C = estimator.estimate_cpd('C', prior_type="dirichlet",
...                                pseudo_counts=[[1, 1, 1, 1],
...                                               [2, 2, 2, 2]])
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤════════════════════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(0) │ 0.25 │ 0.25 │ 0.5  │ 0.3333333333333333 │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(1) │ 0.75 │ 0.75 │ 0.5  │ 0.6666666666666666 │
╘══════╧══════╧══════╧══════╧════════════════════╛
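
The BDeu prior from the parameter description above can be illustrated on the same estimator. A minimal sketch: with equivalent_sample_size=8, node cardinality 2, and 4 parent configurations, each CPD cell receives a uniform pseudo-count of 8 / (2 * 4) = 1, so the call is equivalent to a ‘dirichlet’ prior with all pseudo-counts set to 1:

>>> cpd_C_bdeu = estimator.estimate_cpd('C', prior_type="BDeu", equivalent_sample_size=8)
>>> cpd_C_unif = estimator.estimate_cpd('C', prior_type="dirichlet",
...                                     pseudo_counts=[[1, 1, 1, 1],
...                                                    [1, 1, 1, 1]])
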
get_parameters(prior_type='BDeu', equivalent_sample_size=5, pseudo_counts=None)[source]

Method to estimate the model parameters (CPDs).

Parameters

prior_type ('dirichlet', 'BDeu', or 'K2') –

String indicating which type of prior to use for the model parameters.

  • If ‘prior_type’ is ‘dirichlet’, ‘pseudo_counts’ must be provided: the Dirichlet hyperparameters, given as a single number or a dict containing, for each variable, a 2-D array of shape (node_card, product of parents_card), with a “virtual” count for each variable state in the CPD, which is added to the state counts. (A lexicographic ordering of states is assumed.)

  • If ‘prior_type’ is ‘BDeu’, an ‘equivalent_sample_size’ must be specified instead of ‘pseudo_counts’. This is equivalent to ‘prior_type=dirichlet’ with uniform ‘pseudo_counts’ of equivalent_sample_size/(node_cardinality*np.prod(parents_cardinalities)) for each node. ‘equivalent_sample_size’ can be either a numerical value or a dict that specifies the size for each variable separately.

  • A prior_type of ‘K2’ is shorthand for ‘dirichlet’ with every pseudo_count set to 1, regardless of the cardinality of the variable.

Returns

parameters – List of TabularCPDs, one for each variable of the model

Return type

list

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import BayesianEstimator
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model = BayesianModel([('A', 'B'), ('C', 'B'), ('C', 'D')])
>>> estimator = BayesianEstimator(model, values)
>>> estimator.get_parameters(prior_type='BDeu', equivalent_sample_size=5)
[<TabularCPD representing P(C:2) at 0x7f7b534251d0>,
<TabularCPD representing P(B:2 | C:2, A:2) at 0x7f7b4dfd4da0>,
<TabularCPD representing P(A:2) at 0x7f7b4dfd4fd0>,
<TabularCPD representing P(D:2 | C:2) at 0x7f7b4df822b0>]
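
‘pseudo_counts’ can also be given per variable as a dict; a minimal sketch for the model above, with array shapes following the (node_card, product of parents_card) convention described earlier:

>>> pseudo_counts = {'A': [[1], [1]],
...                  'B': [[1, 1, 1, 1], [1, 1, 1, 1]],
...                  'C': [[1], [1]],
...                  'D': [[1, 1], [1, 1]]}
>>> estimator.get_parameters(prior_type='dirichlet', pseudo_counts=pseudo_counts)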

BDeu Score

class pgmpy.estimators.BDeuScore(data, equivalent_sample_size=10, **kwargs)[source]
local_score(variable, parents)[source]

Computes a score that measures how much a given variable is “influenced” by a given list of potential parents.
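
A minimal usage sketch (the data here is illustrative; a higher, i.e. less negative, local score indicates a better-supported parent set):

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.estimators import BDeuScore
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(1000, 3)), columns=list('ABC'))
>>> bdeu = BDeuScore(data, equivalent_sample_size=10)
>>> bdeu.local_score('C', ['A', 'B'])  # log-score of C with parent set {A, B}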

BIC Score

class pgmpy.estimators.BicScore(data, **kwargs)[source]
local_score(variable, parents)[source]

Computes a score that measures how much a given variable is “influenced” by a given list of potential parents.

Constraint-Based Estimator

class pgmpy.estimators.PC(data=None, independencies=None, **kwargs)[source]
build_skeleton(ci_test='chi_square', max_cond_vars=5, significance_level=0.01, variant='stable', n_jobs=-1, show_progress=True, **kwargs)[source]

Estimates a graph skeleton (UndirectedGraph) from a set of independencies using (the first part of) the PC algorithm. The independencies can either be provided as an instance of the Independencies-class or by passing a decision function that decides any conditional independency assertion. Returns a tuple (skeleton, separating_sets).

If an Independencies-instance is passed, the contained IndependenceAssertions have to admit a faithful BN representation. This is the case if they are obtained as a set of d-separations of some Bayesian network or if the independence assertions are closed under the semi-graphoid axioms. Otherwise the procedure may fail to identify the correct structure.

Returns

  • skeleton (UndirectedGraph) – An estimate for the undirected graph skeleton of the BN underlying the data.

  • separating_sets (dict) – A dict containing, for each pair of not directly connected nodes, a separating set (“witnessing set”) of variables that makes them conditionally independent. (Needed for edge orientation procedures.)

References

[1] Neapolitan, Learning Bayesian Networks, Section 10.1.2, Algorithm 10.2 (page 550)

http://www.cs.technion.ac.il/~dang/books/Learning%20Bayesian%20Networks(Neapolitan,%20Richard).pdf

[2] Koller & Friedman, Probabilistic Graphical Models - Principles and Techniques, 2009

Section 3.4.2.1 (page 85), Algorithm 3.3

Examples

>>> from pgmpy.estimators import PC
>>> from pgmpy.base import DAG
>>> from pgmpy.independencies import Independencies
>>> # build skeleton from list of independencies:
>>> ind = Independencies(['B', 'C'], ['A', ['B', 'C'], 'D'])
>>> # we need to compute the closure, otherwise this set of independencies doesn't
>>> # admit a faithful representation:
>>> ind = ind.closure()
>>> skel, sep_sets = PC(independencies=ind).build_skeleton(ci_test="independence_match")
>>> print(skel.edges())
[('A', 'D'), ('B', 'D'), ('C', 'D')]
>>> # build skeleton from d-separations of a DAG:
>>> model = DAG([('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'E')])
>>> skel, sep_sets = PC(independencies=model.get_independencies()).build_skeleton(ci_test="independence_match")
>>> print(skel.edges())
[('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'E')]
estimate(variant='stable', ci_test='chi_square', max_cond_vars=5, return_type='dag', significance_level=0.01, n_jobs=-1, show_progress=True, **kwargs)[source]

Estimates a DAG/PDAG from the given dataset using the PC algorithm, which is a constraint-based structure learning algorithm [1]. The independencies in the dataset are identified by running statistical independence tests. This method returns a DAG/PDAG structure that is faithful to the independencies implied by the dataset.

Parameters
  • variant (str (one of "orig", "stable", "parallel")) –

    The variant of the PC algorithm to run.

    “orig”: The original PC algorithm. Might not give the same results in different runs, but does fewer independence tests than “stable”.

    “stable”: Gives the same result in every run, but needs to do more statistical independence tests.

    “parallel”: Parallel version of stable PC. Can run on multiple cores, with the same result on each run.

  • ci_test (str or fun) –

    The statistical test to use for testing conditional independence in the dataset. If a str, it should be one of:

    “independence_match”: If using this option, an additional parameter independencies must be specified.

    “chi_square”: Uses the chi-square independence test. This works only for discrete datasets.

    “pearsonr”: Uses the partial correlation based on the Pearson correlation coefficient to test independence. This works only for continuous datasets.

  • max_cond_vars (int) – The maximum number of conditioning variables to use in the statistical independence tests.

  • return_type (str (one of "dag", "cpdag", "pdag", "skeleton")) –

    The type of structure to return.

    If return_type="pdag" or return_type="cpdag": a partially directed structure is returned.

    If return_type="dag": a fully directed structure is returned, if it is possible to orient all the edges.

    If return_type="skeleton": an undirected graph is returned, along with the separating sets.

  • significance_level (float (default: 0.01)) –

    The statistical tests compare this value with the p-value of the test to decide whether the tested variables are independent or not. Different tests can treat this parameter differently:

    1. chi_square: If p-value > significance_level, the independence condition is assumed to be satisfied in the data.

    2. pearsonr: If p-value > significance_level, the independence condition is assumed to be satisfied in the data.

Returns

model – The estimated model structure; can be a partially directed graph (PDAG), a fully directed graph (DAG), or a tuple of (undirected graph, separating sets), depending on the value of the return_type argument.

Return type

DAG-instance, PDAG-instance, or (networkx.UndirectedGraph, dict)

References

[1] Original PC: P. Spirtes, C. Glymour, and R. Scheines, Causation, Prediction, and Search, 2nd ed. Cambridge, MA: MIT Press, 2000.

[2] Stable PC: D. Colombo and M. H. Maathuis, “A modification of the PC algorithm yielding order-independent skeletons,” ArXiv e-prints, Nov. 2012.

[3] Parallel PC: Le, Thuc, et al. “A fast PC algorithm for high dimensional causal discovery with multi-core PCs.” IEEE/ACM Transactions on Computational Biology and Bioinformatics (2016).

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import PC
>>> data = pd.DataFrame(np.random.randint(0, 5, size=(2500, 3)), columns=list('XYZ'))
>>> data['sum'] = data.sum(axis=1)
>>> print(data)
      X  Y  Z  sum
0     3  0  1    4
1     1  4  3    8
2     0  0  3    3
3     0  2  3    5
4     2  1  1    4
...  .. .. ..  ...
2495  2  3  0    5
2496  1  1  2    4
2497  0  4  2    6
2498  0  0  0    0
2499  2  4  0    6
[2500 rows x 4 columns]
>>> c = PC(data)
>>> model = c.estimate()
>>> print(model.edges())
[('Z', 'sum'), ('X', 'sum'), ('Y', 'sum')]
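
The other return_type options follow the same call pattern; a minimal sketch on the estimator above:

>>> pdag = c.estimate(return_type="pdag")  # partially directed structure
>>> skel, sep_sets = c.estimate(return_type="skeleton")  # undirected skeleton + separating sets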
static skeleton_to_pdag(skeleton, separating_sets)[source]

Orients the edges of a graph skeleton based on information from separating_sets to form a DAG pattern (PDAG).

Parameters
  • skeleton (UndirectedGraph) – An undirected graph skeleton as e.g. produced by the estimate_skeleton method.

  • separating_sets (dict) – A dict containing, for each pair of not directly connected nodes, a separating set (“witnessing set”) of variables that makes them conditionally independent. (Needed for edge orientation.)

Returns

pdag – An estimate for the DAG pattern of the BN underlying the data. The graph may contain pairs of nodes with edges in both directions (X->Y and Y->X). Any completion (obtained by removing one of the two directed edges for each such pair) results in an I-equivalent Bayesian network DAG.

Return type

DAG

References

Neapolitan, Learning Bayesian Networks, Section 10.1.2, Algorithm 10.2 (page 550) http://www.cs.technion.ac.il/~dang/books/Learning%20Bayesian%20Networks(Neapolitan,%20Richard).pdf

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import PC
>>> data = pd.DataFrame(np.random.randint(0, 4, size=(5000, 3)), columns=list('ABD'))
>>> data['C'] = data['A'] - data['B']
>>> data['D'] += data['A']
>>> c = PC(data)
>>> pdag = c.skeleton_to_pdag(*c.build_skeleton())
>>> pdag.edges() # edges: A->C, B->C, A--D (not directed)
[('B', 'C'), ('A', 'C'), ('A', 'D'), ('D', 'A')]

K2 Score

class pgmpy.estimators.K2Score(data, **kwargs)[source]
local_score(variable, parents)[source]

Computes a score that measures how much a given variable is “influenced” by a given list of potential parents.
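
All three score classes (BDeuScore, BicScore, K2Score) share this local_score interface, so candidate parent sets can be compared directly; a minimal sketch with illustrative data:

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.estimators import BDeuScore, BicScore, K2Score
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(1000, 3)), columns=list('ABC'))
>>> data['C'] = data['A'] ^ data['B']  # make C a deterministic function of A and B
>>> for score in [BDeuScore(data), BicScore(data), K2Score(data)]:
...     print(score.local_score('C', ['A', 'B']) > score.local_score('C', ['A']))
True
True
True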

Maximum Likelihood Estimator

class pgmpy.estimators.MLE.MaximumLikelihoodEstimator(model, data, **kwargs)[source]
estimate_cpd(node)[source]

Method to estimate the CPD for a given variable.

Parameters

node (int, string (any hashable python object)) – The name of the variable for which the CPD is to be estimated.

Returns

CPD

Return type

TabularCPD

Examples

>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> cpd_A = MaximumLikelihoodEstimator(model, data).estimate_cpd('A')
>>> print(cpd_A)
╒══════╤══════════╕
│ A(0) │ 0.666667 │
├──────┼──────────┤
│ A(1) │ 0.333333 │
╘══════╧══════════╛
>>> cpd_C = MaximumLikelihoodEstimator(model, data).estimate_cpd('C')
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤══════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1) │
├──────┼──────┼──────┼──────┼──────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1) │
├──────┼──────┼──────┼──────┼──────┤
│ C(0) │ 0.0  │ 0.0  │ 1.0  │ 0.5  │
├──────┼──────┼──────┼──────┼──────┤
│ C(1) │ 1.0  │ 1.0  │ 0.0  │ 0.5  │
╘══════╧══════╧══════╧══════╧══════╛
get_parameters()[source]

Method to estimate the model parameters (CPDs) using Maximum Likelihood Estimation.

Returns

parameters – List of TabularCPDs, one for each variable of the model

Return type

list

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model = BayesianModel([('A', 'B'), ('C', 'B'), ('C', 'D')])
>>> estimator = MaximumLikelihoodEstimator(model, values)
>>> estimator.get_parameters()
[<TabularCPD representing P(C:2) at 0x7f7b534251d0>,
<TabularCPD representing P(B:2 | C:2, A:2) at 0x7f7b4dfd4da0>,
<TabularCPD representing P(A:2) at 0x7f7b4dfd4fd0>,
<TabularCPD representing P(D:2 | C:2) at 0x7f7b4df822b0>]

Structure Score

class pgmpy.estimators.StructureScore(data, **kwargs)[source]
score(model)[source]

Computes a score to measure how well the given BayesianModel fits the data set. (This method relies on the local_score method implemented in each subclass.)

Parameters

model (BayesianModel instance) – The Bayesian network that is to be scored. The nodes of the BayesianModel need to coincide with the column names of the data set.

Returns

score – A number indicating the degree of fit between data and model

Return type

float

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import K2Score
>>> # create random data sample with 3 variables, where B and C are identical:
>>> data = pd.DataFrame(np.random.randint(0, 5, size=(5000, 2)), columns=list('AB'))
>>> data['C'] = data['B']
>>> K2Score(data).score(BayesianModel([['A','B'], ['A','C']]))
-24242.367348745247
>>> K2Score(data).score(BayesianModel([['A','B'], ['B','C']]))
-16273.793897051042
structure_prior(model)[source]

A (log) prior distribution over models. Currently unused (= uniform).

Structural Equation Model Estimators

class pgmpy.estimators.IVEstimator(model)[source]

Implements an Instrumental Variable (IV) based estimator.

fit(X, Y, data, ivs=None, civs=None)[source]

Estimates the parameter X -> Y.

Parameters
  • X (str) – The covariate variable of the parameter being estimated.

  • Y (str) – The outcome (predicted) variable of the parameter being estimated.

  • data (pd.DataFrame) – The data from which to learn the parameter.

  • ivs (List (default: None)) – List of variable names which should be used as Instrumental Variables (IVs). If not specified, tries to find the IVs from the model structure; fails if it can’t find either an IV or a Conditional IV.

  • civs (List of tuples (tuple form: (var, conditional_var))) – List of conditional IVs to use for estimation. If not specified, tries to find the IVs from the model structure; fails if it can’t find either an IV or Conditional IVs.

Examples

>>> from pgmpy.estimators import IVEstimator # TODO: Finish example.
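
Since the example above is unfinished upstream, here is a hedged sketch of the intended call pattern. The SEM.from_graph construction, the instrument layout (Z instrumenting X -> Y), and the generated data are illustrative assumptions, not verified output:

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import SEM
>>> # hypothetical model: Z is an instrument for the effect of X on Y,
>>> # whose error terms are assumed to be correlated (hence the need for an IV):
>>> model = SEM.from_graph(ebunch=[('Z', 'X'), ('X', 'Y')], latents=[],
...                        err_corr=[('X', 'Y')])
>>> Z = np.random.randn(1000)
>>> X = Z + np.random.randn(1000)
>>> Y = 2 * X + np.random.randn(1000)
>>> data = pd.DataFrame({'X': X, 'Y': Y, 'Z': Z})
>>> estimator = IVEstimator(model)
>>> estimator.fit(X='X', Y='Y', data=data)  # estimates the X -> Y parameter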
class pgmpy.estimators.SEMEstimator(model)[source]

Base class of SEM estimators. All the estimators inherit this class.

fit(data, method, opt='adam', init_values='random', exit_delta=0.0001, max_iter=1000, **kwargs)[source]

Estimate the parameters of the model from the data.

Parameters
  • data (pandas DataFrame or pgmpy.data.Data instance) – The data from which to estimate the parameters of the model.

  • method (str ("ml" | "uls" | "gls" | "2sls")) – The fitting function to use. “ml”: Maximum Likelihood; “uls”: Unweighted Least Squares; “gls”: Generalized Least Squares; “2sls”: 2-SLS estimator.

  • init_values (str or dict) – If str, one of: "random" | "std" | "iv". If dict, a dictionary with keys B and zeta.

  • **kwargs (dict) –

    Extra parameters required by some estimators.

    GLS:

    W: np.array (n x n), where n is the number of observed variables.

    2sls:

    x: y:

Returns

Instance of the model with the estimated parameters

Return type

pgmpy.models.SEM instance

References

[1] Bollen, K. A. (2010). Structural equations with latent variables. New York: Wiley.
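
A hedged usage sketch of fit (the SEM.from_graph construction and column names are assumptions; the optimizer is stochastic, so exact results vary):

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import SEM
>>> from pgmpy.estimators import SEMEstimator
>>> # assumed toy model with a single observed path X -> Y:
>>> model = SEM.from_graph(ebunch=[('X', 'Y')], latents=[], err_corr=[])
>>> X = np.random.randn(1000)
>>> data = pd.DataFrame({'X': X, 'Y': 0.5 * X + np.random.randn(1000)})
>>> estimator = SEMEstimator(model)
>>> fitted = estimator.fit(data, method="ml", opt="adam", max_iter=1000)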

get_init_values(data, method)[source]

Computes the starting values for the optimizer.

References

[1] Table 4C.1: Bollen, K. (2014). Structural Equations with Latent Variables. New York, NY: John Wiley & Sons.

gls_loss(params, loss_args)[source]

Method to compute the Generalized (Weighted) Least Squares fitting function. The optimizer calls this method after each iteration with updated params to compute the new loss.

The fitting function for GLS is: .. math:: F_{GLS} = \text{tr}\left\{\left[(S - \Sigma(\theta)) W^{-1}\right]^2\right\}

Parameters
  • params (dict) – params contains all the variables that are updated in each iteration of the optimization.

  • loss_args (dict) – loss_args contains all the variables that are not updated in each iteration but are required to compute the loss.

Returns

The loss value for the given params and loss_args

Return type

torch.tensor

ml_loss(params, loss_args)[source]

Method to compute the Maximum Likelihood loss function. The optimizer calls this method after each iteration with updated params to compute the new loss.

The fitting function for ML is: .. math:: F_{ML} = \log |\Sigma(\theta)| + \text{tr}(S \Sigma^{-1}(\theta)) - \log |S| - (p+q)

Parameters
  • params (dict) – params contains all the variables that are updated in each iteration of the optimization.

  • loss_args (dict) – loss_args contains all the variables that are not updated in each iteration but are required to compute the loss.

Returns

The loss value for the given params and loss_args

Return type

torch.tensor

uls_loss(params, loss_args)[source]

Method to compute the Unweighted Least Squares fitting function. The optimizer calls this method after each iteration with updated params to compute the new loss.

The fitting function for ULS is: .. math:: F_{ULS} = \text{tr}\left[(S - \Sigma(\theta))^2\right]

Parameters
  • params (dict) – params contains all the variables that are updated in each iteration of the optimization.

  • loss_args (dict) – loss_args contains all the variables that are not updated in each iteration but are required to compute the loss.

Returns

The loss value for the given params and loss_args

Return type

torch.tensor

MMHC Estimator

class pgmpy.estimators.MmhcEstimator(data, **kwargs)[source]
estimate(scoring_method=None, tabu_length=10, significance_level=0.01)[source]

Estimates a BayesianModel for the data set, using MMHC. First estimates a graph skeleton using MMPC and then orients the edges using score-based local search (hill climbing).

Parameters
  • significance_level (float, default: 0.01) – The significance level to use for conditional independence tests in the data set. See mmpc-method.

  • scoring_method (instance of a Scoring method (default: BDeuScore)) – The method to use for scoring during Hill Climb Search. Can be an instance of any of the scoring methods implemented in pgmpy.

  • tabu_length (int) – If provided, the last tabu_length graph modifications cannot be reversed during the search procedure. This serves to enforce a wider exploration of the search space. Default value: 10.

Returns

model – BayesianModel instance, not yet parametrized.

References

Tsamardinos et al., The max-min hill-climbing Bayesian network structure learning algorithm (2005), Algorithm 3
http://www.dsl-lab.org/supplements/mmhc_paper/paper_online.pdf

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import MmhcEstimator
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(2500, 4)), columns=list('XYZW'))
>>> data['sum'] = data.sum(axis=1)
>>> est = MmhcEstimator(data)
>>> model = est.estimate()
>>> print(model.edges())
[('Z', 'sum'), ('X', 'sum'), ('W', 'sum'), ('Y', 'sum')]
mmpc(significance_level=0.01)[source]

Estimates a graph skeleton (UndirectedGraph) for the data set, using the MMPC (max-min parents-and-children) algorithm.

Parameters

significance_level (float, default=0.01) –

The significance level to use for conditional independence tests in the data set.

significance_level is the desired Type 1 error probability of falsely rejecting the null hypothesis that variables are independent, given that they are. The lower significance_level, the less likely we are to accept dependencies, resulting in a sparser graph.

Returns

  • skeleton (UndirectedGraph) – An estimate for the undirected graph skeleton of the BN underlying the data.

  • separating_sets (dict) – A dict containing, for each pair of not directly connected nodes, a separating set (“witnessing set”) of variables that makes them conditionally independent. (Needed for edge orientation.)

References

Tsamardinos et al., The max-min hill-climbing Bayesian network structure learning algorithm (2005), Algorithm 1 & 2 http://www.dsl-lab.org/supplements/mmhc_paper/paper_online.pdf

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import PC
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(5000, 5)), columns=list('ABCDE'))
>>> data['F'] = data['A'] + data['B'] + data['C']
>>> est = PC(data)
>>> skel, sep_sets = est.build_skeleton()
>>> skel.edges()
[('A', 'F'), ('B', 'F'), ('C', 'F')]
>>> # all independencies are unconditional:
>>> sep_sets
{('D', 'A'): (), ('C', 'A'): (), ('C', 'E'): (), ('E', 'F'): (), ('B', 'D'): (),
 ('B', 'E'): (), ('D', 'F'): (), ('D', 'E'): (), ('A', 'E'): (), ('B', 'A'): (),
 ('B', 'C'): (), ('C', 'D'): ()}
>>> data = pd.DataFrame(np.random.randint(0, 2, size=(5000, 3)), columns=list('XYZ'))
>>> data['X'] += data['Z']
>>> data['Y'] += data['Z']
>>> est = PC(data)
>>> skel, sep_sets = est.build_skeleton()
>>> skel.edges()
[('X', 'Z'), ('Y', 'Z')]
>>> # X, Y dependent, but conditionally independent given Z:
>>> sep_sets
{('X', 'Y'): ('Z',)}