BDs

class pgmpy.structure_score.BDs(data, equivalent_sample_size=10, state_names=None)

Bases: BDeu

BDs structure score for discrete Bayesian networks.

BDs is a sparse-data variant of BDeu that allocates the equivalent sample size over the observed parent configurations only, rather than over all possible configurations. This makes it better suited to discrete datasets in which many parent configurations are never observed. The local score is computed as:

\[\operatorname{BDs}(X_i, \Pi_i) = \left[ \sum_{j \in \mathcal{O}_i} \sum_{k=1}^{r_i} \log \Gamma(N_{ijk} + \beta) + (q_i - \tilde{q}_i) r_i \log \Gamma(\beta) \right] - \left[ \sum_{j \in \mathcal{O}_i} \log \Gamma(N_{ij} + \alpha) + (q_i - \tilde{q}_i) \log \Gamma(\alpha) \right] + \tilde{q}_i \log \Gamma(\alpha) - q_i r_i \log \Gamma(\beta),\]

where \(\mathcal{O}_i\) is the set of observed parent configurations, \(\tilde{q}_i = |\mathcal{O}_i|\), \(q_i\) is the total number of parent configurations, \(r_i\) is the cardinality of \(X_i\), \(\alpha = \text{equivalent_sample_size} / \tilde{q}_i\), \(\beta = \text{equivalent_sample_size} / (r_i q_i)\), and \(N_{ij} = \sum_{k=1}^{r_i} N_{ijk}\).
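As a reading aid, the formula can be evaluated term by term in plain Python. This is a minimal sketch, not pgmpy's implementation: the function name bds_local_score and its row-of-dicts input format are assumptions made for illustration, and math.lgamma stands in for whatever log-gamma routine the library uses. On the four-row dataset from the Examples section it should reproduce the documented local score for B given A (about -3.446).

```python
from collections import Counter
from math import lgamma


def bds_local_score(rows, child, parents, states, ess=10):
    """Evaluate the BDs local score formula term by term (illustrative only)."""
    r = len(states[child])  # r_i: cardinality of the child variable
    q = 1  # q_i: number of possible parent configurations
    for p in parents:
        q *= len(states[p])

    # Counts for the observed parent configurations only.
    n_ij = Counter(tuple(row[p] for p in parents) for row in rows)
    n_ijk = Counter(
        (tuple(row[p] for p in parents), row[child]) for row in rows
    )
    q_obs = len(n_ij)  # q~_i = |O_i|

    alpha = ess / q_obs
    beta = ess / (r * q)

    # First bracket: observed N_ijk terms plus the unobserved-configuration term.
    counts_term = sum(
        lgamma(n_ijk[(j, k)] + beta) for j in n_ij for k in states[child]
    ) + (q - q_obs) * r * lgamma(beta)
    # Second bracket: observed N_ij terms plus the unobserved-configuration term.
    conds_term = sum(lgamma(n + alpha) for n in n_ij.values()) + (
        q - q_obs
    ) * lgamma(alpha)
    return counts_term - conds_term + q_obs * lgamma(alpha) - q * r * lgamma(beta)


rows = [
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": 1},
    {"A": 0, "B": 0, "C": 0},
]
states = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
score_b = bds_local_score(rows, "B", ["A"], states, ess=5)
print(round(score_b, 3))
```

Here both configurations of A are observed, so \(\tilde{q}_i = q_i = 2\) and the two \((q_i - \tilde{q}_i)\) terms vanish; they only contribute when some parent configurations never occur in the data.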

In the implementation, state_counts(…, reindex=False) keeps only the observed parent configurations. The gamma_counts_adj and gamma_conds_adj terms restore the missing contributions from the unobserved ones so the returned score matches the full BDs formula. This class also uses the marginal uniform graph prior from Scutari (2016).
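The role of the two adjustment terms can be checked numerically. The sketch below is an illustration under assumptions: the helper functions and the small count table are hypothetical, and only the names gamma_counts_adj and gamma_conds_adj are taken from the description above. It compares summing the log-gamma terms over a table padded with zero-count columns against summing over the observed columns and adding the adjustments.

```python
from math import isclose, lgamma


def gamma_terms_full(counts, alpha, beta):
    """Sum both log-gamma terms over every column passed in."""
    counts_term = sum(lgamma(n + beta) for col in counts for n in col)
    conds_term = sum(lgamma(sum(col) + alpha) for col in counts)
    return counts_term, conds_term


def gamma_terms_sparse(observed, q, r, alpha, beta):
    """Sum over observed columns only, then add the two adjustments."""
    q_obs = len(observed)
    # A zero-count column contributes r * lgamma(0 + beta) and lgamma(0 + alpha).
    gamma_counts_adj = (q - q_obs) * r * lgamma(beta)
    gamma_conds_adj = (q - q_obs) * lgamma(alpha)
    counts_term, conds_term = gamma_terms_full(observed, alpha, beta)
    return counts_term + gamma_counts_adj, conds_term + gamma_conds_adj


r, q, ess = 2, 4, 5
observed = [[3, 1], [0, 2]]  # N_ijk for the two observed parent configurations
alpha = ess / len(observed)
beta = ess / (r * q)

# Dense table: the observed columns plus one all-zero column per unobserved configuration.
padded = observed + [[0] * r for _ in range(q - len(observed))]

full = gamma_terms_full(padded, alpha, beta)
sparse = gamma_terms_sparse(observed, q, r, alpha, beta)
assert all(isclose(a, b) for a, b in zip(full, sparse))
```

The two routes agree because a zero count leaves only \(\log \Gamma(\beta)\) or \(\log \Gamma(\alpha)\) behind, so the unobserved columns never need to be materialized.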

Parameters:
data : pandas.DataFrame

DataFrame where each column represents a discrete variable. Missing values should be set to numpy.nan.

equivalent_sample_size : int, optional

Equivalent sample size used to define the Dirichlet hyperparameters. Defaults to 10.

state_names : dict, optional

Dictionary mapping each variable to its discrete states. If not specified, the unique values observed in the data are used.

Raises:
ValueError

If the data contains non-discrete variables, or if the model variables are not present in the data.

References

[1]

Scutari, Marco. An Empirical-Bayes Score for Discrete Bayesian Networks. Journal of Machine Learning Research, 2016, pp. 438-48.

Examples

>>> import pandas as pd
>>> from pgmpy.models import DiscreteBayesianNetwork
>>> from pgmpy.structure_score import BDs
>>> data = pd.DataFrame(
...     {"A": [0, 1, 1, 0], "B": [1, 0, 1, 0], "C": [1, 1, 1, 0]}
... )
>>> model = DiscreteBayesianNetwork([("A", "B"), ("A", "C")])
>>> score = BDs(data, equivalent_sample_size=5)
>>> round(score.score(model), 3)
np.float64(-12.857)
>>> round(score.local_score("B", ("A",)), 3)
np.float64(-3.446)

structure_prior(model) → float

Compute the marginal uniform prior for a structure.
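A sketch of what a marginal uniform prior looks like in log space, assuming the definition in Scutari (2016): each unordered pair of nodes independently has no edge with probability 1/2 and an edge in either direction with probability 1/4 each. The function name and its integer arguments are hypothetical, and whether pgmpy computes the value in exactly this form is an assumption.

```python
from math import log


def marginal_uniform_log_prior(n_nodes, n_edges):
    """Log prior of a DAG under the marginal uniform prior (illustrative only)."""
    n_pairs = n_nodes * (n_nodes - 1) // 2  # unordered node pairs
    # Each present edge has probability 1/4, each absent pair probability 1/2.
    return n_edges * log(0.25) + (n_pairs - n_edges) * log(0.5)


# The example network A -> B, A -> C has three nodes and two edges.
log_prior = marginal_uniform_log_prior(3, 2)
print(round(log_prior, 3))
```

For this network the expression reduces to \(-5 \log 2\), roughly -3.466, so denser graphs receive a lower log prior.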

structure_prior_ratio(operation) → float

Compute the prior ratio for a graph edit.
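Under the same marginal uniform assumption (an edge in a given direction has probability 1/4, an absent edge 1/2), the change in log prior caused by a single graph edit can be sketched as follows. The operation labels "add", "remove" and "flip" are hypothetical: the encoding that structure_prior_ratio expects for its operation argument is not stated here, and working in log space is also an assumption.

```python
from math import log


def marginal_uniform_log_prior_ratio(operation):
    """Log prior ratio (new graph over old graph) for one edit (illustrative only)."""
    if operation == "add":  # absent (1/2) -> present (1/4)
        return log(0.25) - log(0.5)
    if operation == "remove":  # present (1/4) -> absent (1/2)
        return log(0.5) - log(0.25)
    if operation == "flip":  # one direction (1/4) -> the other (1/4)
        return 0.0
    raise ValueError(f"unknown operation: {operation!r}")


for op in ("add", "remove", "flip"):
    print(op, round(marginal_uniform_log_prior_ratio(op), 3))
```

Adding an edge therefore costs \(\log 2\), removing one gains \(\log 2\), and reversing one leaves the prior unchanged.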