BICGauss

class pgmpy.structure_score.BICGauss(data, state_names=None)

Bases: LogLikelihoodGauss

BIC structure score for Gaussian Bayesian networks.

This score subtracts a complexity penalty from the Gaussian log-likelihood to discourage overfitting. The local score for a variable \(X_i\) with parent set \(\Pi_i\) is computed as:

\[\operatorname{BIC}(X_i, \Pi_i) = \ell(X_i, \Pi_i) - \frac{d_i}{2} \log n,\]

where \(\ell(X_i, \Pi_i)\) is the fitted Gaussian log-likelihood, \(d_i = \text{df\_model} + 2\) is the effective parameter count used by the implementation, and \(n\) is the number of rows in self.data.

Here df_model is the statsmodels degree-of-freedom count for the fitted regressors and excludes the intercept. The additional + 2 accounts for one intercept parameter and one Gaussian variance parameter.
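The formula above can be sketched with NumPy alone. This is a minimal illustration, not pgmpy's implementation: it assumes a least-squares fit with an intercept, the maximum-likelihood variance estimate (residual sum of squares divided by \(n\)), the natural logarithm, and linearly independent parent columns, so that df_model equals the number of parents. The helper name `bic_gauss_local` is hypothetical.

```python
import numpy as np


def bic_gauss_local(y, X):
    """Hypothetical helper: BIC local score of y given parent columns X.

    y has shape (n,); X has shape (n, k), with k = 0 for no parents.
    Assumes the columns of X are linearly independent.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape

    # Least-squares fit with an explicit intercept column.
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta

    # Gaussian log-likelihood at the maximum-likelihood variance estimate.
    sigma2 = resid @ resid / n
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

    # d = df_model + 2: k slopes, one intercept, one variance.
    d = k + 2
    return loglik - 0.5 * d * np.log(n)


rng = np.random.default_rng(0)
n = 100
parents = rng.normal(size=(n, 2))
child = 1.5 * parents[:, 0] - 0.5 * parents[:, 1] + rng.normal(size=n)

score_with_parents = bic_gauss_local(child, parents)
score_no_parents = bic_gauss_local(child, np.empty((n, 0)))
print(score_with_parents, score_no_parents)
```

With two parents and \(n = 100\), \(d_i = 4\), so the penalty term is \(2 \log 100 \approx 9.21\). Under the assumptions above, the likelihood gain from the true parents should outweigh that penalty in this synthetic data.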

Parameters:
data : pandas.DataFrame

DataFrame where each column represents a continuous variable.

state_names : dict, optional

Accepted for API consistency but not typically used for Gaussian networks.

Raises:
ValueError

If the model cannot be fitted because the data contains incompatible or non-numeric variables.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.structure_score import BICGauss
>>> rng = np.random.default_rng(0)
>>> data = pd.DataFrame(
...     {
...         "A": rng.normal(size=100),
...         "B": rng.normal(size=100),
...         "C": rng.normal(size=100),
...     }
... )
>>> score = BICGauss(data)
>>> round(score.local_score("B", ("A", "C")), 3)
np.float64(-146.37)