Linear Gaussian Bayesian Network
- class pgmpy.models.LinearGaussianBayesianNetwork.LinearGaussianBayesianNetwork(ebunch=None, latents={})
A Linear Gaussian Bayesian Network is a Bayesian Network in which every variable is continuous and every CPD is a linear Gaussian.
An important result is that linear Gaussian Bayesian Networks are an alternative representation for the class of multivariate Gaussian distributions.
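For reference, the standard form of a linear Gaussian CPD can be written as below. The symbols \beta_0, \beta_i and \sigma^2 are notation introduced here for illustration, not names from the pgmpy API. The mean of the node is a linear function of its parents' values u_1, ..., u_k, and the variance does not depend on the parents.

```latex
P(X \mid u_1, \dots, u_k) = \mathcal{N}\!\left(\beta_0 + \sum_{i=1}^{k} \beta_i u_i \,;\ \sigma^2\right)
```

Under this reading, a CPD such as LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1']) in the examples on this page corresponds to \beta_0 = -5, \beta_1 = 0.5 and \sigma^2 = 4, which agrees with the mean and covariance shown in the to_joint_gaussian example.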
- add_cpds(*cpds)
Adds one or more linear Gaussian CPDs (Conditional Probability Distributions) to the Bayesian Network.
- Parameters:
cpds (instances of LinearGaussianCPD) – List of LinearGaussianCPDs which will be associated with the model
Examples
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> for cpd in model.cpds:
...     print(cpd)
P(x1) = N(1; 4)
P(x2 | x1) = N(0.5*x1 + -5; 4)
P(x3 | x2) = N(-1*x2 + 4; 3)
- check_model()
Checks the model for errors. This method verifies that the CPDs associated with the nodes are consistent with their parents.
- Returns:
check – True if all the checks pass.
- Return type:
boolean
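The consistency check described above can be sketched in plain Python. This is only an illustration of the idea, not pgmpy's implementation: the dictionaries `parents` and `cpd_evidence` and the helper `cpds_match_parents` are hypothetical stand-ins for the model's graph and CPDs, and raising ValueError on a mismatch is an assumption of the sketch.

```python
# Hypothetical stand-ins for the model: the parents of each node in the graph,
# and the evidence (parent) list declared by each node's CPD.
parents = {"x1": [], "x2": ["x1"], "x3": ["x2"]}
cpd_evidence = {"x1": [], "x2": ["x1"], "x3": ["x2"]}


def cpds_match_parents(parents, cpd_evidence):
    """Return True if every node has a CPD whose evidence equals its parents."""
    for node, node_parents in parents.items():
        if node not in cpd_evidence:
            raise ValueError(f"No CPD associated with {node}")
        if set(cpd_evidence[node]) != set(node_parents):
            raise ValueError(
                f"CPD for {node} does not match its parents {node_parents}"
            )
    return True


ok = cpds_match_parents(parents, cpd_evidence)
```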
- fit(data, method='mle')
Estimates the parameters of the model using the given data.
- Parameters:
data (pd.DataFrame) – A pandas DataFrame with the data to which to fit the model structure. All variables must be continuous valued.
- Returns:
None – The estimated LinearGaussianCPDs are added to the model. They can be accessed using model.cpds.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> df = pd.DataFrame(np.random.normal(0, 1, (100, 3)), columns=['x1', 'x2', 'x3'])
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> model.fit(df)
>>> model.cpds
[<LinearGaussianCPD: P(x1) = N(-0.114; 0.911) at 0x7eb77d30cec0,
 <LinearGaussianCPD: P(x2 | x1) = N(0.07*x1 + -0.075; 1.172) at 0x7eb77171fb60,
 <LinearGaussianCPD: P(x3 | x2) = N(0.006*x2 + -0.1; 0.922) at 0x7eb6abbdba10]
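To make the 'mle' method more concrete, the sketch below computes the maximum likelihood estimate of a linear Gaussian CPD with a single parent, which reduces to ordinary least squares plus the mean squared residual as the variance. The helper `fit_single_parent` is hypothetical, and the source does not state that pgmpy uses exactly this estimator (for example, the same variance normalization), so treat it as an illustration of the idea only.

```python
import random


def fit_single_parent(xs, ys):
    """MLE of y = b0 + b1 * x + noise, with noise ~ N(0, var)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx                  # slope (weight of the parent)
    b0 = mean_y - b1 * mean_x       # intercept
    # The maximum likelihood variance divides by n, not n - 1.
    var = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / n
    return b0, b1, var


# Synthetic data from x2 = -5 + 0.5 * x1 + noise, with noise variance 4.
rng = random.Random(0)
xs = [rng.gauss(1.0, 2.0) for _ in range(5000)]
ys = [-5.0 + 0.5 * x + rng.gauss(0.0, 2.0) for x in xs]
b0, b1, var = fit_single_parent(xs, ys)
```

With enough samples, the estimates should land close to the generating values (intercept -5, weight 0.5, variance 4).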
- get_cpds(node=None)
Returns the CPD of the given node. If node is not specified, returns all the CPDs that have been added to the model so far.
- Parameters:
node (any hashable python object, optional) – The node whose CPD we want. If node is not specified, all the CPDs added to the model are returned.
- Return type:
A list of linear Gaussian CPDs.
Examples
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> model.get_cpds()
- get_random_cpds(loc=0, scale=1, seed=None)
Generates random Linear Gaussian CPDs for the model. The coefficients are sampled from a normal distribution with mean loc and standard deviation scale.
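As a rough illustration of the sampling described above, the following sketch draws one intercept and one weight per parent from a normal distribution with mean loc and standard deviation scale. The helper `random_coefficients` and the `parents` dictionary are hypothetical and do not come from pgmpy. How the library chooses each CPD's variance is not stated here, so the sketch leaves variances out.

```python
import random


def random_coefficients(parents, loc=0.0, scale=1.0, seed=None):
    """Draw [intercept, weight_1, ..., weight_k] for every node."""
    rng = random.Random(seed)
    # One intercept plus one weight per parent, each drawn from N(loc, scale).
    return {
        node: [rng.gauss(loc, scale) for _ in range(1 + len(node_parents))]
        for node, node_parents in parents.items()
    }


parents = {"x1": [], "x2": ["x1"], "x3": ["x2"]}
coefs = random_coefficients(parents, loc=0, scale=1, seed=42)
```

Passing the same seed reproduces the same coefficients, which is the usual reason for exposing a seed argument.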
- is_imap(JPD)
For now, is_imap method has not been implemented for LinearGaussianBayesianNetwork.
- predict(data, distribution='joint')
Predicts the distribution of the missing variables (i.e. missing columns) in the given dataset.
- Parameters:
data (pandas.DataFrame) – The dataframe with missing variables (columns) whose distribution is to be predicted.
- Returns:
variables (list) – The list of variables on which the returned conditional distribution is defined.
mu (np.array) – The mean array of the conditional joint distribution over the missing variables corresponding to each row of data.
cov (np.array) – The covariance of the conditional joint distribution over the missing variables.
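The returned quantities can be understood through standard Gaussian conditioning. The sketch below is an assumption about the underlying math rather than a description of pgmpy's code: it takes the joint mean and covariance from the to_joint_gaussian example on this page and conditions x3 on observed values of x1 and x2. The helper `condition_on_first_two` is hypothetical.

```python
# Joint Gaussian over (x1, x2, x3), taken from the to_joint_gaussian example.
mean = [1.0, -4.5, 8.5]
cov = [
    [4.0, 2.0, -2.0],
    [2.0, 5.0, -5.0],
    [-2.0, -5.0, 8.0],
]


def condition_on_first_two(mean, cov, x1, x2):
    """Mean and variance of x3 given observed values of x1 and x2."""
    # Invert the 2x2 covariance block of the observed variables.
    a, b, c, d = cov[0][0], cov[0][1], cov[1][0], cov[1][1]
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # Regression weights: Sigma_mo @ inv(Sigma_oo).
    s_mo = [cov[2][0], cov[2][1]]
    w = [
        s_mo[0] * inv[0][0] + s_mo[1] * inv[1][0],
        s_mo[0] * inv[0][1] + s_mo[1] * inv[1][1],
    ]
    mu = mean[2] + w[0] * (x1 - mean[0]) + w[1] * (x2 - mean[1])
    # The conditional variance does not depend on the observed values.
    var = cov[2][2] - (w[0] * s_mo[0] + w[1] * s_mo[1])
    return mu, var


mu, var = condition_on_first_two(mean, cov, x1=2.0, x2=-3.0)
```

Here the conditional mean comes out as 4 - x2 and the conditional variance as 3, which matches the CPD of x3 used in the examples. Because the conditional covariance does not depend on the observed values, a single cov can be shared by all rows while mu varies per row.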
- remove_cpds(*cpds)
Removes the CPDs that are provided in the argument.
- Parameters:
*cpds (LinearGaussianCPD objects) – The LinearGaussianCPDs to remove from the model.
Examples
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> for cpd in model.get_cpds():
...     print(cpd)
P(x1) = N(1; 4)
P(x2 | x1) = N(0.5*x1 + -5; 4)
P(x3 | x2) = N(-1*x2 + 4; 3)
>>> model.remove_cpds(cpd2, cpd3)
>>> for cpd in model.get_cpds():
...     print(cpd)
P(x1) = N(1; 4)
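- simulate(n=1000, seed=None)
Simulates data from the given model.
- Parameters:
n (int) – The number of samples to generate (default: 1000).
seed (int, optional) – Seed for the random number generator (default: None).
- Returns:
generated samples – A pandas data frame with the generated samples.
- Return type:
pandas.DataFrame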
Examples
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> model.simulate(n=500, seed=42)
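One way to picture what simulating from this model means is ancestral sampling: draw each variable in topological order, conditioning on the values already drawn for its parents. The sketch below hard-codes the three CPDs from the example and uses only the standard library. It is not pgmpy's implementation (the library could equally sample from the equivalent joint Gaussian), and it assumes the third CPD argument is a variance, as the to_joint_gaussian example on this page suggests. The function `simulate_chain` is hypothetical.

```python
import random


def simulate_chain(n=1000, seed=None):
    """Ancestral sampling for the chain x1 -> x2 -> x3 used in the examples."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # random.gauss expects a standard deviation, so take sqrt(variance).
        x1 = rng.gauss(1.0, 4.0 ** 0.5)
        x2 = rng.gauss(-5.0 + 0.5 * x1, 4.0 ** 0.5)
        x3 = rng.gauss(4.0 - 1.0 * x2, 3.0 ** 0.5)
        rows.append({"x1": x1, "x2": x2, "x3": x3})
    return rows


samples = simulate_chain(n=20000, seed=42)
mean_x1 = sum(row["x1"] for row in samples) / len(samples)
mean_x3 = sum(row["x3"] for row in samples) / len(samples)
```

The sample means should approach the joint mean from the to_joint_gaussian example (1 for x1 and 8.5 for x3) as n grows.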
- to_joint_gaussian()
Linear Gaussian Bayesian Networks can be represented using a joint Gaussian distribution over all the variables. This method gives the mean and covariance of this equivalent joint Gaussian distribution.
- Returns:
mean, cov – The mean and the covariance matrix of the joint Gaussian distribution.
- Return type:
np.ndarray, np.ndarray
Examples
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> mean, cov = model.to_joint_gaussian()
>>> mean
array([[ 1. ],
       [-4.5],
       [ 8.5]])
>>> cov
array([[ 4.,  2., -2.],
       [ 2.,  5., -5.],
       [-2., -5.,  8.]])
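The numbers in this example can be reproduced by hand. The sketch below walks the nodes in topological order and builds the joint mean and covariance from each CPD's intercept, parent weights and variance. It is a minimal pure-Python illustration under the assumption that the third CPD argument is a variance. The `cpds` dictionary layout and the `joint_gaussian` helper are hypothetical and not part of pgmpy.

```python
# node -> (intercept, {parent: weight}, variance), listed in topological order.
cpds = {
    "x1": (1.0, {}, 4.0),
    "x2": (-5.0, {"x1": 0.5}, 4.0),
    "x3": (4.0, {"x2": -1.0}, 3.0),
}


def joint_gaussian(cpds):
    """Mean vector and covariance matrix of the equivalent joint Gaussian."""
    order = list(cpds)  # assumed to already be a topological order
    mean = {}
    cov = {a: {b: 0.0 for b in order} for a in order}
    for i, node in enumerate(order):
        intercept, weights, var = cpds[node]
        # The mean is the intercept plus the weighted means of the parents.
        mean[node] = intercept + sum(w * mean[p] for p, w in weights.items())
        # Covariance with each earlier node is a weighted sum over the parents.
        for prev in order[:i]:
            c = sum(w * cov[p][prev] for p, w in weights.items())
            cov[node][prev] = cov[prev][node] = c
        # Own variance: noise variance plus the variance inherited from parents.
        cov[node][node] = var + sum(w * cov[node][p] for p, w in weights.items())
    mean_vector = [mean[v] for v in order]
    cov_matrix = [[cov[a][b] for b in order] for a in order]
    return mean_vector, cov_matrix


mean, cov = joint_gaussian(cpds)
```

For the three CPDs above this yields the mean (1, -4.5, 8.5) and the same covariance matrix as shown in the example.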