Linear Gaussian Bayesian Network

class pgmpy.models.LinearGaussianBayesianNetwork.LinearGaussianBayesianNetwork(ebunch=None, latents={})[source]

A Linear Gaussian Bayesian Network is a Bayesian Network in which all variables are continuous and all CPDs are linear Gaussians.

An important result is that linear Gaussian Bayesian Networks are an alternative representation for the class of multivariate Gaussian distributions.
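For reference, the two statements above can be written out in the usual textbook form. The symbols used here (the intercept \beta_{i,0}, the coefficients \beta_{i,j}, the variance \sigma_i^2 and the parent set \mathrm{Pa}(X_i)) are notation chosen for this sketch, not identifiers from pgmpy.

```latex
% A linear Gaussian CPD: the mean is linear in the parents' values,
% the variance does not depend on the parents.
P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
  = \mathcal{N}\Bigl(\beta_{i,0} + \sum_{X_j \in \mathrm{Pa}(X_i)} \beta_{i,j}\, x_j \;;\; \sigma_i^2\Bigr)

% The product of such CPDs over the DAG is a multivariate Gaussian.
P(X_1, \dots, X_n)
  = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
  = \mathcal{N}(\mu, \Sigma)
```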

add_cpds(*cpds)[source]

Adds linear Gaussian CPDs (Conditional Probability Distributions) to the Bayesian Network.

Parameters:

cpds (instances of LinearGaussianCPD) – List of LinearGaussianCPDs which will be associated with the model

Examples

>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> for cpd in model.cpds:
...     print(cpd)

P(x1) = N(1; 4)
P(x2| x1) = N(0.5*x1_mu); -5)
P(x3| x2) = N(-1*x2_mu); 4)

check_model()[source]

Checks the model for errors. This method currently performs the following check:

  • Checks if the CPDs associated with nodes are consistent with their parents.
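As a rough illustration of what such a consistency check involves, the sketch below compares each CPD's evidence list with the node's parents in the graph. It does not call pgmpy; the dictionaries `graph_parents` and `cpd_evidence` and the helper `inconsistent_nodes` are hypothetical stand-ins, and how pgmpy reports a failed check is not covered here.

```python
# Hypothetical stand-ins for the graph structure and the CPDs' evidence lists.
graph_parents = {"x1": [], "x2": ["x1"], "x3": ["x2"]}
cpd_evidence = {"x1": [], "x2": ["x1"], "x3": ["x1"]}  # x3's CPD names the wrong parent


def inconsistent_nodes(graph_parents, cpd_evidence):
    """Return nodes whose CPD is missing or whose evidence differs from the graph's parents."""
    bad = []
    for node, parents in graph_parents.items():
        if node not in cpd_evidence or set(cpd_evidence[node]) != set(parents):
            bad.append(node)
    return bad


print(inconsistent_nodes(graph_parents, cpd_evidence))  # ['x3']
```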

Returns:

check – True if all the checks pass.

Return type:

boolean

fit(data, method='mle')[source]

Estimates the parameters of the model using the given data.
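To give a feel for what maximum-likelihood estimation of a single linear Gaussian CPD amounts to, here is a minimal NumPy sketch: an ordinary least-squares fit of the child on its parents, followed by the residual variance. It is an illustration under stated assumptions, not necessarily how pgmpy implements fit; the helper `fit_linear_gaussian` and the synthetic data are hypothetical.

```python
import numpy as np

# Synthetic data from a known CPD: x2 | x1 ~ N(-5 + 0.5 * x1; variance 4).
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(1.0, 2.0, size=n)
x2 = -5.0 + 0.5 * x1 + rng.normal(0.0, 2.0, size=n)


def fit_linear_gaussian(child, parents):
    """Least-squares estimate of [intercept, coefficients...] and the residual variance."""
    design = np.column_stack([np.ones(len(child))] + list(parents))
    beta, *_ = np.linalg.lstsq(design, child, rcond=None)
    residuals = child - design @ beta
    return beta, residuals.var()  # .var() is the (biased) maximum-likelihood variance


beta, variance = fit_linear_gaussian(x2, [x1])
```

With enough samples, `beta` should land close to `[-5, 0.5]` and `variance` close to 4. A root node is handled by passing an empty parent list, which reduces the fit to the sample mean and variance.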

Parameters:

data (pd.DataFrame) – A pandas DataFrame with the data to which to fit the model structure. All variables must be continuous valued.

Returns:

None – The estimated LinearGaussianCPDs are added to the model. They can be accessed using model.cpds.

Return type:

None

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> df = pd.DataFrame(np.random.normal(0, 1, (100, 3)), columns=['x1', 'x2', 'x3'])
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> model.fit(df)
>>> model.cpds
[<LinearGaussianCPD: P(x1) = N(-0.114; 0.911) at 0x7eb77d30cec0,
 <LinearGaussianCPD: P(x2 | x1) = N(0.07*x1 + -0.075; 1.172) at 0x7eb77171fb60,
 <LinearGaussianCPD: P(x3 | x2) = N(0.006*x2 + -0.1; 0.922) at 0x7eb6abbdba10]

get_cardinality(node)[source]

Cardinality is not defined for continuous variables.

get_cpds(node=None)[source]

Returns the CPD of the node. If node is not specified, returns all the CPDs that have been added to the model so far.

Parameters:

node (any hashable python object, optional) – The node whose CPD we want. If node is not specified, returns all the CPDs added to the model.

Return type:

A list of linear Gaussian CPDs.

Examples

>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> model.get_cpds()

static get_random(n_nodes=5, edge_prob=0.5, node_names=None, latents=False, loc=0, scale=1, seed=None)[source]

Returns a randomly generated Linear Gaussian Bayesian Network on n_nodes variables with an edge probability of edge_prob between variables.
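One common way to sample a random DAG with a given edge probability is to fix an ordering of the nodes and include each forward edge independently. The sketch below shows that idea in NumPy; it is an assumption about the general technique rather than a copy of pgmpy's generator, and `random_dag_edges` is a hypothetical helper.

```python
import numpy as np


def random_dag_edges(n_nodes=5, edge_prob=0.5, seed=None):
    """Sample edges i -> j only for i < j, so the result is acyclic by construction."""
    rng = np.random.default_rng(seed)
    return [
        (i, j)
        for i in range(n_nodes)
        for j in range(i + 1, n_nodes)
        if rng.random() < edge_prob
    ]


edges = random_dag_edges(n_nodes=5, edge_prob=0.5, seed=0)
```

Because every edge points from a lower to a higher index, the node order 0, 1, ..., n_nodes - 1 is always a valid topological order of the sampled graph.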

Parameters:
  • n_nodes (int) – The number of nodes in the randomly generated DAG.

  • edge_prob (float) – The probability of an edge between any two nodes in the topologically sorted DAG.

  • node_names (list (default: None)) – A list of variable names to use in the random graph. If None, the node names are integer values starting from 0.

  • latents (bool (default: False)) – If True, also creates latent variables.

  • loc (float) – The mean of the normal distribution from which the coefficients are sampled.

  • scale (float) – The standard deviation of the normal distribution from which the coefficients are sampled.

  • seed (int) – The seed for the random number generator.

Returns:

Random model – The randomly generated model, with randomly generated LinearGaussianCPDs attached.

Return type:

LinearGaussianBayesianNetwork

Examples

>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> model = LinearGaussianBayesianNetwork.get_random(n_nodes=5)
>>> model.nodes()
NodeView((0, 3, 1, 2, 4))
>>> model.edges()
OutEdgeView([(0, 3), (3, 4), (1, 3), (2, 4)])
>>> model.cpds
[<LinearGaussianCPD: P(0) = N(1.764; 1.613) at 0x2732f41aae0,
 <LinearGaussianCPD: P(3 | 0, 1) = N(-0.721*0 + -0.079*1 + 0.943; 0.12) at 0x2732f16db20,
 <LinearGaussianCPD: P(1) = N(-0.534; 0.208) at 0x2732f320b30,
 <LinearGaussianCPD: P(2) = N(-0.023; 0.166) at 0x2732d8d5f40,
 <LinearGaussianCPD: P(4 | 2, 3) = N(-0.24*2 + -0.907*3 + 0.625; 0.48) at 0x2737fecdaf0]

get_random_cpds(loc=0, scale=1, inplace=False, seed=None)[source]

Generates random Linear Gaussian CPDs for the model. The coefficients are sampled from a normal distribution with mean loc and standard deviation scale.

Parameters:
  • loc (float) – The mean of the normal distribution from which the coefficients are sampled.

  • scale (float) – The standard deviation of the normal distribution from which the coefficients are sampled.

  • inplace (bool (default: False)) – If inplace=True, adds the generated LinearGaussianCPDs to the model itself, else creates a copy of the model.

  • seed (int) – The seed for the random number generator.
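The role of loc, scale and seed can be sketched as follows: for every node, one intercept and one coefficient per parent are drawn from a normal distribution. This is a hedged illustration only. The helper `random_coefficients` and the `graph_parents` dictionary are hypothetical, drawing the intercept from the same distribution is an assumption, and the sketch does not cover how the noise variance of each CPD is chosen.

```python
import numpy as np


def random_coefficients(graph_parents, loc=0.0, scale=1.0, seed=None):
    """Draw an intercept plus one coefficient per parent for every node."""
    rng = np.random.default_rng(seed)
    return {
        node: rng.normal(loc, scale, size=1 + len(parents))
        for node, parents in graph_parents.items()
    }


graph_parents = {"x1": [], "x2": ["x1"], "x3": ["x2"]}  # hypothetical structure
coefs = random_coefficients(graph_parents, loc=0.0, scale=1.0, seed=0)
```

Passing the same seed reproduces the same coefficients, which is the usual reason for exposing a seed parameter.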

is_imap(JPD)[source]

For now, is_imap method has not been implemented for LinearGaussianBayesianNetwork.

predict(data, distribution='joint')[source]

Predicts the distribution of the missing variables (i.e. missing columns) in the given dataset.

Parameters:

data (pandas.DataFrame) – The dataframe whose missing variables (columns) are to be predicted.

Returns:

  • variables (list) – The list of variables on which the returned conditional distribution is defined.

  • mu (np.array) – The mean array of the conditional joint distribution over the missing variables corresponding to each row of data.

  • cov (np.array) – The covariance of the conditional joint distribution over the missing variables.
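The returned mean and covariance correspond to the standard formulas for conditioning a joint Gaussian on observed values. The NumPy sketch below applies those formulas to the joint distribution of the x1 -> x2 -> x3 chain used in the other examples; it does not call pgmpy, `condition_gaussian` is a hypothetical helper, and whether predict computes its result exactly this way is an assumption.

```python
import numpy as np

# Joint mean and covariance of the x1 -> x2 -> x3 chain from the other examples.
mean = np.array([1.0, -4.5, 8.5])
cov = np.array([[4.0, 2.0, -2.0], [2.0, 5.0, -5.0], [-2.0, -5.0, 8.0]])


def condition_gaussian(mean, cov, observed_idx, observed_values):
    """Conditional mean (one row per data row) and covariance of the unobserved variables."""
    missing_idx = [i for i in range(len(mean)) if i not in observed_idx]
    s_mo = cov[np.ix_(missing_idx, observed_idx)]
    s_oo = cov[np.ix_(observed_idx, observed_idx)]
    s_mm = cov[np.ix_(missing_idx, missing_idx)]
    gain = s_mo @ np.linalg.inv(s_oo)
    mu = mean[missing_idx] + (observed_values - mean[observed_idx]) @ gain.T
    cov_cond = s_mm - gain @ s_mo.T  # does not depend on the observed values
    return missing_idx, mu, cov_cond


observed = np.array([[3.0], [1.0]])  # two data rows, only x1 observed
missing_idx, mu, cov_cond = condition_gaussian(mean, cov, [0], observed)
```

Note that the conditional mean differs per row while the conditional covariance is shared by all rows, which matches the shapes described for mu and cov above.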

remove_cpds(*cpds)[source]

Removes the CPDs that are provided in the argument.

Parameters:

*cpds (LinearGaussianCPD object) – One or more LinearGaussianCPD objects, each defined on a subset of the model's variables, to be removed from the model.

Examples

>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> for cpd in model.get_cpds():
...     print(cpd)

P(x1) = N(1; 4)
P(x2| x1) = N(0.5*x1_mu); -5)
P(x3| x2) = N(-1*x2_mu); 4)

>>> model.remove_cpds(cpd2, cpd3)
>>> for cpd in model.get_cpds():
...     print(cpd)

P(x1) = N(1; 4)

simulate(n=1000, seed=None)[source]

Simulates data from the given model.
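One way to draw samples from such a model is ancestral sampling: visit the nodes in topological order and sample each one given the already sampled values of its parents. The NumPy sketch below does this for the x1 -> x2 -> x3 chain used in the examples. It is not necessarily the strategy pgmpy uses, `simulate_chain` is a hypothetical helper, and it treats the third CPD argument as a variance, as the covariance in the to_joint_gaussian example implies.

```python
import numpy as np


def simulate_chain(n=1000, seed=None):
    """Ancestral sampling for x1 -> x2 -> x3 with the example's CPD parameters."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(1.0, np.sqrt(4.0), size=n)                    # P(x1) = N(1; 4)
    x2 = -5.0 + 0.5 * x1 + rng.normal(0.0, np.sqrt(4.0), size=n)  # P(x2 | x1)
    x3 = 4.0 - 1.0 * x2 + rng.normal(0.0, np.sqrt(3.0), size=n)   # P(x3 | x2)
    return np.column_stack([x1, x2, x3])


samples = simulate_chain(n=20000, seed=42)
```

The column means and the sample covariance of `samples` should approach the joint mean and covariance of the model as n grows.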

Parameters:
  • n (int) – The number of samples to draw from the model.

  • seed (int (default: None)) – Seed for the random number generator.

Returns:

A pandas DataFrame with the generated samples.

Return type:

pandas.DataFrame

Examples

>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> model.simulate(n=500, seed=42)

to_joint_gaussian()[source]

Linear Gaussian Bayesian Networks can be represented using a joint Gaussian distribution over all the variables. This method gives the mean and covariance of this equivalent joint Gaussian distribution.
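The conversion can be reproduced by hand by writing the network as a system of linear equations x = b + W x + e and solving for x. The NumPy sketch below does this for the x1 -> x2 -> x3 chain from the example and arrives at the same mean and covariance values shown there. The variable names are hypothetical, the third CPD argument is treated as a variance, and this is one standard derivation rather than a description of pgmpy's internal code.

```python
import numpy as np

# Structural form x = intercepts + weights @ x + noise, nodes ordered x1, x2, x3.
intercepts = np.array([1.0, -5.0, 4.0])
weights = np.zeros((3, 3))
weights[1, 0] = 0.5   # x2 depends on x1 with coefficient 0.5
weights[2, 1] = -1.0  # x3 depends on x2 with coefficient -1
noise_var = np.diag([4.0, 4.0, 3.0])

# Solving for x gives x = (I - W)^{-1} (intercepts + noise).
transform = np.linalg.inv(np.eye(3) - weights)
mean = transform @ intercepts               # [1, -4.5, 8.5]
cov = transform @ noise_var @ transform.T   # [[4, 2, -2], [2, 5, -5], [-2, -5, 8]]
```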

Returns:

mean, cov – The mean and the covariance matrix of the joint Gaussian distribution.

Return type:

np.ndarray, np.ndarray

Examples

>>> from pgmpy.models import LinearGaussianBayesianNetwork
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> model = LinearGaussianBayesianNetwork([('x1', 'x2'), ('x2', 'x3')])
>>> cpd1 = LinearGaussianCPD('x1', [1], 4)
>>> cpd2 = LinearGaussianCPD('x2', [-5, 0.5], 4, ['x1'])
>>> cpd3 = LinearGaussianCPD('x3', [4, -1], 3, ['x2'])
>>> model.add_cpds(cpd1, cpd2, cpd3)
>>> mean, cov = model.to_joint_gaussian()
>>> mean
array([[ 1. ],
       [-4.5],
       [ 8.5]])
>>> cov
array([[ 4.,  2., -2.],
       [ 2.,  5., -5.],
       [-2., -5.,  8.]])

to_markov_model()[source]

For now, to_markov_model method has not been implemented for LinearGaussianBayesianNetwork.