SEM

class pgmpy.models.SEM(syntax, **kwargs)

Bases: SEMGraph

Class for representing Structural Equation Models. This class is a wrapper over SEMGraph and SEMAlg to provide a consistent API over the different representations.

Attributes:
model: SEMGraph instance

A graphical representation of the model.

fit()

classmethod from_RAM(variables, B, zeta, observed=None, wedge_y=None, fixed_values=None)

Initializes a SEM instance using Reticular Action Model (RAM) notation. The model is defined as:

\[
\begin{aligned}
\mathbf{\eta} &= \mathbf{B \eta} + \mathbf{\epsilon} \\
\mathbf{y} &= \wedge_y \mathbf{\eta} \\
\zeta &= COV(\mathbf{\epsilon})
\end{aligned}
\]

where \(\mathbf{\eta}\) is the set of variables (both latent and observed), \(\mathbf{\epsilon}\) are the error terms, \(\mathbf{y}\) is the set of observed variables, and \(\wedge_y\) is a boolean array of shape (number of observed variables, total number of variables).
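
As a worked consequence of the definition above (a standard derivation, not something stated in the original docstring), solving the first equation for \(\mathbf{\eta}\) gives the covariance of the observed variables implied by the model. This assumes \(\mathbf{B}\) holds the real-valued coefficients rather than the boolean mask, and that \(\mathbf{I} - \mathbf{B}\) is invertible:

```latex
\begin{aligned}
\mathbf{\eta} &= (\mathbf{I} - \mathbf{B})^{-1} \mathbf{\epsilon} \\
COV(\mathbf{\eta}) &= (\mathbf{I} - \mathbf{B})^{-1} \, \zeta \, (\mathbf{I} - \mathbf{B})^{-T} \\
COV(\mathbf{y}) &= \wedge_y \, (\mathbf{I} - \mathbf{B})^{-1} \, \zeta \, (\mathbf{I} - \mathbf{B})^{-T} \, \wedge_y^{T}
\end{aligned}
```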

Parameters:
variables: list, array-like

List of variables (both latent and observed) in the model.

B: 2-D boolean array (shape: `len(variables)` x `len(variables)`)

The non-zero parameters in the \(B\) matrix. Refer to the model definition above for details.

zeta: 2-D boolean array (shape: `len(variables)` x `len(variables)`)

The non-zero parameters in the \(\zeta\) (error covariance) matrix. Refer to the model definition above for details.

observed: list, array-like (optional: Either `observed` or `wedge_y` needs to be specified)

List of observed variables in the model.

wedge_y: 2-D array (shape: number of observed variables x total number of variables) (optional: Either `observed` or `wedge_y` needs to be specified)

The \(\wedge_y\) matrix. Refer to the model definition above for details.

fixed_values: dict (optional)

If specified, the given parameter values are held fixed and are not changed during estimation. A dict with the keys B and zeta.

Returns:
pgmpy.models.SEM instance: An instance of the object with initialized values.

Examples

>>> from pgmpy.models import SEM
>>> SEM.from_RAM  # TODO: Finish this
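
The RAM inputs described above can be sketched by hand for a small model. The snippet below is illustrative only and does not call pgmpy: the variable names and edges are made up, the row/column orientation of `B` is inferred from the equation \(\mathbf{\eta} = \mathbf{B \eta} + \mathbf{\epsilon}\), and the purely diagonal `zeta` is an assumption (uncorrelated errors).

```python
# Hypothetical toy model: "f" is latent, the rest are observed.
variables = ["x", "m", "y", "f"]
observed = ["x", "m", "y"]
edges = [("x", "m"), ("m", "y"), ("f", "y")]  # (cause, effect)

index = {name: i for i, name in enumerate(variables)}
n = len(variables)

# eta = B eta + eps: row i is the equation for variables[i],
# so an edge u -> v marks the entry B[v][u] as non-zero.
B = [[False] * n for _ in range(n)]
for u, v in edges:
    B[index[v]][index[u]] = True

# zeta = COV(eps): assumed diagonal here (one variance per error term);
# off-diagonal True entries would mark correlated error terms.
zeta = [[i == j for j in range(n)] for i in range(n)]

# wedge_y picks the observed variables out of eta:
# one row per observed variable, one column per variable.
wedge_y = [[variables[j] == name for j in range(n)] for name in observed]
```

Converted to 2-D arrays, masks of this shape correspond to the `B`, `zeta`, and `wedge_y` parameters listed above; the orientation should be checked against the library before relying on it.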
classmethod from_graph(ebunch, latents=[], err_corr=[], err_var={})

Initializes a SEM instance using graphical structure.

Parameters:
ebunch: list/array-like
List of edges in the form of tuples. Each tuple can have one of two forms:
  1. (u, v): Adds an edge from u to v without setting any parameter for the edge.
  2. (u, v, parameter): Adds an edge from u to v and sets the edge’s parameter to parameter.

latents: list/array-like

List of nodes which are latent. All other variables are considered observed.

err_corr: list/array-like
List of tuples representing edges between error terms. Each tuple can have one of two forms:
  1. (u, v): Adds a correlation between the error terms of u and v. Doesn’t set any variance or covariance values.
  2. (u, v, covar): Adds a correlation between the error terms of u and v and sets the parameter to covar.

err_var: dict

Dict of the form {var: variance}.

References

[1] McDonald, J. A., & Clelland, D. A. (1984). Textile Workers and Union Sentiment. Social Forces, 63(2), 502–521.

[2] https://en.wikipedia.org/wiki/Structural_equation_modeling#/media/File:Example_Structural_equation_model.svg

Examples

Defining a model (Union sentiment model [1]) without setting any parameters.

>>> from pgmpy.models import SEM
>>> sem = SEM.from_graph(
...     ebunch=[
...         ("deferenc", "unionsen"),
...         ("laboract", "unionsen"),
...         ("yrsmill", "unionsen"),
...         ("age", "deferenc"),
...         ("age", "laboract"),
...         ("deferenc", "laboract"),
...     ],
...     latents=[],
...     err_corr=[("yrsmill", "age")],
...     err_var={},
... )

Defining a model (Education [2]) with all the parameters set. To leave a parameter unset, np.nan can be passed explicitly.

>>> sem_edu = SEM.from_graph(
...     ebunch=[
...         ("intelligence", "academic", 0.8),
...         ("intelligence", "scale_1", 0.7),
...         ("intelligence", "scale_2", 0.64),
...         ("intelligence", "scale_3", 0.73),
...         ("intelligence", "scale_4", 0.82),
...         ("academic", "SAT_score", 0.98),
...         ("academic", "High_school_gpa", 0.75),
...         ("academic", "ACT_score", 0.87),
...     ],
...     latents=["intelligence", "academic"],
...     err_corr=[],
...     err_var={},
... )
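
The two accepted edge forms, (u, v) and (u, v, parameter), can be brought into one uniform shape before further processing. The helper below is a hypothetical illustration of that convention, not pgmpy's implementation; it uses NaN as the marker for an unset parameter, matching the np.nan convention mentioned above.

```python
import math


def normalize_edges(ebunch):
    """Hypothetical helper: map each edge to its parameter, NaN if unset."""
    edges = {}
    for edge in ebunch:
        if len(edge) == 2:
            u, v = edge
            parameter = math.nan  # no parameter given for this edge
        elif len(edge) == 3:
            u, v, parameter = edge
        else:
            raise ValueError(f"expected 2 or 3 items per edge, got {edge!r}")
        edges[(u, v)] = parameter
    return edges


edges = normalize_edges(
    [
        ("age", "deferenc"),
        ("intelligence", "academic", 0.8),
    ]
)
```

After this step every edge carries exactly one value, so code that consumes the edges only needs to test for NaN to tell fixed parameters from free ones.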
classmethod from_lavaan(string=None, filename=None)

Initializes a SEM instance using lavaan syntax.

Parameters:
string: str (default: None)

A lavaan-style multiline set of regression equations representing the model. Refer to http://lavaan.ugent.be/tutorial/syntax1.html for details.

filename: str (default: None)

The filename of the file containing the model in lavaan syntax.
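
To make the lavaan syntax concrete, the sketch below reads a small model string and sorts its lines into the edge, latent, and error-correlation lists used by the graphical form of a model. It is a deliberately minimal, hypothetical reader and not the parser pgmpy uses: it handles only the three basic lavaan operators (`=~` for latent variable definitions, `~` for regressions, `~~` for covariances) and ignores modifiers such as fixed coefficients.

```python
model_string = """
# measurement model
intelligence =~ scale_1 + scale_2
# regression
academic ~ intelligence
# residual covariance
scale_1 ~~ scale_2
"""


def lavaan_to_graph(text):
    """Hypothetical minimal reader for basic lavaan model syntax."""
    ebunch, latents, err_corr = [], [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        # Test the two-character operators first: both contain "~".
        for op in ("=~", "~~", "~"):
            if op in line:
                lhs, rhs = (part.strip() for part in line.split(op, 1))
                break
        else:
            raise ValueError(f"no lavaan operator found in {line!r}")
        terms = [term.strip() for term in rhs.split("+")]
        if op == "=~":  # latent lhs is measured by each term
            if lhs not in latents:
                latents.append(lhs)
            ebunch.extend((lhs, term) for term in terms)
        elif op == "~":  # lhs is regressed on each term
            ebunch.extend((term, lhs) for term in terms)
        else:  # "~~": covariance between lhs and each term
            err_corr.extend((lhs, term) for term in terms)
    return ebunch, latents, err_corr


ebunch, latents, err_corr = lavaan_to_graph(model_string)
```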

classmethod from_lisrel(var_names, params, fixed_masks=None)

Initializes a SEM instance using LISREL notation. The LISREL notation is defined as:

\[
\begin{aligned}
\mathbf{\eta} &= \mathbf{B \eta} + \mathbf{\Gamma \xi} + \mathbf{\zeta} \\
\mathbf{y} &= \mathbf{\wedge_y \eta} + \mathbf{\epsilon} \\
\mathbf{x} &= \mathbf{\wedge_x \xi} + \mathbf{\delta}
\end{aligned}
\]

where \(\mathbf{\eta}\) is the set of endogenous variables, \(\mathbf{\xi}\) is the set of exogenous variables, and \(\mathbf{y}\) and \(\mathbf{x}\) are the sets of measurement variables for \(\mathbf{\eta}\) and \(\mathbf{\xi}\) respectively. \(\mathbf{\zeta}\), \(\mathbf{\epsilon}\), and \(\mathbf{\delta}\) are the error terms for \(\mathbf{\eta}\), \(\mathbf{y}\), and \(\mathbf{x}\) respectively.

Parameters:
str_model: str (default: None)

A lavaan-style multiline set of regression equations representing the model. Refer to http://lavaan.ugent.be/tutorial/syntax1.html for details.

If None, var_names and params must be specified.

var_names: dict (default: None)

A dict with the keys eta, xi, y, and x. Each key should have a list of variable names as its value.

params: dict (default: None)

A dict of the non-zero parameters in the LISREL representation. Must contain the following keys: B, gamma, wedge_y, wedge_x, phi, theta_e, theta_del, and psi.

If None, str_model must be specified.

fixed_masks: dict (default: None)

A dict of fixed values for parameters. The shape of each entry should be the same as the corresponding entry in params.

If None, all the parameters are learnable.

Returns:
pgmpy.models.SEM instance: An instance of the object with initialized values.

Examples

>>> from pgmpy.models import SEMAlg
# TODO: Finish this example
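
The matrix shapes required by the LISREL equations above follow directly from the sizes of the four variable lists. The sketch below computes them for a made-up `var_names` dict; `lisrel_shapes` is a hypothetical helper, and it assumes the usual LISREL reading of the covariance keys (phi for \(\mathbf{\xi}\), psi for \(\mathbf{\zeta}\), theta_e for \(\mathbf{\epsilon}\), theta_del for \(\mathbf{\delta}\)), which the text above does not spell out.

```python
# Made-up variable names, grouped by the four required keys.
var_names = {
    "eta": ["academic"],
    "xi": ["intelligence"],
    "y": ["SAT_score", "ACT_score"],
    "x": ["scale_1", "scale_2", "scale_3"],
}


def lisrel_shapes(var_names):
    """Hypothetical helper: expected (rows, columns) of each LISREL matrix."""
    n_eta, n_xi = len(var_names["eta"]), len(var_names["xi"])
    n_y, n_x = len(var_names["y"]), len(var_names["x"])
    return {
        "B": (n_eta, n_eta),      # eta = B eta + ...
        "gamma": (n_eta, n_xi),   # ... + Gamma xi + zeta
        "wedge_y": (n_y, n_eta),  # y = wedge_y eta + epsilon
        "wedge_x": (n_x, n_xi),   # x = wedge_x xi + delta
        "phi": (n_xi, n_xi),      # assumed: covariance of xi
        "theta_e": (n_y, n_y),    # assumed: covariance of epsilon
        "theta_del": (n_x, n_x),  # assumed: covariance of delta
        "psi": (n_eta, n_eta),    # assumed: covariance of zeta
    }


shapes = lisrel_shapes(var_names)
```

A check like this can be run on a `params` dict before constructing a model, so that a transposed or mis-sized matrix is caught early.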