SEM
- class pgmpy.models.SEM(syntax, **kwargs)
Bases: SEMGraph
Class for representing Structural Equation Models. This class is a wrapper over SEMGraph and SEMAlg to provide a consistent API over the different representations.
- Attributes:
- model: SEMGraph instance
A graphical representation of the model.
- classmethod from_RAM(variables, B, zeta, observed=None, wedge_y=None, fixed_values=None)
Initializes a SEM instance using Reticular Action Model (RAM) notation. The model is defined as:
\[\begin{aligned}
\mathbf{\eta} &= \mathbf{B \eta} + \mathbf{\epsilon} \\
\mathbf{y} &= \wedge_y \mathbf{\eta} \\
\zeta &= COV(\mathbf{\epsilon})
\end{aligned}\]
where \(\mathbf{\eta}\) is the set of variables (both latent and observed), \(\mathbf{\epsilon}\) are the error terms, \(\mathbf{y}\) is the set of observed variables, and \(\wedge_y\) is a boolean array of shape (number of observed variables, total number of variables).
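As a worked consequence of the definition above (standard SEM algebra, not something this docstring states), assume \(\mathbf{I} - \mathbf{B}\) is invertible. The model can then be solved for \(\mathbf{\eta}\), which gives the covariance of the observed variables implied by \(B\), \(\zeta\), and \(\wedge_y\):

```latex
\begin{aligned}
(\mathbf{I} - \mathbf{B})\,\mathbf{\eta} &= \mathbf{\epsilon} \\
\mathbf{\eta} &= (\mathbf{I} - \mathbf{B})^{-1}\,\mathbf{\epsilon} \\
COV(\mathbf{y}) &= \wedge_y\,(\mathbf{I} - \mathbf{B})^{-1}\,\zeta\,(\mathbf{I} - \mathbf{B})^{-\top}\,\wedge_y^{\top}
\end{aligned}
```

The last line follows from \(\mathbf{y} = \wedge_y \mathbf{\eta}\) and \(\zeta = COV(\mathbf{\epsilon})\).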
- Parameters:
- variables: list, array-like
List of variables (both latent and observed) in the model.
- B: 2-D boolean array (shape: `len(variables)` x `len(variables)`)
The non-zero parameters in the \(B\) matrix. Refer to the model definition above for details.
- zeta: 2-D boolean array (shape: `len(variables)` x `len(variables)`)
The non-zero parameters in the \(\zeta\) (error covariance) matrix. Refer to the model definition above for details.
- observed: list, array-like (optional: Either `observed` or `wedge_y` needs to be specified)
List of observed variables in the model.
- wedge_y: 2-D array (shape: no. observed x total vars) (optional: Either `observed` or `wedge_y`)
The \(\wedge_y\) matrix. Refer to the model definition above for details.
- fixed_values: dict (optional)
If specified, fixes the given parameter values so that they are not changed during estimation. A dict with the keys B and zeta.
- Returns:
- pgmpy.models.SEM instance: An instance of the object with initialized values.
Examples
>>> from pgmpy.models import SEM
>>> SEM.from_RAM  # TODO: Finish this
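Since the example above is unfinished, here is a minimal sketch of how the inputs described under Parameters could be laid out. It uses plain Python lists and does not call pgmpy. The helper `build_wedge_y` and the variable names `xi`, `x1`, and `x2` are hypothetical. The orientation of `B` (row = child, column = parent) is an assumption read off the equation \(\mathbf{\eta} = \mathbf{B \eta} + \mathbf{\epsilon}\).

```python
def build_wedge_y(variables, observed):
    """Return a boolean selection matrix of shape (len(observed), len(variables)).

    Row r has a single True entry, in the column of the r-th observed variable,
    so that y = wedge_y @ eta picks the observed variables out of eta.
    """
    index = {var: i for i, var in enumerate(variables)}
    missing = [var for var in observed if var not in index]
    if missing:
        raise ValueError(f"Observed variables not in `variables`: {missing}")
    return [[index[obs] == col for col in range(len(variables))] for obs in observed]


# Hypothetical model: one latent variable `xi` with two observed indicators.
variables = ["xi", "x1", "x2"]
observed = ["x1", "x2"]
wedge_y = build_wedge_y(variables, observed)

# B[i][j] is True when variables[j] has a directed effect on variables[i]
# (assumed orientation, following eta = B eta + epsilon).
B = [
    [False, False, False],  # xi has no parents
    [True, False, False],   # xi -> x1
    [True, False, False],   # xi -> x2
]

# zeta marks the free error (co)variances; here only the diagonal (variances).
n = len(variables)
zeta = [[i == j for j in range(n)] for i in range(n)]
```

These objects line up with the documented signature `from_RAM(variables, B, zeta, observed=None, wedge_y=None, fixed_values=None)`. Whether nested lists are accepted in place of arrays is not confirmed here, so converting them to arrays first may be necessary.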
- classmethod from_graph(ebunch, latents=[], err_corr=[], err_var={})
Initializes a SEM instance using graphical structure.
- Parameters:
- ebunch: list/array-like
- List of edges in the form of tuples. Each tuple can take one of two shapes:
- (u, v): This would add an edge from u to v without setting any parameter
for the edge.
- (u, v, parameter): This would add an edge from u to v and set the edge’s
parameter to parameter.
- latents: list/array-like
List of nodes which are latent. All other variables are considered observed.
- err_corr: list/array-like
- List of tuples representing edges between error terms. It can be of the following forms:
- (u, v): Add correlation between error terms of u and v. Doesn’t set any variance or
covariance values.
- (u, v, covar): Adds correlation between the error terms of u and v and sets the
parameter to covar.
- err_var: dict
Dict of the form {var: variance}.
References
- [1] McDonald, J. A., & Clelland, D. A. (1984). Textile Workers and Union Sentiment.
Social Forces, 63(2), 502–521
- [2] https://en.wikipedia.org/wiki/Structural_equation_modeling#/
Examples
Defining a model (Union sentiment model [1]) without setting any parameters.
>>> from pgmpy.models import SEM
>>> sem = SEM.from_graph(
...     ebunch=[
...         ("deferenc", "unionsen"),
...         ("laboract", "unionsen"),
...         ("yrsmill", "unionsen"),
...         ("age", "deferenc"),
...         ("age", "laboract"),
...         ("deferenc", "laboract"),
...     ],
...     latents=[],
...     err_corr=[("yrsmill", "age")],
...     err_var={},
... )
Defining a model (Education [2]) with all the parameters set. To leave a parameter unset, np.nan can be passed explicitly.
>>> sem_edu = SEM.from_graph(
...     ebunch=[
...         ("intelligence", "academic", 0.8),
...         ("intelligence", "scale_1", 0.7),
...         ("intelligence", "scale_2", 0.64),
...         ("intelligence", "scale_3", 0.73),
...         ("intelligence", "scale_4", 0.82),
...         ("academic", "SAT_score", 0.98),
...         ("academic", "High_school_gpa", 0.75),
...         ("academic", "ACT_score", 0.87),
...     ],
...     latents=["intelligence", "academic"],
...     err_corr=[],
...     err_var={},
... )
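The two accepted edge shapes, `(u, v)` and `(u, v, parameter)`, can be reduced to one uniform form. The sketch below is a hypothetical helper, not pgmpy's implementation. It expands every edge to a 3-tuple and uses NaN as the "unset" marker, mirroring the note above that np.nan can be passed explicitly.

```python
import math


def normalize_ebunch(ebunch):
    """Expand every edge to (u, v, parameter), using NaN for unset parameters."""
    edges = []
    for edge in ebunch:
        if len(edge) == 2:
            u, v = edge
            edges.append((u, v, math.nan))
        elif len(edge) == 3:
            edges.append(tuple(edge))
        else:
            raise ValueError(f"Expected (u, v) or (u, v, parameter), got {edge!r}")
    return edges


# One edge without a parameter and one with a fixed parameter.
edges = normalize_ebunch([("age", "deferenc"), ("intelligence", "academic", 0.8)])
```

A uniform representation like this makes it easy to tell which edges still need to be estimated: those whose third element is NaN.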
- classmethod from_lavaan(string=None, filename=None)
Initializes a SEM instance using lavaan syntax.
- Parameters:
- string: str (default: None)
A lavaan-style multiline set of regression equations representing the model. Refer to http://lavaan.ugent.be/tutorial/syntax1.html for details.
- filename: str (default: None)
The filename of the file containing the model in lavaan syntax.
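For orientation, here is a sketch of what a lavaan-style model string can look like. It uses the three core lavaan operators: `=~` (latent variable measured by indicators), `~` (regression), and `~~` (variance or covariance). The model reuses variable names from the education example purely for illustration, and the covariance line is made up. The `classify_statements` helper is hypothetical and not part of pgmpy. Which lavaan features `from_lavaan` supports is not stated in this page.

```python
model_string = """
# measurement model
intelligence =~ scale_1 + scale_2 + scale_3 + scale_4
academic =~ SAT_score + High_school_gpa + ACT_score
# regression
academic ~ intelligence
# residual covariance (illustrative only)
scale_1 ~~ scale_2
"""

# Longer operators come first so that "~" does not shadow "=~" or "~~".
OPERATORS = (("=~", "latent definition"), ("~~", "covariance"), ("~", "regression"))


def classify_statements(text):
    """Split a lavaan-style string into (kind, left-hand side, right-hand terms)."""
    statements = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        for op, kind in OPERATORS:
            if op in line:
                lhs, rhs = line.split(op, 1)
                terms = [term.strip() for term in rhs.split("+")]
                statements.append((kind, lhs.strip(), terms))
                break
        else:
            raise ValueError(f"No lavaan operator found in: {line!r}")
    return statements


statements = classify_statements(model_string)
```

A string like `model_string` would be passed through the documented `string` argument, as in `SEM.from_lavaan(string=model_string)`.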
- classmethod from_lisrel(var_names, params, fixed_masks=None)
Initializes a SEM instance using LISREL notation. The LISREL notation is defined as:
\[\begin{aligned}
\mathbf{\eta} &= \mathbf{B \eta} + \mathbf{\Gamma \xi} + \mathbf{\zeta} \\
\mathbf{y} &= \mathbf{\wedge_y \eta} + \mathbf{\epsilon} \\
\mathbf{x} &= \mathbf{\wedge_x \xi} + \mathbf{\delta}
\end{aligned}\]
where \(\mathbf{\eta}\) is the set of endogenous variables, \(\mathbf{\xi}\) is the set of exogenous variables, \(\mathbf{y}\) and \(\mathbf{x}\) are the sets of measurement variables for \(\mathbf{\eta}\) and \(\mathbf{\xi}\) respectively. \(\mathbf{\zeta}\), \(\mathbf{\epsilon}\), and \(\mathbf{\delta}\) are the error terms for \(\mathbf{\eta}\), \(\mathbf{y}\), and \(\mathbf{x}\) respectively.
- Parameters:
- str_model: str (default: None)
A lavaan-style multiline set of regression equations representing the model. Refer to http://lavaan.ugent.be/tutorial/syntax1.html for details.
If None, var_names and params must be specified.
- var_names: dict (default: None)
A dict with the keys: eta, xi, y, and x. Each key should have a list of variable names as its value.
- params: dict (default: None)
A dict of the non-zero parameters in the LISREL representation. Must contain the following keys: B, gamma, wedge_y, wedge_x, phi, theta_e, theta_del, and psi.
If None, str_model must be specified.
- fixed_params: dict (default: None)
A dict of fixed values for parameters. Each entry should have the same shape as the corresponding entry in params.
If None, all the parameters are learnable.
- Returns:
- pgmpy.models.SEM instance: An instance of the object with initialized values.
Examples
>>> from pgmpy.models import SEMAlg # TODO: Finish this example
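Since the example above is unfinished, the following sketch shows how the `var_names` dict determines the shape each `params` entry would need. It does not call pgmpy. The shapes for `B`, `gamma`, `wedge_y`, and `wedge_x` follow from the equations above. The shapes for `phi`, `psi`, `theta_e`, and `theta_del` assume the usual LISREL convention that they are the covariance matrices of \(\mathbf{\xi}\), \(\mathbf{\zeta}\), \(\mathbf{\epsilon}\), and \(\mathbf{\delta}\). The helper `lisrel_param_shapes` and the variable assignment are hypothetical.

```python
def lisrel_param_shapes(var_names):
    """Return the (rows, columns) shape expected for each LISREL parameter matrix."""
    n_eta, n_xi = len(var_names["eta"]), len(var_names["xi"])
    n_y, n_x = len(var_names["y"]), len(var_names["x"])
    return {
        "B": (n_eta, n_eta),       # eta = B eta + ...
        "gamma": (n_eta, n_xi),    # ... + Gamma xi
        "wedge_y": (n_y, n_eta),   # y = wedge_y eta + epsilon
        "wedge_x": (n_x, n_xi),    # x = wedge_x xi + delta
        "phi": (n_xi, n_xi),       # assumed: covariance of xi
        "psi": (n_eta, n_eta),     # assumed: covariance of zeta
        "theta_e": (n_y, n_y),     # assumed: covariance of epsilon
        "theta_del": (n_x, n_x),   # assumed: covariance of delta
    }


# Illustrative assignment of the education variables to the four LISREL roles.
var_names = {
    "eta": ["academic"],
    "xi": ["intelligence"],
    "y": ["SAT_score", "High_school_gpa", "ACT_score"],
    "x": ["scale_1", "scale_2", "scale_3", "scale_4"],
}
shapes = lisrel_param_shapes(var_names)

# All-zero placeholders with the right shapes; real masks would mark free parameters.
params = {name: [[0] * cols for _ in range(rows)] for name, (rows, cols) in shapes.items()}
```

Checking shapes this way before building a model catches a common mistake: transposing `gamma` or the `wedge` matrices.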