Expectation Maximization (EM)
- class pgmpy.estimators.ExpectationMaximization(model, data, **kwargs)
- get_parameters(latent_card=None, max_iter=100, atol=1e-08, n_jobs=-1, seed=None, show_progress=True)
Method to estimate all model parameters (CPDs) using Expectation Maximization.
- Parameters:
latent_card (dict (default: None)) – A dictionary of the form {latent_var: cardinality} specifying the cardinality (number of states) of each latent variable. If None, assumes 2 states for each latent variable.
max_iter (int (default: 100)) – The maximum number of iterations the algorithm is allowed to run. If max_iter is reached, the most recent parameter estimates are returned.
atol (float (default: 1e-08)) – The absolute tolerance used to check convergence. If the parameters change by less than atol in an iteration, the algorithm exits.
n_jobs (int (default: -1)) – Number of jobs to run in parallel. The default, -1, uses all available processors.
seed (int) – The random seed to use for generating the initial values.
show_progress (boolean (default: True)) – Whether to show a progress bar for iterations.
- Returns:
Estimated parameters (CPDs) – A list of estimated CPDs for the model.
- Return type:
list
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.estimators import ExpectationMaximization as EM
>>> data = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 3)),
...                     columns=['A', 'C', 'D'])
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D')], latents={'B'})
>>> estimator = EM(model, data)
>>> estimator.get_parameters(latent_card={'B': 3})
[<TabularCPD representing P(C:2) at 0x7f7b534251d0>,
 <TabularCPD representing P(B:3 | C:2, A:2) at 0x7f7b4dfd4da0>,
 <TabularCPD representing P(A:2) at 0x7f7b4dfd4fd0>,
 <TabularCPD representing P(D:2 | C:2) at 0x7f7b4df822b0>]
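To make the roles of max_iter, atol, and seed more concrete, the following is a minimal, self-contained sketch of an EM loop on a toy problem: a mixture of two biased coins, where the coin used in each trial is the latent variable. It is an illustration only and is not pgmpy's implementation. The function name em_two_coins, the fixed 0.5 prior over the coins, and the sample data are all assumptions made for this example; only the stopping rule (stop at max_iter, or when no parameter changes by atol or more) mirrors the parameter descriptions above.

```python
import random


def em_two_coins(trials, max_iter=100, atol=1e-8, seed=None):
    """Toy EM for a mixture of two biased coins (hypothetical helper).

    trials: list of (heads, tosses) pairs. Which coin produced each trial
    is the latent variable (two states, fixed prior of 0.5 each).
    Returns (theta, iterations_used).
    """
    rng = random.Random(seed)
    # Random initial values, reproducible when a seed is given.
    theta = [rng.uniform(0.2, 0.8), rng.uniform(0.2, 0.8)]
    iteration = 0

    for iteration in range(1, max_iter + 1):
        # E-step: expected heads/tails attributed to each coin.
        heads = [0.0, 0.0]
        tails = [0.0, 0.0]
        for h, n in trials:
            # Likelihood of the trial under each coin; the binomial
            # coefficient is the same for both coins and cancels out.
            like = [t ** h * (1 - t) ** (n - h) for t in theta]
            total = like[0] + like[1]
            for k in range(2):
                resp = like[k] / total  # responsibility of coin k
                heads[k] += resp * h
                tails[k] += resp * (n - h)

        # M-step: re-estimate each coin's bias from the expected counts.
        new_theta = [heads[k] / (heads[k] + tails[k]) for k in range(2)]

        # Convergence check: largest absolute parameter change below atol.
        converged = max(abs(a - b) for a, b in zip(new_theta, theta)) < atol
        theta = new_theta
        if converged:
            break

    # If max_iter was reached, the last estimates are returned as-is.
    return theta, iteration


# Made-up sample data: (heads, tosses) for five trials of ten tosses each.
trials = [(5, 10), (9, 10), (8, 10), (4, 10), (7, 10)]
theta, n_iter = em_two_coins(trials, max_iter=200, seed=0)
print(sorted(theta), n_iter)
```

Because the initial values are random, the two estimated biases may come back in either order, which is why the usage line sorts them. Passing the same seed reproduces the same run, and lowering max_iter simply returns whatever estimates the loop had reached at that point.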