GES
- class pgmpy.causal_discovery.GES(scoring_method: str | BaseStructureScore | None = None, return_type: str = 'pdag', min_improvement: float = 1e-06)
Bases: _ScoreMixin, _BaseCausalDiscovery
Score-based causal discovery using Greedy Equivalence Search (GES).
This class implements the GES algorithm [1] for causal discovery. Given a tabular dataset, the algorithm estimates the causal structure among the variables in the data as a Directed Acyclic Graph (DAG) or Partially Directed Acyclic Graph (PDAG).
GES works in three phases:
1. Forward phase: Edges are added to improve the model score.
2. Backward phase: Edges are removed to improve the model score.
3. Edge turning phase: Edge orientations are flipped to improve the score.
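The three phases can be illustrated with a deliberately simplified sketch. It hill-climbs over plain DAGs with a toy score, whereas GES as described in [1] searches over equivalence classes, so it shows only the phase structure and is not pgmpy's implementation. Every name here (`TRUE_PARENTS`, `local_score`, `run_phase`, `toy_three_phase_search`) is hypothetical.

```python
from itertools import permutations

# Hypothetical ground truth, used only to build a toy score: A -> B -> C.
TRUE_PARENTS = {"A": set(), "B": {"A"}, "C": {"B"}}


def local_score(node, parents):
    """Toy decomposable score: 0 when the parent set is exactly right."""
    return -len(set(parents) ^ TRUE_PARENTS[node])


def total_score(edges, nodes):
    """Sum of local scores, one per node, given its current parents."""
    return sum(local_score(n, {u for u, v in edges if v == n}) for n in nodes)


def is_acyclic(edges, nodes):
    """Repeatedly strip nodes with no incoming edge; fail if none exist."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for _, v in edges)}
        if not roots:
            return False
        remaining -= roots
        edges = {(u, v) for u, v in edges if u not in roots}
    return True


def candidates(edges, nodes, phase):
    """Yield every edge set reachable by one operation of the given phase."""
    if phase == "forward":
        for u, v in permutations(nodes, 2):
            if (u, v) not in edges and (v, u) not in edges:
                yield edges | {(u, v)}
    elif phase == "backward":
        for e in edges:
            yield edges - {e}
    else:  # "turning"
        for u, v in edges:
            yield (edges - {(u, v)}) | {(v, u)}


def run_phase(edges, nodes, phase, min_improvement):
    """Apply the best-scoring operation until none improves enough."""
    while True:
        current = total_score(edges, nodes)
        best, best_gain = None, float("-inf")
        for cand in candidates(edges, nodes, phase):
            if not is_acyclic(cand, nodes):
                continue
            gain = total_score(cand, nodes) - current
            if gain >= min_improvement and gain > best_gain:
                best, best_gain = cand, gain
        if best is None:
            return edges
        edges = best


def toy_three_phase_search(nodes, min_improvement=1e-6):
    """Run the forward, backward, and turning phases once each, in order."""
    edges = frozenset()
    for phase in ("forward", "backward", "turning"):
        edges = run_phase(edges, nodes, phase, min_improvement)
    return edges


result = toy_three_phase_search(["A", "B", "C"])
print(sorted(result))
```

With this toy score the forward phase alone recovers both edges, and the backward and turning phases find nothing left to improve.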
- Parameters:
- scoring_method : str or BaseStructureScore instance, default=None
The score to be optimized during structure estimation. Supported structure scores:
Discrete data: 'k2', 'bdeu', 'bds', 'bic-d', 'aic-d'
Continuous data: 'll-g', 'aic-g', 'bic-g'
Mixed data: 'll-cg', 'aic-cg', 'bic-cg'
If None, the appropriate scoring method is automatically selected based on the data type. Also accepts a custom score instance that inherits from BaseStructureScore.
- return_type : str, default='pdag'
The type of graph to return. Options are:
'dag': Returns a directed acyclic graph (DAG).
'pdag': Returns a partially directed acyclic graph (PDAG).
- min_improvement : float, default=1e-6
The minimum score improvement required to perform an operation (edge addition, removal, or flipping). Operations with smaller improvements are not performed.
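As a minimal sketch of how such a threshold might gate a greedy step: the helper below picks the highest-gain operation and applies it only if its gain reaches `min_improvement`. The function name, the operation labels, and the treatment of a gain exactly equal to the threshold are assumptions made for illustration, not taken from pgmpy.

```python
def select_operation(gains, min_improvement=1e-6):
    """Return the label of the best operation, or None if none qualifies.

    `gains` maps a hypothetical operation label to its score improvement.
    Treating a gain equal to the threshold as sufficient is an assumption.
    """
    if not gains:
        return None
    best = max(gains, key=gains.get)
    return best if gains[best] >= min_improvement else None


# A clear improvement is applied.
print(select_operation({"insert A->B": 2.5, "insert B->C": 0.3}))
# A negligible improvement is rejected at the default threshold...
print(select_operation({"insert A->C": 1e-9}))
# ...but accepted once the threshold is lowered.
print(select_operation({"insert A->C": 1e-9}, min_improvement=1e-12))
```

A larger threshold stops the search earlier and tends to give sparser graphs. A very small one lets the search continue on gains that may be numerical noise.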
- Attributes:
- causal_graph_ : DAG or PDAG
The learned causal graph at a (local) score maximum.
- adjacency_matrix_ : pd.DataFrame
Adjacency matrix representation of the learned causal graph.
- n_features_in_ : int
The number of features in the data used to learn the causal graph.
- feature_names_in_ : np.ndarray
The feature names in the data used to learn the causal graph.
References
[1] Chickering, David Maxwell. “Optimal structure identification with greedy search.” Journal of Machine Learning Research 3.Nov (2002): 507-554.
Examples
Simulate some data to use for causal discovery:
>>> import numpy as np
>>> from pgmpy.example_models import load_model
>>> np.random.seed(42)
>>> model = load_model("bnlearn/alarm")
>>> df = model.simulate(n_samples=1000, seed=42)
Use the GES algorithm to learn the causal structure from data:
>>> from pgmpy.causal_discovery import GES
>>> ges = GES(scoring_method="bic-d")
>>> ges.fit(df)
GES(scoring_method='bic-d')
>>> ges.causal_graph_
<pgmpy.base.PDAG.PDAG object at 0x...>
>>> ges.n_features_in_
37
- delete(u: Any, v: Any, H: set[Any], current_model: PDAG) -> PDAG
Perform delete(u - v) or delete(u -> v) with conditioning set H.
- insert(u: Any, v: Any, T: Iterable[Any], current_model: PDAG) -> PDAG
Perform insert(u -> v) with conditioning set T.
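For orientation, the sketch below follows the Insert and Delete operators as defined in [1], on a toy PDAG stored as two plain sets. In that paper, T is a subset of v's neighbors that are not adjacent to u, and H is a subset of v's neighbors that are adjacent to u. The function signatures and the set-based representation are assumptions for illustration and differ from the pgmpy methods above, which take and return a PDAG. The validity checks from [1] and the conversion back to a completed PDAG are omitted.

```python
def insert(u, v, T, directed, undirected):
    """Insert(u, v, T): add u -> v and orient each undirected t - v as t -> v."""
    directed, undirected = set(directed), set(undirected)
    directed.add((u, v))
    for t in T:
        if frozenset((t, v)) in undirected:
            undirected.discard(frozenset((t, v)))
            directed.add((t, v))
    return directed, undirected


def delete(u, v, H, directed, undirected):
    """Delete(u, v, H): remove the u - v or u -> v edge, then for each h in H
    orient undirected v - h as v -> h and undirected u - h as u -> h."""
    directed = set(directed) - {(u, v)}
    undirected = set(undirected) - {frozenset((u, v))}
    for h in H:
        for tail in (v, u):
            if frozenset((tail, h)) in undirected:
                undirected.discard(frozenset((tail, h)))
                directed.add((tail, h))
    return directed, undirected


# Insert(A, C, {B}) on the PDAG  B - C  yields the v-structure A -> C <- B.
d1, u1 = insert("A", "C", {"B"}, set(), {frozenset(("B", "C"))})
print(sorted(d1), u1)

# Delete(A, B, {C}) on the undirected triangle A - B - C - A removes A - B
# and orients the remaining edges into C.
triangle = {frozenset(p) for p in (("A", "B"), ("A", "C"), ("B", "C"))}
d2, u2 = delete("A", "B", {"C"}, set(), triangle)
print(sorted(d2), u2)
```

Both functions return new sets rather than mutating their arguments, so a search loop can score a candidate operation without committing to it.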
- set_score_request(*, metric: bool | None | str = '$UNCHANGED$', true_graph: bool | None | str = '$UNCHANGED$') -> GES
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- metric : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the metric parameter in score.
- true_graph : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the true_graph parameter in score.
- Returns:
- self : object
The updated object.